Prateek Yadav

MoE, Model Merging, Continual Learning, Instruction Tuning, Parameter Efficiency.

praty@cs.unc.edu

Hey! I am a PhD student at the MURGe-Lab at the University of North Carolina - Chapel Hill, where I am advised by Prof. Mohit Bansal. I also frequently collaborate with Colin Raffel. My research goal is to make deep learning models learn continually and adapt efficiently to multiple domains. I am interested in efficient methods that lead to generalization, and in exploiting sparsity, memory, and mixture-of-experts models for continual learning. I am also interested in instruction tuning, as it makes model adaptation easier.

Previously, I have worked on a diverse set of topics: 1) interpretability, 2) compositional reasoning in NLP, 3) deep learning methods for graph- and hypergraph-structured data and their applications to NLP, 4) estimating and controlling for uncertainty in the representations learned by these methods, and 5) Bayesian modeling of temporal data.

Over the past few years, I have been fortunate to work with Arindam Mitra, Subhabrata Mukherjee, Ahmed H. Awadallah, and Guoqing Zheng at Microsoft Research Redmond; with Ming Tan, Qing Sun, and Xiaopeng Li at Amazon AWS AI Labs; with Prof. Partha Talukdar at the MALL-Lab at the Indian Institute of Science (IISc) Bangalore; with Dr. Prateek Jain at Microsoft Research India; and with Prof. Arun Rajkumar at the Indian Institute of Technology, Madras. I also worked for a year with some amazing people at LinkedIn AI Bangalore. Before all this, I completed my undergraduate degree in pure mathematics in 2018 at IISc Bangalore, where I was supervised by Prof. Partha Talukdar.