Prateek Yadav

Modular, Efficient, and Adaptive LLMs, MoE, Model Merging, RLHF, Continual Learning.

praty@cs.unc.edu

I am currently on the job market and actively interviewing. If you are interested in hiring someone who works on topics like reasoning/alignment, efficiency, MoE/modular models, synthetic data, test-time compute, or any other phase of pre-training/post-training, please reach out via email!

Hey! I am a PhD student in the MURGe-Lab at the University of North Carolina, Chapel Hill, where I am advised by Prof. Mohit Bansal. I also work with Colin Raffel. My research goal is to build deep learning models that can efficiently and continually adapt to new domains. Toward this goal, I work on topics like modular, efficient, and adaptive LLMs, mixture-of-experts models, model merging, RLHF, and continual learning.

Since May 2024, I have been a student researcher at Google DeepMind, where I work with Tsendsuren, Jonathan, Tu Vu, and Alexandra. I will continue working part-time at DeepMind until April 2025.

Previously, I worked at Microsoft Research Redmond with Subhabrata Mukherjee and Ahmed H. Awadallah, and at Amazon AWS AI Labs with Qing Sun. Before my PhD, I worked at Microsoft Research India with Dr. Prateek Jain, and spent a year working full-time with some amazing people at LinkedIn AI Bangalore. Before all this, I completed my undergraduate degree in pure mathematics at IISc Bangalore in 2018, where I was supervised by Prof. Partha Talukdar.