Publications

2024

  1. Preprint
    Glider: Global and Local Instruction-Driven Expert Router
    Yadav, Prateek*, Li, Pingzhi*, Yoon, Jaehong, Peng, Jie, Sung, Yi-Lin, Bansal, Mohit, and Chen, Tianlong
    2024
  2. Preprint
    What Matters for Model Merging at Scale?
    Yadav, Prateek, Vu, Tu, Lai, Jonathan, Chronopoulou, Alexandra, Faruqui, Manaal, Bansal, Mohit, and Munkhdalai, Tsendsuren
    2024
  3. Preprint
    A Survey on Model MoErging: Recycling and Routing Among Specialized Experts for Collaborative Learning
    Yadav, Prateek, Raffel, Colin, Muqeeth, Mohammed, Caccia, Lucas, Liu, Haokun, Chen, Tianlong, Bansal, Mohit, Choshen, Leshem, and Sordoni, Alessandro
    2024
  4. Preprint
    BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
    Zhuo, Terry Yue, Vu, Minh Chien, Chim, Jenny, Hu, Han, Yu, Wenhao, Widyasari, Ratnadira, Yusuf, Imam Nur Bani, Zhan, Haolan, He, Junda, Paul, Indraneil, Brunner, Simon, Gong, Chen, Hoang, Thong, Zebaze, Armel Randy, Hong, Xiaoheng, Li, Wen-Ding, Kaddour, Jean, Xu, Ming, Zhang, Zhihan, Yadav, Prateek, Jain, Naman, Gu, Alex, Cheng, Zhoujun, Liu, Jiawei, Liu, Qian, Wang, Zijian, Lo, David, Hui, Binyuan, Muennighoff, Niklas, Fried, Daniel, Du, Xiaoning, de Vries, Harm, and von Werra, Leandro
    2024
  5. Preprint
    Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order
    Nakamura, Taishi, Mishra, Mayank, Tedeschi, Simone, Chai, Yekun, Stillerman, Jason T, Friedrich, Felix, Yadav, Prateek, Laud, Tanmay, Chien, Vu Minh, Zhuo, Terry Yue, Misra, Diganta, Bogin, Ben, Vu, Xuan-Son, Karpinska, Marzena, Dantuluri, Arnav Varma, Kusa, Wojciech, and Tommaso,
    2024
  6. Preprint
    ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization
    Yadav, Prateek, Choshen, Leshem, Raffel, Colin, and Bansal, Mohit
    2024
  7. ICLR [Spotlight]
    Merge, Then Compress: Demystify Efficient SMoE with Hints from Its Routing Policy
    Li, Pingzhi, Zhang, Zhenyu, Yadav, Prateek, Sung, Yi-Lin, Cheng, Yu, Bansal, Mohit, and Chen, Tianlong
    In International Conference on Learning Representations 2024
  8. ICLR
    D2 Pruning: Message Passing for Balancing Diversity & Difficulty in Data Pruning
    Maharana, Adyasha, Yadav, Prateek, and Bansal, Mohit
    In International Conference on Learning Representations 2024

2023

  1. NeurIPS’23
    Resolving Interference When Merging Models
    Yadav, Prateek, Tam, Derek, Choshen, Leshem, Raffel, Colin, and Bansal, Mohit
    In Neural Information Processing Systems 2023
  2. NeurIPS’23
    Self-Chained Image-Language Model for Video Localization and Question Answering
    Yu, Shoubin, Cho, Jaemin, Yadav, Prateek, and Bansal, Mohit
    In Neural Information Processing Systems 2023
  3. ACL’23
    Exploring Continual Learning for Code Generation Models
    Yadav, Prateek, Sun, Qing, Ding, Hantian, Li, Xiaopeng, Zhang, Dejiao, Tan, Ming, Ma, Xiaofei, Bhatia, Parminder, Nallapati, Ramesh, Ramanathan, Murali Krishna, Bansal, Mohit, and Xiang, Bing
    In Association for Computational Linguistics 2023
  4. ACL’23
    Exclusive Supermask Subnetwork Training for Continual Learning
    Yadav, Prateek, and Bansal, Mohit
    In Findings of the Association for Computational Linguistics 2023

2022

  1. ACL’22
    Explanation Graph Generation via Pre-trained Language Models: An Empirical Study with Contrastive Learning
    Saha, Swarnadeep, Yadav, Prateek, and Bansal, Mohit
    In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022
  2. Preprint
    Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions
    Yadav, Prateek, Hase, Peter, and Bansal, Mohit
    2022

2021

  1. EMNLP’21 [ORAL]
    ExplaGraphs: An Explanation Graph Generation Task for Structured Commonsense Reasoning
    Saha, Swarnadeep, Yadav, Prateek, Bauer, Lisa, and Bansal, Mohit
    In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2021
  2. NAACL’21 [ORAL]
    multiPRover: Generating Multiple Proofs for Improved Interpretability in Rule Reasoning
    Saha, Swarnadeep, Yadav, Prateek, and Bansal, Mohit
    In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies 2021
  3. Preprint
    Discrete Time Latent Hawkes Processes for Modeling Multidimensional Temporal Event Streams
    Yadav, Prateek, Sankaran, Raman, Dutta, Partha, and Bhatt, Rushi
    2021
  4. GCLR AAAI
    Rank Refinement: An Algorithmic framework with applications to diversity aware influence maximization
    Yadav, Prateek, and Rajkumar, Arun
    GCLR AAAI 2021

2020

  1. CIKM
    NHP: Neural Hypergraph Link Prediction
    Yadati, Naganand, Nitin, Vikram, Nimishakavi, Madhav, Yadav, Prateek, Louis, Anand, and Talukdar, Partha
    In Proceedings of the 29th ACM International Conference on Information & Knowledge Management 2020

2019

  1. NeurIPS
    HyperGCN: A New Method For Training Graph Convolutional Networks on Hypergraphs
    Yadati, Naganand, Nimishakavi, Madhav, Yadav, Prateek, Nitin, Vikram, Louis, Anand, and Talukdar, Partha
    In Advances in Neural Information Processing Systems 2019
  2. AISTATS
    Lovasz Convolutional Networks
    Yadav, Prateek, Nimishakavi, Madhav, Yadati, Naganand, Vashishth, Shikhar, Rajkumar, Arun, and Talukdar, Partha
    In Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics 2019
  3. ACL
    Incorporating Syntactic and Semantic Information in Word Embeddings using Graph Convolutional Networks
    Vashishth, Shikhar, Yadav, Prateek*, Bhandari, Manik*, Rai, Piyush, Bhattacharyya, Chiranjib, and Talukdar, Partha
    In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
  4. AISTATS
    Confidence-based Graph Convolutional Networks for Semi-Supervised Learning
    Vashishth, Shikhar*, Yadav, Prateek*, Bhandari, Manik, and Talukdar, Partha
    In Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics 2019