news

Nov 07, 2025 📢 Announcement: I’ll be attending NeurIPS 2025 in San Diego from December 2–8. Let’s connect!
Nov 07, 2025 Check out our new preprint, TwIST: Rigging the Lottery in Transformers with Independent Subnetwork Training! We propose a novel distributed training algorithm that trains subnetworks in parallel to uncover high-performing sparse models that need no fine-tuning.
Sep 19, 2025 Excited to announce that our paper Learning to Specialize: Joint Gating-Expert Training for Adaptive MoEs in Decentralized Settings has been accepted at NeurIPS 2025!