Check out our new preprint, "TwIST: Rigging the Lottery in Transformers with Independent Subnetwork Training"! We propose a novel distributed training algorithm that trains subnetworks in parallel to uncover high-performing sparse models that need no fine-tuning.