Below is the list of accepted papers, organized into awards, orals, and posters:
Awards
- Best paper award: Kinetics: Rethinking Test-Time Scaling Laws (Ranajoy Sadhukhan, Zhuoming Chen, Haizhong Zheng, et al.)
- Best poster award: Gatekeeper: Improving Model Cascades Through Confidence Tuning (Stephan Rabanser, Nathalie Rauschmayr, Achin Kulshrestha, Petra Poklukar, Wittawat Jitkrittum, Sean Augenstein, Congchao Wang, Federico Tombari)
- Honorable mention: LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning (Nurbek Tastan, Stefanos Laskaridis, Martin Takac, Karthik Nandakumar, Samuel Horváth)
 
Orals
- Too Big to Think: Capacity, Memorization, and Generalization in Pre-Trained Transformers (Joshua Barron, Devin White)
- Kinetics: Rethinking Test-Time Scaling Laws (Ranajoy Sadhukhan, Zhuoming Chen, Haizhong Zheng, et al.)
- Preserve then Quantize: Dominant-Subspace Guided Low-Rank Reconstruction (Yoonjun Cho, Dongjae Jeon, Soeun Kim, Albert No)
- WhisperKit: On-device Real-time ASR with Billion-Scale Transformers (Berkin Durmus, Arda Okan, Eduardo Pacheco, et al.)
- Offloaded Reasoning: Efficient Inference for Large Language Models via Modular Reasoning and Refinement (Ishan Jindal, Jayant Taneja, Badrinath Chandana, et al.)
- Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search (Dongge Han, Menglin Xia, Daniel Madrigal, et al.)
 
Posters
- Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation for Federated Learning (Grigory Malinovsky, Umberto Michieli, Hasan Hammoud, Taha Ceritli, Hayder Elesedy, Mete Ozay, Peter Richtarik)
- Unlocking the Potential of Extremely Low-Bit Sparse Transformers through Adaptive Multi-bit Supermasks and Random Weights (Yasuyuki Okoshi, Hikari Otsuka, Junnosuke Suzuki, Daichi Fujiki, Masato Motomura)
- Addition is almost all you need: Compressing neural networks with double binary factorization (Vladimír Boža, Vladimír Macko)
- Capability Transfer from Large to Small Models with Synthetically-Generated Data (Lillian Sun, Emma Yang, Arif Dayi)
- TensorSLM: Energy-efficient Embedding Compression of Sub-billion Parameter Language Models on Low-end Devices (Mingxue Xu, Yao Lei Xu, Danilo Mandic)
- Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order LLM Fine-Tuning (Egor Petrov, Evseev Grigoriy, Aleksey Antonov, Andrey Veprikov, Pavel Plyusnin, Nikolay Bushkov, Stanislav Moiseev, Aleksandr Beznosikov)
- Higher Acceptance Rates for Speculative Decoding with Randomised Drafting (William Toner, Martin Asenov, Rajkarn Singh, Artjom Joosen)
- Zoop it! Efficient Zero-Order Optimization with Output Perturbation (Xixi Hu, Bo Liu, Qiang Liu, Xiaocong Du, Bhargav Bhushanam, Louis Feng, Chengyue Gong, Kaizhao Liang)
- MatMuls are Enough for Efficient and Performant Linear-Time Attention (Andrew Argatkiny, Ilya Makarov)
- Zeroth-Order Optimization is Secretly Single-Step Policy Optimization (Junbin Qiu, Zhengpeng Xie, Xiangda Yan, Yongjie Yang, Yao Shu)
- Spec-LLaVA: Accelerating Vision-Language Models with Dynamic Tree-Based Speculative Decoding (Mingxiao Huo, Jiayi Zhang, Hewei Wang, Jinfeng Xu, Zheyu Chen, Huilin Tai, Ian Chen)
- LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning (Nurbek Tastan, Stefanos Laskaridis, Martin Takac, Karthik Nandakumar, Samuel Horváth)
- FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training (Filipp Zmushko, Aleksandr Beznosikov, Martin Takac, Samuel Horváth)
- Towards understanding of orthogonalization in Muon (Valentyn Boreiko, Zhiqi Bu, Sheng Zha)
- FGFP: A Fractional Gaussian Filter and Pruning for Deep Neural Networks Compression (Kuan-Ting Tu, Po-Hsien Yu, Yu-Syuan Tseng, Shao-Yi Chien)
- Gatekeeper: Improving Model Cascades Through Confidence Tuning (Stephan Rabanser, Nathalie Rauschmayr, Achin Kulshrestha, Petra Poklukar, Wittawat Jitkrittum, Sean Augenstein, Congchao Wang, Federico Tombari)
- DiffusionBlocks: Continuous-Time Blockwise Training Through Score-Based Diffusion Models (Makoto Shing, Takuya Akiba)
- Compression of Large Language Models by Neuron Summary (Yancheng Wang, Dongfang Sun, Yingzhen Yang)
- Predictive Scheduling for Efficient Inference-Time Reasoning in Large Language Models (Katrina Brown, Aneesh Muppidi, Rana Shahout)
- FAST: Federated Active Learning with Foundation Models for Communication-efficient Sampling and Training (Haoyuan Li, Mathias Funk, Jindong Wang, Aaqib Saeed)
- Overcoming label shift in targeted federated learning (Adam Breitholtz, Edvin Listo Zec, Fredrik Johansson)
- Token-Efficient RL for LLM Reasoning (Alan Lee, Harry Tong)
- Dynamic Guardian Models: Realtime Content Moderation With User-Defined Policies (Monte Hoover, Vatsal Baherwani, Neel Jain, Khalid Saifullah, Joseph Vincent, Chirag Jain, Melissa Rad, C. Bayan Bruss, Ashwinee Panda, Tom Goldstein)
- First Provable Guarantees for Practical Private FL: Beyond Restrictive Assumptions (Egor Shulgin, Grigory Malinovsky, Sarit Khirirat, Peter Richtarik)
- Lion Cub: Minimizing Communication Overhead in Distributed Lion (Satoki Ishikawa, Tal Ben-Nun, Brian Van Essen, Rio Yokota, Nikoli Dryden)
- SPAM: Stochastic Proximal Point Method with Momentum Variance Reduction for Non-convex Cross-Device Federated Learning (Avetik Karagulyan, Egor Shulgin, Abdurakhmon Sadiev, Peter Richtarik)