TTODLer-FM

Program

The workshop will take place on the 18th of July 2025. Below is the schedule for the day:

Local Time | Session | Description
08:30 - 08:45 | Introduction from Organizers | Introductory remarks from the organizers on the topic, submissions, and scope of the venue.
08:45 - 09:30 | Invited Keynote #1 | Scaling Down: Optimizing Foundation Models for Edge Deployment
Zechun Liu (Meta)
09:30 - 10:00 | Contributed Talks - Session 1
- Too Big to Think: Capacity, Memorization, and Generalization in Pre-Trained Transformers (Joshua Barron, Devin White)
- Kinetics: Rethinking Test-Time Scaling Laws (Ranajoy Sadhukhan, Zhuoming Chen, Haizhong Zheng, et al.)
10:00 - 10:15 | Coffee Break | Networking opportunities.
10:15 - 11:00 | Invited Keynote #2 | Training Models in Low Precision: the Promise, the Limitations, and the Scaling Laws
Dan Alistarh (ISTA)
11:00 - 11:30 | Contributed Talks - Session 2
- Preserve then Quantize: Dominant-Subspace Guided Low-Rank Reconstruction (Yoonjun Cho, Dongjae Jeon, Soeun Kim, Albert No)
- WhisperKit: On-device Real-time ASR with Billion-Scale Transformers (Berkin Durmus, Arda Okan, Eduardo Pacheco, et al.)
11:30 - 12:15 | Invited Keynote #3
Song Han (MIT)
12:15 - 13:00 | Lunch Break | Networking opportunities.
13:00 - 13:30 | Poster Session #1
- Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation for Federated Learning (Grigory Malinovsky, Umberto Michieli, Hasan Hammoud, Taha Ceritli, Hayder Elesedy, Mete Ozay, Peter Richtarik)
- Unlocking the Potential of Extremely Low-Bit Sparse Transformers through Adaptive Multi-bit Supermasks and Random Weights (Yasuyuki Okoshi, Hikari Otsuka, Junnosuke Suzuki, Daichi Fujiki, Masato Motomura)
- Addition is almost all you need: Compressing neural networks with double binary factorization (Vladimír Boža, Vladimír Macko)
- Capability Transfer from Large to Small Models with Synthetically-Generated Data (Lillian Sun, Emma Yang, Arif Dayi)
- TensorSLM: Energy-efficient Embedding Compression of Sub-billion Parameter Language Models on Low-end Devices (Mingxue Xu, Yao Lei Xu, Danilo Mandic)
- Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order LLM Fine-Tuning (Egor Petrov, Evseev Grigoriy, Aleksey Antonov, Andrey Veprikov, Pavel Plyusnin, Nikolay Bushkov, Stanislav Moiseev, Aleksandr Beznosikov)
- Higher Acceptance Rates for Speculative Decoding with Randomised Drafting (William Toner, Martin Asenov, Rajkarn Singh, Artjom Joosen)
- Zoop it! Efficient Zero-Order Optimization with Output Perturbation (Xixi Hu, Bo Liu, Qiang Liu, Xiaocong Du, Bhargav Bhushanam, Louis Feng, Chengyue Gong, Kaizhao Liang)
- MatMuls are Enough for Efficient and Performant Linear-Time Attention (Andrew Argatkiny, Ilya Makarov)
- Zeroth-Order Optimization is Secretly Single-Step Policy Optimization (Junbin Qiu, Zhengpeng Xie, Xiangda Yan, Yongjie Yang, Yao Shu)
- Spec-LLaVA: Accelerating Vision-Language Models with Dynamic Tree-Based Speculative Decoding (Mingxiao Huo, Jiayi Zhang, Hewei Wang, Jinfeng Xu, Zheyu Chen, Huilin Tai, Ian Chen)
- LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning (Nurbek Tastan, Stefanos Laskaridis, Martin Takac, Karthik Nandakumar, Samuel Horváth)
- FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training (Filipp Zmushko, Aleksandr Beznosikov, Martin Takac, Samuel Horváth)
- Towards understanding of orthogonalization in Muon (Valentyn Boreiko, Zhiqi Bu, Sheng Zha)
13:30 - 14:15 | Invited Keynote #4 | Enabling Frontier AI Experiences on the Edge
Fartash Faghri (Apple)
14:15 - 14:45 | Contributed Talks - Session 3
- Offloaded Reasoning: Efficient Inference for Large Language Models via Modular Reasoning and Refinement (Ishan Jindal, Jayant Taneja, Badrinath Chandana, et al.)
- Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search (Dongge Han, Menglin Xia, Daniel Madrigal, et al.)
14:45 - 15:00 | Coffee Break | Networking opportunities.
15:00 - 15:30 | Poster Session #2
- FGFP: A Fractional Gaussian Filter and Pruning for Deep Neural Networks Compression (Kuan-Ting Tu, Po-Hsien Yu, Yu-Syuan Tseng, Shao-Yi Chien)
- Gatekeeper: Improving Model Cascades Through Confidence Tuning (Stephan Rabanser, Nathalie Rauschmayr, Achin Kulshrestha, Petra Poklukar, Wittawat Jitkrittum, Sean Augenstein, Congchao Wang, Federico Tombari)
- DiffusionBlocks: Continuous-Time Blockwise Training Through Score-Based Diffusion Models (Makoto Shing, Takuya Akiba)
- Compression of Large Language Models by Neuron Summary (Yancheng Wang, Dongfang Sun, Yingzhen Yang)
- Predictive Scheduling for Efficient Inference-Time Reasoning in Large Language Models (Katrina Brown, Aneesh Muppidi, Rana Shahout)
- FAST: Federated Active Learning with Foundation Models for Communication-efficient Sampling and Training (Haoyuan Li, Mathias Funk, Jindong Wang, Aaqib Saeed)
- Overcoming label shift in targeted federated learning (Adam Breitholtz, Edvin Listo Zec, Fredrik Johansson)
- Token-Efficient RL for LLM Reasoning (Alan Lee, Harry Tong)
- Dynamic Guardian Models: Realtime Content Moderation With User-Defined Policies (Monte Hoover, Vatsal Baherwani, Neel Jain, Khalid Saifullah, Joseph Vincent, Chirag Jain, Melissa Rad, C. Bayan Bruss, Ashwinee Panda, Tom Goldstein)
- First Provable Guarantees for Practical Private FL: Beyond Restrictive Assumptions (Egor Shulgin, Grigory Malinovsky, Sarit Khirirat, Peter Richtarik)
- Lion Cub: Minimizing Communication Overhead in Distributed Lion (Satoki Ishikawa, Tal Ben-Nun, Brian Van Essen, Rio Yokota, Nikoli Dryden)
- SPAM: Stochastic Proximal Point Method with Momentum Variance Reduction for Non-convex Cross-Device Federated Learning (Avetik Karagulyan, Egor Shulgin, Abdurakhmon Sadiev, Peter Richtarik)
15:30 - 16:15 | Invited Keynote #5 | Towards Principled Design of SLM Agents for Edge Devices
Kangwook Lee (University of Wisconsin-Madison & KRAFTON)
16:15 - 16:30 | Best Paper/Poster Awards | Announcement of awards.
16:30 - 17:15 | Panel Session | Panel discussion with the invited keynote speakers, followed by Q&A from the audience.
- Kangwook Lee (University of Wisconsin-Madison & KRAFTON)
- Zechun Liu (Meta)
- Fartash Faghri (Apple)