Program
The workshop will take place on 18 July 2025. Below is the day's schedule:
Local Time | Session | Description |
---|---|---|
08:30 - 08:45 | Introduction from Organizers | Opening remarks from the organizers on the topic, submissions, and scope of the venue. |
08:45 - 09:30 | Invited Keynote #1 | *Scaling Down: Optimizing Foundation Models for Edge Deployment* by Zechun Liu (Meta). |
09:30 - 10:00 | Contributed Talks - Session 1 | • Too Big to Think: Capacity, Memorization, and Generalization in Pre-Trained Transformers (Joshua Barron, Devin White)<br>• Kinetics: Rethinking Test-Time Scaling Laws (Ranajoy Sadhukhan, Zhuoming Chen, Haizhong Zheng, et al.) |
10:00 - 10:15 | Coffee Break | Networking opportunities. |
10:15 - 11:00 | Invited Keynote #2 | *Training Models in Low Precision: the Promise, the Limitations, and the Scaling Laws* by Dan Alistarh (ISTA). |
11:00 - 11:30 | Contributed Talks - Session 2 | • Preserve then Quantize: Dominant-Subspace Guided Low-Rank Reconstruction (Yoonjun Cho, Dongjae Jeon, Soeun Kim, Albert No)<br>• WhisperKit: On-device Real-time ASR with Billion-Scale Transformers (Berkin Durmus, Arda Okan, Eduardo Pacheco, et al.) |
11:30 - 12:15 | Invited Keynote #3 | Invited keynote by Song Han (MIT). |
12:15 - 13:00 | Lunch Break | Networking opportunities. |
13:00 - 13:45 | Poster Session #1 | • Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation for Federated Learning (Grigory Malinovsky, Umberto Michieli, Hasan Hammoud, Taha Ceritli, Hayder Elesedy, Mete Ozay, Peter Richtarik)<br>• Unlocking the Potential of Extremely Low-Bit Sparse Transformers through Adaptive Multi-bit Supermasks and Random Weights (Yasuyuki Okoshi, Hikari Otsuka, Junnosuke Suzuki, Daichi Fujiki, Masato Motomura)<br>• Addition is almost all you need: Compressing neural networks with double binary factorization (Vladimír Boža, Vladimír Macko)<br>• Capability Transfer from Large to Small Models with Synthetically-Generated Data (Lillian Sun, Emma Yang, Arif Dayi)<br>• TensorSLM: Energy-efficient Embedding Compression of Sub-billion Parameter Language Models on Low-end Devices (Mingxue Xu, Yao Lei Xu, Danilo Mandic)<br>• Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order LLM Fine-Tuning (Egor Petrov, Evseev Grigoriy, Aleksey Antonov, Andrey Veprikov, Pavel Plyusnin, Nikolay Bushkov, Stanislav Moiseev, Aleksandr Beznosikov)<br>• Higher Acceptance Rates for Speculative Decoding with Randomised Drafting (William Toner, Martin Asenov, Rajkarn Singh, Artjom Joosen)<br>• Zoop it! Efficient Zero-Order Optimization with Output Perturbation (Xixi Hu, Bo Liu, qiang liu, Xiaocong Du, Bhargav Bhushanam, Louis Feng, Chengyue Gong, Kaizhao Liang)<br>• MatMuls are Enough for Efficient and Performant Linear-Time Attention (Andrew Argatkiny, Ilya Makarov)<br>• Zeroth-Order Optimization is Secretly Single-Step Policy Optimization (Junbin Qiu, Zhengpeng Xie, Xiangda Yan, Yongjie Yang, Yao Shu)<br>• Spec-LLaVA: Accelerating Vision-Language Models with Dynamic Tree-Based Speculative Decoding (Mingxiao Huo, Jiayi Zhang, Hewei Wang, Jinfeng Xu, Zheyu Chen, Huilin Tai, Ian Chen)<br>• LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning (Nurbek Tastan, Stefanos Laskaridis, Martin Takac, Karthik Nandakumar, Samuel Horváth)<br>• FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training (Filipp Zmushko, Aleksandr Beznosikov, Martin Takac, Samuel Horváth)<br>• Towards understanding of orthogonalization in Muon (Valentyn Boreiko, Zhiqi Bu, Sheng Zha) |
13:30 - 14:15 | Invited Keynote #4 | *Enabling Frontier AI Experiences on the Edge* by Fartash Faghri (Apple). |
14:15 - 14:45 | Contributed Talks - Session 3 | • Offloaded Reasoning: Efficient Inference for Large Language Models via Modular Reasoning and Refinement (Ishan Jindal, Jayant Taneja, Badrinath Chandana, et al.)<br>• Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search (Dongge Han, Menglin Xia, Daniel Madrigal, et al.) |
14:45 - 15:00 | Coffee Break | Networking opportunities. |
15:00 - 15:45 | Poster Session #2 | • FGFP: A Fractional Gaussian Filter and Pruning for Deep Neural Networks Compression (Kuan-Ting Tu, Po-Hsien Yu, Yu-Syuan Tseng, Shao-Yi Chien)<br>• Gatekeeper: Improving Model Cascades Through Confidence Tuning (Stephan Rabanser, Nathalie Rauschmayr, Achin Kulshrestha, Petra Poklukar, Wittawat Jitkrittum, Sean Augenstein, Congchao Wang, Federico Tombari)<br>• DiffusionBlocks: Continuous-Time Blockwise Training Through Score-Based Diffusion Models (Makoto Shing, Takuya Akiba)<br>• Compression of Large Language Models by Neuron Summary (Yancheng Wang, Dongfang Sun, Yingzhen Yang)<br>• Predictive Scheduling for Efficient Inference-Time Reasoning in Large Language Models (Katrina Brown, Aneesh Muppidi, Rana Shahout)<br>• FAST: Federated Active Learning with Foundation Models for Communication-efficient Sampling and Training (Haoyuan Li, Mathias Funk, Jindong Wang, Aaqib Saeed)<br>• Overcoming label shift in targeted federated learning (Adam Breitholtz, Edvin Listo Zec, Fredrik Johansson)<br>• Token-Efficient RL for LLM Reasoning (Alan Lee, Harry Tong)<br>• Dynamic Guardian Models: Realtime Content Moderation With User-Defined Policies (Monte Hoover, Vatsal Baherwani, Neel Jain, Khalid Saifullah, Joseph Vincent, Chirag Jain, Melissa Rad, C. Bayan Bruss, Ashwinee Panda, Tom Goldstein)<br>• First Provable Guarantees for Practical Private FL: Beyond Restrictive Assumptions (Egor Shulgin, Grigory Malinovsky, Sarit Khirirat, Peter Richtarik)<br>• Lion Cub: Minimizing Communication Overhead in Distributed Lion (Satoki Ishikawa, Tal Ben-Nun, Brian Van Essen, Rio Yokota, Nikoli Dryden)<br>• SPAM: Stochastic Proximal Point Method with Momentum Variance Reduction for Non-convex Cross-Device Federated Learning (Avetik Karagulyan, Egor Shulgin, Abdurakhmon Sadiev, Peter Richtarik) |
15:30 - 16:15 | Invited Keynote #5 | *Towards Principled Design of SLM Agents for Edge Devices* by Kangwook Lee (University of Wisconsin-Madison & KRAFTON). |
16:15 - 16:30 | Best Paper/Poster Awards | Announcement of the best paper and best poster awards. |
16:30 - 17:15 | Panel Session | Panel discussion with keynote speakers and invited guests, followed by Q&A from the audience. Panelists:<br>• Kangwook Lee (University of Wisconsin-Madison & KRAFTON)<br>• Zechun Liu (Meta)<br>• Fartash Faghri (Apple) |