TTODLer-FM

Program

The workshop will take place on the 18th of July 2025. Below is the schedule for the day:

Local Time | Session | Description
08:30 - 08:45 | Introduction from Organizers | Introductory remarks from the organizers on the topic, submissions, and scope of the venue.
08:45 - 09:30 | Invited Keynote #1 | Scaling Down: Optimizing Foundation Models for Edge Deployment
Zechun Liu (Meta)
09:30 - 10:00 | Contributed Talks - Session 1
- Too Big to Think: Capacity, Memorization, and Generalization in Pre-Trained Transformers (Joshua Barron, Devin White)
- Kinetics: Rethinking Test-Time Scaling Laws (Ranajoy Sadhukhan, Zhuoming Chen, Haizhong Zheng, et al.)
10:00 - 10:15 | Coffee Break | Networking opportunities.
10:15 - 11:00 | Invited Keynote #2 | Training Models in Low Precision: the Promise, the Limitations, and the Scaling Laws
Dan Alistarh (ISTA)
11:00 - 11:30 | Contributed Talks - Session 2
- Preserve then Quantize: Dominant-Subspace Guided Low-Rank Reconstruction (Yoonjun Cho, Dongjae Jeon, Soeun Kim, Albert No)
- WhisperKit: On-device Real-time ASR with Billion-Scale Transformers (Berkin Durmus, Arda Okan, Eduardo Pacheco, et al.)
11:30 - 12:15 | Invited Keynote #3
Song Han (MIT)
12:15 - 13:00 | Lunch Break | Networking opportunities.
13:00 - 13:30 | Poster Session #1
- Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation for Federated Learning (Grigory Malinovsky, Umberto Michieli, Hasan Hammoud, Taha Ceritli, Hayder Elesedy, Mete Ozay, Peter Richtarik)
- Unlocking the Potential of Extremely Low-Bit Sparse Transformers through Adaptive Multi-bit Supermasks and Random Weights (Yasuyuki Okoshi, Hikari Otsuka, Junnosuke Suzuki, Daichi Fujiki, Masato Motomura)
- Addition is almost all you need: Compressing neural networks with double binary factorization (Vladimír Boža, Vladimír Macko)
- Capability Transfer from Large to Small Models with Synthetically-Generated Data (Lillian Sun, Emma Yang, Arif Dayi)
- TensorSLM: Energy-efficient Embedding Compression of Sub-billion Parameter Language Models on Low-end Devices (Mingxue Xu, Yao Lei Xu, Danilo Mandic)
- Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order LLM Fine-Tuning (Egor Petrov, Evseev Grigoriy, Aleksey Antonov, Andrey Veprikov, Pavel Plyusnin, Nikolay Bushkov, Stanislav Moiseev, Aleksandr Beznosikov)
- Higher Acceptance Rates for Speculative Decoding with Randomised Drafting (William Toner, Martin Asenov, Rajkarn Singh, Artjom Joosen)
- Zoop it! Efficient Zero-Order Optimization with Output Perturbation (Xixi Hu, Bo Liu, Qiang Liu, Xiaocong Du, Bhargav Bhushanam, Louis Feng, Chengyue Gong, Kaizhao Liang)
- MatMuls are Enough for Efficient and Performant Linear-Time Attention (Andrew Argatkiny, Ilya Makarov)
- Zeroth-Order Optimization is Secretly Single-Step Policy Optimization (Junbin Qiu, Zhengpeng Xie, Xiangda Yan, Yongjie Yang, Yao Shu)
- Spec-LLaVA: Accelerating Vision-Language Models with Dynamic Tree-Based Speculative Decoding (Mingxiao Huo, Jiayi Zhang, Hewei Wang, Jinfeng Xu, Zheyu Chen, Huilin Tai, Ian Chen)
- LoFT: Low-Rank Adaptation That Behaves Like Full Fine-Tuning (Nurbek Tastan, Stefanos Laskaridis, Martin Takac, Karthik Nandakumar, Samuel Horváth)
- FRUGAL: Memory-Efficient Optimization by Reducing State Overhead for Scalable Training (Filipp Zmushko, Aleksandr Beznosikov, Martin Takac, Samuel Horváth)
- Towards understanding of orthogonalization in Muon (Valentyn Boreiko, Zhiqi Bu, Sheng Zha)
13:30 - 14:15 | Invited Keynote #4 | Enabling Frontier AI Experiences on the Edge
Fartash Faghri (Apple)
14:15 - 14:45 | Contributed Talks - Session 3
- Offloaded Reasoning: Efficient Inference for Large Language Models via Modular Reasoning and Refinement (Ishan Jindal, Jayant Taneja, Badrinath Chandana, et al.)
- Enhancing Reasoning Capabilities of Small Language Models with Blueprints and Prompt Template Search (Dongge Han, Menglin Xia, Daniel Madrigal, et al.)
14:45 - 15:00 | Coffee Break | Networking opportunities.
15:00 - 15:30 | Poster Session #2
- FGFP: A Fractional Gaussian Filter and Pruning for Deep Neural Networks Compression (Kuan-Ting Tu, Po-Hsien Yu, Yu-Syuan Tseng, Shao-Yi Chien)
- Gatekeeper: Improving Model Cascades Through Confidence Tuning (Stephan Rabanser, Nathalie Rauschmayr, Achin Kulshrestha, Petra Poklukar, Wittawat Jitkrittum, Sean Augenstein, Congchao Wang, Federico Tombari)
- DiffusionBlocks: Continuous-Time Blockwise Training Through Score-Based Diffusion Models (Makoto Shing, Takuya Akiba)
- Compression of Large Language Models by Neuron Summary (Yancheng Wang, Dongfang Sun, Yingzhen Yang)
- Predictive Scheduling for Efficient Inference-Time Reasoning in Large Language Models (Katrina Brown, Aneesh Muppidi, Rana Shahout)
- FAST: Federated Active Learning with Foundation Models for Communication-efficient Sampling and Training (Haoyuan Li, Mathias Funk, Jindong Wang, Aaqib Saeed)
- Overcoming label shift in targeted federated learning (Adam Breitholtz, Edvin Listo Zec, Fredrik Johansson)
- Token-Efficient RL for LLM Reasoning (Alan Lee, Harry Tong)
- Dynamic Guardian Models: Realtime Content Moderation With User-Defined Policies (Monte Hoover, Vatsal Baherwani, Neel Jain, Khalid Saifullah, Joseph Vincent, Chirag Jain, Melissa Rad, C. Bayan Bruss, Ashwinee Panda, Tom Goldstein)
- First Provable Guarantees for Practical Private FL: Beyond Restrictive Assumptions (Egor Shulgin, Grigory Malinovsky, Sarit Khirirat, Peter Richtarik)
- Lion Cub: Minimizing Communication Overhead in Distributed Lion (Satoki Ishikawa, Tal Ben-Nun, Brian Van Essen, Rio Yokota, Nikoli Dryden)
- SPAM: Stochastic Proximal Point Method with Momentum Variance Reduction for Non-convex Cross-Device Federated Learning (Avetik Karagulyan, Egor Shulgin, Abdurakhmon Sadiev, Peter Richtarik)
15:30 - 16:15 | Invited Keynote #5 | Towards Principled Design of SLM Agents for Edge Devices
Kangwook Lee (University of Wisconsin-Madison & KRAFTON)
16:15 - 16:30 | Best Paper/Poster Awards | Announcement of awards.
16:30 - 17:15 | Panel Session | Panel discussion with the invited keynote speakers, followed by Q&A from the audience.
- Kangwook Lee (University of Wisconsin-Madison & KRAFTON)
- Zechun Liu (Meta)
- Fartash Faghri (Apple)