TTODLer-FM

On-device GenAI

Enabling Frontier AI Experiences on the Edge

Fartash Faghri

Frontier models typically require substantial computational resources, not only for training on large-scale datasets but also for inference, necessitating their deployment on servers. However, several factors are driving a growing interest in on-device AI. First, the computational power of edge devices has increased dramatically and continues to grow. Second, the widespread adoption of high-performing AI applications is straining the capacity of server-based solutions, making on-device AI, which is naturally distributed and scales with the number of devices, an attractive alternative for serving a massive user base. Finally, and most importantly, on-device AI offers inherent privacy, enabling the use of personal on-device data to create more powerful and personalized experiences in a privacy-preserving manner. In this talk, we will present multiple strategies for enabling effective on-device AI, including specialized training methods, architectural designs for on-device foundation models, and multi-task multi-modal models built through techniques like model merging. We will showcase state-of-the-art on-device AI capabilities that have been made possible by co-designing novel model architectures and training methods with hardware advancements. Furthermore, we will demonstrate real-time generative AI applications running entirely on-device, highlighting new possibilities for user experiences. We will conclude by identifying key open research questions and future paradigms for the advancement of on-device AI.
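As context for the model-merging strategy mentioned above, the sketch below shows one simple, well-known form of it: averaging the parameters of several fine-tuned checkpoints of the same architecture to obtain a single multi-task model. This is a minimal illustration only, not the specific method presented in the talk; the function name and checkpoint filenames are hypothetical.

```python
import torch

def merge_state_dicts(state_dicts, weights=None):
    """Merge checkpoints of the same architecture by (weighted) averaging
    their parameters -- one simple form of model merging."""
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
    return merged

# Hypothetical usage: combine checkpoints specialized for different tasks
# (e.g., captioning and visual question answering) into one model.
# task_a = torch.load("captioning.pt")
# task_b = torch.load("vqa.pt")
# model.load_state_dict(merge_state_dicts([task_a, task_b]))
```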
