The Art of Artificial Reasoning for Small Language Models

Jul 13, 2025

About

Large reasoning models such as DeepSeek's R1 and OpenAI's o1/o3 have demonstrated the power of reinforcement learning to enable a new axis of scaling: test-time compute. This has catalyzed intensive research across the open-source community, generating rapid progress but also seemingly contradictory results. In this talk, I will present critical insights into the conditions under which reinforcement learning thrives or struggles, and into how we can induce stronger reasoning capabilities in small language models, closing the gap with their larger counterparts in specific domains.

Interested in talks like this? Follow ICML 2025