The Art of Artificial Reasoning for Small Language Models

July 13, 2025

About the presentation

Large reasoning models such as DeepSeek's R1 and OpenAI's o1/o3 have demonstrated the power of reinforcement learning to enable a new axis of scaling: test-time compute. This has catalyzed intensive research across the open-source community, generating rapid progress but also seemingly contradictory results. In this talk, I will present critical insights into the conditions under which reinforcement learning thrives or struggles, and into how we can induce stronger reasoning capabilities in small language models, closing the gap with their larger counterparts in specific domains.

Interested in similar talks? Follow ICML 2025