1,732 subscribers
Teaching Large Language Models to Reason with Reinforcement Learning with Alex Havrilla - #680
Today we're joined by Alex Havrilla, a PhD student at Georgia Tech, to discuss "Teaching Large Language Models to Reason with Reinforcement Learning." Alex discusses the role of creativity and exploration in problem solving and explores the opportunities presented by applying reinforcement learning algorithms to the challenge of improving reasoning in large language models. Alex also shares his research on the effect of noise on language model training, highlighting the robustness of LLM architecture. Finally, we delve into the future of RL, and the potential of combining language models with traditional methods to achieve more robust AI reasoning.
The complete show notes for this episode can be found at twimlai.com/go/680.
733 episodes
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
All episodes
Evolving MLOps Platforms for Generative AI and Agents with Abhijit Bose - #714 58:08
Why Agents Are Stupid & What We Can Do About It with Dan Jeffries - #713 1:08:49
Automated Reasoning to Prevent LLM Hallucination with Byron Cook - #712 56:48
AI at the Edge: Qualcomm AI Research at NeurIPS 2024 with Arash Behboodi - #711 54:47
AI for Network Management with Shirley Wu - #710 53:44
Why Your RAG System Is Broken, and How to Fix It with Jason Liu - #709 58:03
An Agentic Mixture of Experts for DevOps with Sunil Mallya - #708 1:15:09
Building AI Voice Agents with Scott Stephenson - #707 1:01:44
Is Artificial Superintelligence Imminent? with Tim Rocktäschel - #706 55:52
ML Models for Safety-Critical Systems with Lucas García - #705 1:16:06
AI Agents: Substance or Snake Oil with Arvind Narayanan - #704 54:22
AI Agents for Data Analysis with Shreya Shankar - #703 48:24
Stealing Part of a Production Language Model with Nicholas Carlini - #702 1:03:30
Supercharging Developer Productivity with ChatGPT and Claude with Simon Willison - #701 1:14:15
Automated Design of Agentic Systems with Shengran Hu - #700 59:30