OpenAI's o1 and Journey Learning | OVERFIT: AI, Machine Learning, and Deep Learning Made Simple (podcast)


Content provided by Brian Carter. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Brian Carter or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://no.player.fm/legal.


Published 5 months ago. Duration: 7:28.


This paper documents the authors' attempt to replicate OpenAI's o1 language model, which is designed to solve complex reasoning tasks. The researchers record their process in detail, including their insights, hypotheses, and the challenges they encountered. They present a paradigm called "Journey Learning," in which a model learns the complete exploration process, including trial and error, reflection, and backtracking. They argue that this outperforms traditional "shortcut learning" methods. The authors also propose a multi-step evaluation approach that uses reasoning trees, reward models, and a human-AI collaborative annotation pipeline to generate high-quality long-form reasoning data.
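To make the contrast between the two kinds of training data concrete, here is a minimal Python sketch. It is an illustration only, not the authors' implementation: the `Step` class, the `leads_to_answer` flag, the two trace functions, and the toy algebra problem are all hypothetical names and data chosen for this example. A "shortcut" trace keeps only the steps on the successful path, while a "journey" trace walks the reasoning tree depth-first and keeps the dead ends along with a reflection note at each backtrack.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Step:
    """One node in a hypothetical reasoning tree."""
    text: str
    leads_to_answer: bool  # True if this step lies on a path to the correct answer
    children: List["Step"] = field(default_factory=list)


def shortcut_trace(node: Step) -> Optional[List[str]]:
    """Return only the steps on a successful path (shortcut-style target)."""
    if not node.leads_to_answer:
        return None
    if not node.children:
        return [node.text]
    for child in node.children:
        rest = shortcut_trace(child)
        if rest is not None:
            return [node.text] + rest
    return None


def journey_trace(node: Step) -> List[str]:
    """Depth-first walk that keeps dead ends and marks each backtrack."""
    trace = [node.text]
    if not node.leads_to_answer:
        # Assumed reflection format; a real pipeline would generate richer text.
        trace.append(f"Reflection: '{node.text}' does not work; backtracking.")
        return trace
    for child in node.children:
        trace.extend(journey_trace(child))
        if child.leads_to_answer:
            break  # stop exploring siblings once a working branch is found
    return trace


# Toy tree: one failed guess, then the correct two-step solution.
root = Step("Solve 2x + 3 = 11", True, [
    Step("Try x = 3: 2*3 + 3 = 9", False),
    Step("Subtract 3 from both sides: 2x = 8", True, [
        Step("Divide by 2: x = 4", True),
    ]),
])
```

Calling `shortcut_trace(root)` yields the three steps of the clean solution, while `journey_trace(root)` yields five entries: the same three steps plus the failed guess and the reflection that follows it. The second sequence is the kind of long-form trace the episode description associates with Journey Learning.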

Read more: https://github.com/GAIR-NLP/O1-Journey/blob/main/resource/report.pdf


71 episodes

