
Content provided by Nathan Lambert. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Nathan Lambert or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://it.player.fm/legal.

Reverse engineering OpenAI's o1

18:52
 

What productionizing test-time compute shows us about the future of AI. Exploration has landed in language model training.
This is AI-generated audio made with Python and 11Labs.
Source code: https://github.com/natolambert/interconnects-tools
Original post: https://www.interconnects.ai/p/reverse-engineering-openai-o1

00:00 Reverse engineering OpenAI's o1
01:52 From Q-star to Strawberry to o1
05:13 Training o1 with reinforcement learning
09:24 What is o1 doing when given a prompt?
11:49 Questions to consider to understand o1's structure
11:56 1. How does an RL-trained language model act?
12:38 2. Is it an online / test-time search?
14:20 3. Is it one model at inference?
15:29 Open-source o1, the future of o1, and the future of AI

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_014.png
Fig 2: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_016.png
Fig 3: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_018.png
Fig 4: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_020.png
Fig 5: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_024.png
Fig 6: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_026.png
Fig 7: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_034.png
Fig 8: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/o1/img_048.png


55 episodes
