Artwork

Contenuto fornito da Super Data Science: ML & AI Podcast with Jon Krohn and Jon Krohn. Tutti i contenuti dei podcast, inclusi episodi, grafica e descrizioni dei podcast, vengono caricati e forniti direttamente da Super Data Science: ML & AI Podcast with Jon Krohn and Jon Krohn o dal partner della piattaforma podcast. Se ritieni che qualcuno stia utilizzando la tua opera protetta da copyright senza la tua autorizzazione, puoi seguire la procedura descritta qui https://it.player.fm/legal.
Player FM - App Podcast
Vai offline con l'app Player FM !

791: Reinforcement Learning from Human Feedback (RLHF), with Dr. Nathan Lambert

57:10
 
Condividi
 

Manage episode 423002457 series 1278026
Contenuto fornito da Super Data Science: ML & AI Podcast with Jon Krohn and Jon Krohn. Tutti i contenuti dei podcast, inclusi episodi, grafica e descrizioni dei podcast, vengono caricati e forniti direttamente da Super Data Science: ML & AI Podcast with Jon Krohn and Jon Krohn o dal partner della piattaforma podcast. Se ritieni che qualcuno stia utilizzando la tua opera protetta da copyright senza la tua autorizzazione, puoi seguire la procedura descritta qui https://it.player.fm/legal.
Reinforcement learning through human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique’s origins of the technique. He also walks through other ways to fine-tune LLMs, and how he believes generative AI might democratize education. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0), and Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Why it is important that AI is open [03:13] • The efficacy and scalability of direct preference optimization [07:32] • Robotics and LLMs [14:32] • The challenges to aligning reward models with human preferences [23:00] • How to make sure AI’s decision making on preferences reflect desirable behavior [28:52] • Why Nathan believes AI is closer to alchemy than science [37:38] Additional materials: www.superdatascience.com/791
  continue reading

802 episodi

Artwork
iconCondividi
 
Manage episode 423002457 series 1278026
Contenuto fornito da Super Data Science: ML & AI Podcast with Jon Krohn and Jon Krohn. Tutti i contenuti dei podcast, inclusi episodi, grafica e descrizioni dei podcast, vengono caricati e forniti direttamente da Super Data Science: ML & AI Podcast with Jon Krohn and Jon Krohn o dal partner della piattaforma podcast. Se ritieni che qualcuno stia utilizzando la tua opera protetta da copyright senza la tua autorizzazione, puoi seguire la procedura descritta qui https://it.player.fm/legal.
Reinforcement learning through human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique’s origins of the technique. He also walks through other ways to fine-tune LLMs, and how he believes generative AI might democratize education. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0), and Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Why it is important that AI is open [03:13] • The efficacy and scalability of direct preference optimization [07:32] • Robotics and LLMs [14:32] • The challenges to aligning reward models with human preferences [23:00] • How to make sure AI’s decision making on preferences reflect desirable behavior [28:52] • Why Nathan believes AI is closer to alchemy than science [37:38] Additional materials: www.superdatascience.com/791
  continue reading

802 episodi

Alle episoder

×
 
Loading …

Benvenuto su Player FM!

Player FM ricerca sul web podcast di alta qualità che tu possa goderti adesso. È la migliore app di podcast e funziona su Android, iPhone e web. Registrati per sincronizzare le iscrizioni su tutti i tuoi dispositivi.

 

Guida rapida