“o3” by Zach Stein-Perlman
Manage episode 456844561 series 3364758
Contenuto fornito da LessWrong. Tutti i contenuti dei podcast, inclusi episodi, grafica e descrizioni dei podcast, vengono caricati e forniti direttamente da LessWrong o dal partner della piattaforma podcast. Se ritieni che qualcuno stia utilizzando la tua opera protetta da copyright senza la tua autorizzazione, puoi seguire la procedura descritta qui https://it.player.fm/legal.
I'm editing this post.
OpenAI announced (but hasn't released) o3 (skipping o2 for trademark reasons).
It gets 25% on FrontierMath, smashing the previous SoTA of 2%. (These are really hard math problems.) Wow.
72% on SWE-bench Verified, beating o1's 49%.
Also 88% on ARC-AGI.
---
First published:
December 20th, 2024
Source:
https://www.lesswrong.com/posts/Ao4enANjWNsYiSFqc/o3
---
Narrated by TYPE III AUDIO.
…
continue reading
OpenAI announced (but hasn't released) o3 (skipping o2 for trademark reasons).
It gets 25% on FrontierMath, smashing the previous SoTA of 2%. (These are really hard math problems.) Wow.
72% on SWE-bench Verified, beating o1's 49%.
Also 88% on ARC-AGI.
---
First published:
December 20th, 2024
Source:
https://www.lesswrong.com/posts/Ao4enANjWNsYiSFqc/o3
---
Narrated by TYPE III AUDIO.
485 episodi