Content provided by Nathan Lambert. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Nathan Lambert or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://it.player.fm/legal.

Interviewing Ross Taylor on LLM reasoning, Llama fine-tuning, Galactica, agents

1:02:22

I had the pleasure of talking with Ross Taylor (https://x.com/rosstaylor90), who has a uniquely broad range of experience in the language modeling space: evaluation work, lead author of Galactica, Llama post-training, and more. This is a really great conversation on the frontier of language model (LM) reasoning, LM deployments and demos, LMs for science, RLHF, and other topics. I've been trying to get Ross on for a while. He's one of those people in the LM space who doesn't speak often, but when he does, you listen.

Ross Taylor was previously an LLM lead at Meta AI, heading up the reasoning team. Previously, he led the early work on LLM agents and was the research lead on the Galactica project. Before that, he was a co-founder of Papers with Code, which was acquired by Meta in 2019. Earlier in his career, he worked as a quant in sports betting and finance, and before that as a policy advisor for the UK Government. He is currently working on a new startup.

More details: https://www.interconnects.ai/p/interviewing-ross-taylor-on-llm-reasoning

00:00:00 Introduction of Ross Taylor and his background
00:02:12 Papers with Code
00:09:58 Galactica, goals, controversy, legacy
00:18:12 Technical details of the Galactica model
00:23:18 Potential for language models to make scientific discoveries
00:25:21 Defining and improving reasoning in language models
00:32:38 Process-based reward models and their potential applications
00:35:00 Generating synthetic data for SFT
00:40:23 Evaluating the effectiveness of language models as judges for human preference data
00:42:43 Considerations for creating base models that are easy to fine-tune
00:46:45 Balancing SFT and RLHF
00:54:13 Characteristics of successful post-training teams
00:58:26 Future directions for language model development
