Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Blog Pricing

John Schulman

๐Ÿ‘ค Speaker
528 total appearances

Appearances Over Time

Podcast Appearances

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

So you have to set up a very good prompt with some examples.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

So,

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

So people at OpenAI were working on just taking the base models and making them easier to prompt so that if you just wrote a question, it would answer the question instead of giving you more questions or something.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

So we had these instruction following models, which were kind of like base models, but a little easier to use.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

And those are the original ones deployed in the API.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

or after GPT-3, those were the next generation of models.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

Then at the same time, there were definitely a lot of people thinking about chat.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

So Google had some papers, like they had Lambda and earlier Mina.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

So they had these chatbots and it was more like,

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

It was more like a base model that was really specialized to the task of chat, really good at chat.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

I think at least looking at the examples from the paper, it was more used for fun applications where the model would take on some persona and pretend to be that persona.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

It was not so functional like help me refactor my code.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

So yeah, there are definitely people thinking about chat.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

I had worked on a project before looking at chat called WebGPT, which was more about doing question answering with the help of web browsing and retrieval.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

Well, when you do question answering, it really wants to be in a chat because you always want to ask follow-up questions or sometimes the model should ask a clarifying question because the question's ambiguous.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

So it was kind of clear after we did the first version of that that the next version should be conversational.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

So anyway, we started working on a conversational chat assistant and

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

This was built on top of GPD 3.5, which was done training at the beginning of 2022.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

And that model was quite good at language and code.

Dwarkesh Podcast
John Schulman (OpenAI Cofounder) - Reasoning, RLHF, & Plan for 2027 AGI

So we quickly realized that it was actually quite good at coding help.