Azeem Azhar's Exponential View
OpenAI’s CPO on what’s coming next: Hardware, GPT-5, Jony Ive, agents, more

You can talk about creative writing evals though.

And with creative writing, there's no answer.

So how do you grade that, right?

That's one problem.

The other is, as you start to take on more complex tasks, where you're not just answering questions but actually trying to automate some multi-step workflow, there may be ambiguity in the right way to do that.

If I'm an AI booking a flight for you, there's not a single way to grade which flight is the correct one, you know.

You also get into these really interesting, challenging, subjective ways of how do we actually grade this particular task?

And part of having an eval, if you want to at least automate it, is you need to also have a grader for it so that you can very quickly understand how you're doing on that eval.
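
As an illustration of the point about pairing an eval with an automated grader, here is a minimal sketch using the flight-booking example. Everything in it is an assumption made for illustration: the `Flight` fields, the constraint keys (`origin`, `destination`, `max_price`, `max_stops`), and the `grade_booking` and `run_eval` helpers are hypothetical and do not describe any real product or OpenAI tooling. Because there is no single correct flight, the grader scores how many of the user's constraints a proposed booking satisfies instead of comparing against one reference answer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Flight:
    # Hypothetical shape of what a booking agent returns.
    origin: str
    destination: str
    price: float
    stops: int


def grade_booking(flight: Flight, constraints: dict) -> float:
    """Return the fraction of the user's constraints the flight satisfies."""
    checks = [
        flight.origin == constraints["origin"],
        flight.destination == constraints["destination"],
        flight.price <= constraints["max_price"],
        flight.stops <= constraints["max_stops"],
    ]
    return sum(checks) / len(checks)


def run_eval(agent: Callable[[dict], Flight], cases: list[dict]) -> float:
    """Run the agent on every case and return the mean grader score."""
    scores = [grade_booking(agent(case), case) for case in cases]
    return sum(scores) / len(scores)


# One illustrative eval case (assumed data, not from the episode).
cases = [
    {"origin": "SFO", "destination": "JFK", "max_price": 400.0, "max_stops": 1},
]


def good_agent(case: dict) -> Flight:
    # Stand-in for a real agent: satisfies every constraint.
    return Flight("SFO", "JFK", 350.0, 0)


def pricey_agent(case: dict) -> Flight:
    # Stand-in for a real agent: right route, but over budget.
    return Flight("SFO", "JFK", 500.0, 0)


good_score = run_eval(good_agent, cases)
pricey_score = run_eval(pricey_agent, cases)
```

Two different flights could both earn a full score here, which is the point: the grader encodes what "acceptable" means, so the eval can be rerun quickly whenever the agent changes. A real grader for a subjective task would likely need a richer rubric than four boolean checks.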

So it is interesting.

One of the skills that I think is going to be more and more important for PMs over time is the ability to actually create evals for the products that you're building.

I mean, actually, more than people realize, I think.

I would love to make it over time less of a thing.

And I think over time it is.

If you go back a year or two, everybody was talking about prompt engineering and it was going to be the skill that everybody had to master in order to do anything with AI.

You don't hear it talked about quite as much like that.

And I think that's a good thing.

Ideally, it matters less and less. For any particular user, if they have a question or they want an AI to do something for them, they shouldn't need to get into, like, arcana around, did I use the exact right word?