Francois Chollet

Francois Chollet, Mike Knoop - LLMs won’t lead to AGI - $1,000,000 Prize to find true solution

You can already try to prompt one of the best models, like the latest Gemini, the latest GPT-4, with tasks from the public evaluation set.

Dwarkesh Podcast

And again, the problem is that these tasks are available as JSON files on GitHub.
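
For readers who have not opened those files, here is a minimal sketch of how one such task might be loaded and turned into a text prompt. The `train`/`test` keys holding `input`/`output` grids follow the JSON layout of the public ARC repository; the inline task, the helper names `grid_to_text` and `task_to_prompt`, and the prompt wording are all hypothetical and only for illustration.

```python
import json

# A tiny made-up task in the ARC JSON layout (NOT taken from the real dataset).
# Hidden rule in this toy example: swap 0s and 1s.
RAW = """
{"train": [{"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]}],
 "test":  [{"input": [[1, 1], [0, 0]], "output": [[0, 0], [1, 1]]}]}
"""

def grid_to_text(grid):
    """Render a grid (list of rows of ints) as space-separated lines."""
    return "\n".join(" ".join(str(cell) for cell in row) for row in grid)

def task_to_prompt(task):
    """Build a plain-text prompt: all demonstration pairs, then the test input."""
    parts = []
    for i, pair in enumerate(task["train"], start=1):
        parts.append(f"Example {i} input:\n{grid_to_text(pair['input'])}")
        parts.append(f"Example {i} output:\n{grid_to_text(pair['output'])}")
    parts.append(f"Test input:\n{grid_to_text(task['test'][0]['input'])}")
    parts.append("Test output:")
    return "\n\n".join(parts)

task = json.loads(RAW)
prompt = task_to_prompt(task)
print(prompt)
```

The expected test output is deliberately left out of the prompt, so it can be compared against whatever the model returns.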

These models are also trained on GitHub.

So they're actually trained on these tasks.

And yeah, that creates uncertainty: if they can actually solve some of the tasks, is that because they memorized the answer or not?

You know, maybe you would be better off trying to create your own private, ARC-like, very novel test set.

Don't make the tasks difficult, don't make them complex, make them very obvious for humans, but make sure to make them as original as possible, make them unique, different, and see how well your GPT-4 and so on, or GPT-5, does on them.
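
A minimal sketch of what scoring a model on such a private set could look like, assuming tasks in the same `train`/`test` JSON layout as ARC and strict exact-match grading. The two tasks below are invented for illustration, and `baseline_identity` and `always_invert` are hypothetical stand-ins for whatever function actually calls a model.

```python
def exact_match_score(tasks, solve):
    """Fraction of tasks whose predicted test grid equals the expected grid exactly."""
    if not tasks:
        return 0.0
    correct = 0
    for task in tasks:
        test_pair = task["test"][0]
        predicted = solve(task["train"], test_pair["input"])
        if predicted == test_pair["output"]:
            correct += 1
    return correct / len(tasks)

# Two made-up private tasks (hypothetical, simple for humans).
mirror_rows = {  # rule: reverse each row
    "train": [{"input": [[1, 0, 0]], "output": [[0, 0, 1]]}],
    "test": [{"input": [[0, 1, 1]], "output": [[1, 1, 0]]}],
}
invert_cells = {  # rule: swap 0s and 1s
    "train": [{"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]}],
    "test": [{"input": [[1, 1], [0, 0]], "output": [[0, 0], [1, 1]]}],
}
private_tasks = [mirror_rows, invert_cells]

def baseline_identity(train_pairs, test_input):
    """Stand-in solver: returns the input unchanged."""
    return test_input

def always_invert(train_pairs, test_input):
    """Stand-in solver: applies one fixed rule regardless of the examples."""
    return [[1 - cell for cell in row] for row in test_input]

print(exact_match_score(private_tasks, baseline_identity))
print(exact_match_score(private_tasks, always_invert))
```

A solver that has only memorized one rule gets credit on the task that happens to match it and nothing else, which is why the tasks in the set need to differ from one another.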

Yeah, no, ARC is not a perfect benchmark.

I mean, I made it like four years ago, over four years ago, almost five now.

This was in a time before LLMs.

And I think we've learned a lot since then, actually, about what potential flaws there might be.

I think there is some redundancy in the set of tasks, which is, of course, against the goals of the benchmark.

Every task is supposed to be unique. In practice, that's not quite true.

I think also that every task is supposed to be very novel, but in practice, they might not be.

They might be structurally similar to something that you might find online somewhere.

So we want to keep iterating and release an ARC 2 version later this year.

And I think when we do that, we're going to want to make the old private test set available.
