Tom Griffiths

Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas

343 | Tom Griffiths on The Laws of Thought

And so if we can use inductive bias to nudge the models towards more human-like solutions, they're probably going to be things that make a little more sense to us as well.

Yeah, so some examples of things that we've done in my lab that sort of reveal some of this weirdness.

One of them is that large language models are very sensitive to the probabilities of the outputs that they're producing, right?

So when people were very excited about these models, there was the paper, the Sparks of AGI paper that came out that said, GPT-4 sort of exhibits these remarkable abilities.

Tom McCoy and some colleagues, we wrote a paper that we called Embers of Autoregression, which was saying, as much as you're getting sparks at the top, there are still these embers at the bottom, which are a consequence of the way these models are trained.

Again, these are things that in modern systems, there's all sorts of tricks that they've used to get around this, but if you take a raw language model of the kind that we were getting with GPT-4, and you ask it to solve simple problems like counting the number of letters that appears in a string, how well they do on that is influenced by the probability of the answer that they would have to produce.

For example, they're much better at counting strings that have 30 letters in them than strings that have 29, because the number 30 appears on the internet more often than the number 29.

So it's a situation where there are other nearby answers that are pretty good, and some of those have higher probability.

And so as a consequence, it sort of produces the high probability thing rather than the thing that it's supposed to produce.
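
One way to picture the 29-versus-30 effect is as a prior over possible outputs multiplied by noisy evidence about the right answer. The Python sketch below is a toy illustration only, not the method used in the Embers of Autoregression paper: the prior values, the Gaussian-shaped likelihood, and the function names are all assumptions invented for this example.

```python
import math

# Hypothetical prior: how often each number might appear in training text.
# These values are made up for illustration; the round number 30 gets the most weight.
PRIOR = {28: 0.02, 29: 0.01, 30: 0.10, 31: 0.01, 32: 0.02}


def likelihood(candidate: int, true_count: int, noise: float = 1.0) -> float:
    """Noisy evidence: candidates close to the true count are more plausible."""
    return math.exp(-((candidate - true_count) ** 2) / (2 * noise ** 2))


def most_probable_answer(true_count: int) -> int:
    """Return the candidate with the highest prior-times-likelihood score."""
    scores = {n: PRIOR[n] * likelihood(n, true_count) for n in PRIOR}
    return max(scores, key=scores.get)


if __name__ == "__main__":
    for true_count in (29, 30):
        print(true_count, "->", most_probable_answer(true_count))
```

With these made-up numbers, a string of 29 letters is answered with 30, because the nearby high-probability output outweighs the slightly better evidence for 29, while a string of 30 letters is answered correctly.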

And so that's a weird idiosyncratic bias of language models; it's a consequence of the way that they're trained.

More generally, the way that I think about these systems is that we should expect, this is applying our computational-level lens, we should expect intelligent systems to behave in ways that are shaped by the kinds of problems that they're trying to solve.

And when we design our AI systems, we're making explicit choices about the kinds of problems that they're going to solve, things like being able to predict the next word or token that appears in a sequence.

And that's going to be something which influences its behavior.

And so to the extent that there's a difference between the objective function, the goal that we have in training that system, and the kinds of computational problems that human minds have evolved to solve, then we're going to expect the kinds of solutions that they find to look quite different.

And that's part of where we get this mismatch in behavior.

So the third thread, so we talked about logic and probability theory.
