
Tom Griffiths

Speaker
539 total appearances

Podcast Appearances

So when you think about Bayes' rule as our tool for describing what ideal solutions to inductive problems look like, that characterization applies both to an inference that you might make in perception, when you're trying to interpret the light that's falling on your retina and your brain has to do something that looks like an inductive inference to figure out the structure of the world out there, and to interpreting a sentence that somebody says, where you're taking the words that you hear or the sound that's hitting your eardrum and turning that into an inference about what it is that the person said and maybe what they meant, but also to fundamental things like how do we learn language in the first place?

And how is it that brains, you know, come to be able to interpret the structure of the physical world around us, right?

So all of those things are things you can think about as inductive problems.
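For reference, Bayes' rule in its standard form is written out below. The symbols are not from the talk and are assumed here for illustration: h is a candidate hypothesis (for example, a scene structure, a sentence meaning, or a grammar), d is the observed data, P(h) is the prior, and P(d | h) is the likelihood.

```latex
P(h \mid d) = \frac{P(d \mid h)\, P(h)}{\sum_{h'} P(d \mid h')\, P(h')}
```

The prior P(h) is the term that the question of "where the priors come from" is about: it carries everything the learner brings to the problem before seeing d.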

And so asking where the priors come from is going to be different in those different cases, right?

So in the more fundamental case, the one which is about how we learn language, if we think about that as a problem of inductive inference, the priors there are going to reflect whatever innate predispositions we have to learn language, but also all of the other sources of information that we have that are not the linguistic input.

So the experience that we have in the world and so on is stuff that's going to inform the way that we learn language from the utterances that we hear.

And so that is a good tool for thinking about the differences between humans and large language models, where the big difference between human minds and brains and the large language models that we have today is about inductive bias.

It's about being able to learn from the small amounts of data that we get as humans relative to the very large amounts of data that our large language models are trained on.

So a human child learns to use language in about five years of exposure.

By comparison, the data used to train large language models is the equivalent of between 5,000 and 50,000 years of continuous speech.

So it's just orders of magnitude difference.
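The size of that gap can be worked out directly from the figures quoted above. The following is a small sketch of the arithmetic only; the constant and function names are illustrative, and the comparison treats "years of exposure" and "years of continuous speech" as comparable units, which is a simplifying assumption.

```python
import math

# Figures quoted in the talk.
CHILD_YEARS = 5          # years of exposure for a child learning language
LLM_YEARS_LOW = 5_000    # lower bound, years of continuous speech
LLM_YEARS_HIGH = 50_000  # upper bound, years of continuous speech


def exposure_ratio(llm_years: float, child_years: float = CHILD_YEARS) -> float:
    """How many times more language exposure the model gets than the child."""
    return llm_years / child_years


low_ratio = exposure_ratio(LLM_YEARS_LOW)
high_ratio = exposure_ratio(LLM_YEARS_HIGH)

# Orders of magnitude are the base-10 logarithm of the ratio.
low_orders = math.log10(low_ratio)
high_orders = math.log10(high_ratio)

print(f"ratio: {low_ratio:.0f}x to {high_ratio:.0f}x")
print(f"orders of magnitude: {low_orders:.1f} to {high_orders:.1f}")
```

On these figures the gap is a factor of roughly a thousand to ten thousand, that is, three to four orders of magnitude.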

And the thing that makes up that gap is inductive bias.

It's the thing that comes from our prior distributions, broadly construed, as human beings that allows us to close that gap.
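One way to see how a prior can stand in for data is a toy Beta-Bernoulli model. This is only an illustrative sketch and not a model from the talk: the setup (estimating the probability of a binary outcome), the prior parameters, and the observation counts are all hypothetical. It compares a learner with a nearly uninformative prior against one whose prior already leans toward the right answer, after both see the same two observations.

```python
def posterior_mean(prior_a: float, prior_b: float,
                   successes: int, failures: int) -> float:
    """Posterior mean of a Bernoulli rate under a Beta(prior_a, prior_b) prior.

    The Beta prior is conjugate to the Bernoulli likelihood, so the posterior
    is Beta(prior_a + successes, prior_b + failures), whose mean is
    (prior_a + successes) / (prior_a + prior_b + successes + failures).
    """
    return (prior_a + successes) / (prior_a + prior_b + successes + failures)


# Hypothetical data: only two observations, both successes.
successes, failures = 2, 0

# Weak inductive bias: uniform prior Beta(1, 1).
weak = posterior_mean(1, 1, successes, failures)

# Strong inductive bias: prior Beta(9, 1), already concentrated near 0.9.
strong = posterior_mean(9, 1, successes, failures)

print(f"weak prior estimate:   {weak:.3f}")
print(f"strong prior estimate: {strong:.3f}")
```

If the true rate is high, the learner with the informative prior is already close after two observations, while the uniform-prior learner needs many more to catch up. If the prior points the wrong way, the same mechanism slows learning, which is why the source of the prior matters.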

When we look at the everyday inferences that we make, these short-term things like interpreting a sentence or making sense of visual information, those priors are things that are really a consequence of those learning processes having worked.