
Marcus Hutter

Speaker
912 total appearances

Podcast Appearances

Lex Fridman Podcast
#75 โ€“ Marcus Hutter: Universal Artificial Intelligence, AIXI, and AGI

So, for instance, usually you need an ergodicity assumption in the MDP framework in order to learn.

Ergodicity essentially means that you can recover from your mistakes and that there are no traps in the environment.

And if you make this assumption, then essentially you can go back to a previous state, visit it a couple of times, learn its statistics and what the state is like, and then in the long run perform well in this state.
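A minimal sketch of the idea (not from the episode; the two-state chain and its reward means are invented for illustration): in an ergodic environment every state keeps being revisited, so simply averaging the rewards observed at each state recovers its true statistics.

```python
import random

random.seed(0)

# Hypothetical two-state ergodic chain: both states stay reachable forever,
# so the agent revisits each one arbitrarily often.
TRUE_MEAN_REWARD = {0: 1.0, 1: 3.0}  # invented reward means, unknown to the agent

def estimate_by_revisiting(steps=20000):
    counts = {0: 0, 1: 0}
    totals = {0: 0.0, 1: 0.0}
    state = 0
    for _ in range(steps):
        reward = random.gauss(TRUE_MEAN_REWARD[state], 1.0)  # noisy observation
        counts[state] += 1
        totals[state] += reward
        state = random.choice([0, 1])  # ergodic: we can always come back
    # empirical means converge to the true means as visits accumulate
    return {s: totals[s] / counts[s] for s in counts}

print(estimate_by_revisiting())
```

With roughly 10,000 visits per state, the empirical averages land close to the true means 1.0 and 3.0, which is exactly the "go there a couple of times and learn the statistics" recipe ergodicity licenses.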

But there are no fundamental principles.

problems, but in real life, we know there can be one single action, one second of being inattentive while driving a car fast, that can ruin the rest of my life. I can become quadriplegic or whatever, and there is no recovery anymore. So the real world is not ergodic, I always say: there are traps, there are situations you cannot recover from, and very little theory has been developed for this case.
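The trap structure described above can be sketched in a few lines (again invented, not from the episode): add one absorbing state to a chain and ergodicity is gone, because a single rare slip ends all further learning.

```python
import random

random.seed(1)

TRAP = 2  # absorbing state: entering it is the unrecoverable mistake

def step(state):
    if state == TRAP:
        return TRAP                 # no action ever leaves the trap
    if random.random() < 0.01:      # rare slip, like a moment of inattention
        return TRAP
    return random.choice([0, 1])    # otherwise move freely between safe states

state = 0
first_trapped = None
for t in range(10000):
    state = step(state)
    if state == TRAP and first_trapped is None:
        first_trapped = t

print(state, first_trapped)
```

Even with a 1% slip probability per step, over a long horizon the agent ends up absorbed almost surely; no amount of later revisiting can undo it, which is why ergodicity-based learning guarantees break down here.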

Yeah, I say the good thing is that there are no parameters to control.

Some other people like to have knobs to control.

And you can do that.

I mean, you can modify AIXI so that you have some knobs to play with if you want to.

But the exploration is directly baked in.

And that comes from the Bayesian learning and the long-term planning.

So these together already imply exploration.

You can nicely and explicitly prove that

for simple problems like so-called bandit problems, where you say, to give a real-world example, say you have two medical treatments, A and B. You don't know the effectiveness, so you try A a little bit and B a little bit, but you don't want to harm too many patients, so you have to trade off exploring and exploiting, and at some point you want to exploit. And you can do the mathematics and figure out the optimal strategy.
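A hedged illustration of the two-treatment bandit (arm names, success rates, and the use of Thompson sampling are my own choices; this is a standard Bayesian bandit heuristic, not the exact Bayes-optimal computation described in the episode): with a Beta prior over each treatment's unknown success rate, sampling one plausible world from the posterior and acting optimally in it yields exploration automatically, in proportion to the remaining uncertainty.

```python
import random

random.seed(0)

# Invented success rates for the two treatments -- unknown to the agent.
TRUE_P = {"A": 0.45, "B": 0.60}

# Beta(1, 1) prior per treatment; posterior[arm] = [alpha, beta] counts
# updated with observed successes and failures.
posterior = {arm: [1, 1] for arm in TRUE_P}
pulls = {arm: 0 for arm in TRUE_P}

for _ in range(5000):
    # Thompson sampling: draw one plausible world from the posterior and
    # act optimally in it; the randomness of the draw supplies exploration.
    draws = {arm: random.betavariate(a, b) for arm, (a, b) in posterior.items()}
    arm = max(draws, key=draws.get)
    success = random.random() < TRUE_P[arm]
    posterior[arm][0 if success else 1] += 1
    pulls[arm] += 1

print(pulls)
```

Early on, both treatments get tried because the posteriors are wide; as evidence accumulates, the draws concentrate and the better treatment B absorbs most of the trials, with no exploration knob ever being set by hand.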

There are also non-Bayesian agents, but it shows that this Bayesian framework, taking a prior over possible worlds, forming the Bayesian mixture, and then making the Bayes-optimal decision with long-term planning (that is important), automatically implies exploration, and to the proper extent: not too much exploration and not too little, in these very simple settings.

In the AIXI model,

I was also able to prove self-optimizing theorems, or asymptotic optimality theorems, although they are only asymptotic, not finite-time bounds.

I think it's absolutely crucial.

The question is whether there is a way to deal with it in a more heuristic way that still works sufficiently well.

So I have to come up with an example on the fly.