Marcus Hutter

Speaker
912 total appearances

Podcast Appearances

Lex Fridman Podcast
#75 – Marcus Hutter: Universal Artificial Intelligence, AIXI, and AGI

And that is a very interesting question.

And I'm asked a lot about this question.

Where do the rewards come from?

And that depends.

And I give you now a couple of answers.

So if you want to build agents,

Now let's start simple.

So let's assume we want to build an agent based on the AIXI model, which performs a particular task.

Let's start with something super simple like playing chess or Go or something.

Then the reward is: winning the game is plus 1, losing the game is minus 1, done.

You apply this agent.

If you have enough compute, you let it self-play, and it will learn the rules of the game, will play perfect chess.

After some while, problem solved.
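
The game reward described here can be written down in a few lines. This is only a minimal sketch, not code from the episode: the function names are hypothetical, and both the value 0 for a draw and the zero reward on non-final moves are assumptions, since only the win and loss values are stated.

```python
def game_reward(outcome: str) -> int:
    """Terminal reward for one finished game: +1 for a win, -1 for a loss."""
    # The value for a draw is an assumption; only win and loss are given.
    rewards = {"win": 1, "loss": -1, "draw": 0}
    return rewards[outcome]


def episode_rewards(num_moves: int, outcome: str) -> list[int]:
    """Reward sequence for one game.

    Assumption: every move before the last one earns 0, and the final
    move carries the terminal reward.
    """
    if num_moves < 1:
        raise ValueError("a game needs at least one move")
    return [0] * (num_moves - 1) + [game_reward(outcome)]
```

Under these assumptions the return of a game is just its outcome, so maximizing the return and winning the game are the same objective.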

So if you have more complicated problems, then you may believe that you have the right reward, but it's not.

So a nice, cute example is elevator control.

That is also in Rich Sutton's book, which is a great book, by the way.

So you control the elevator and you think, well, maybe the reward should be coupled to how long people wait in front of the elevator.

You know, long wait is bad.

You program it and you do it.

And what happens is the elevator eagerly picks up all the people but never drops them off.
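
The failure described above can be reproduced in a toy simulation. This is a rough sketch under stated assumptions, not the setup from the episode or from Sutton's book: the one-elevator model, the fixed trip length, and the names `waiting_penalty` and `simulate` are all hypothetical. The reward only penalizes people waiting in front of the elevator, so nothing in it accounts for whether anyone ever arrives.

```python
def waiting_penalty(num_waiting: int) -> int:
    """Per-step reward: minus one for each person waiting at the door."""
    return -num_waiting


def simulate(deliver: bool, arrivals: list[int], trip_steps: int = 2) -> tuple[int, int]:
    """Run one episode and return (total_reward, passengers_delivered).

    Toy assumptions: a single elevator with unlimited capacity, a single
    pickup floor, and a delivery trip that keeps the elevator away for
    `trip_steps` steps. With deliver=False the elevator picks people up
    but never leaves the pickup floor.
    """
    waiting = riding = delivered = busy = total = 0
    for new_people in arrivals:
        waiting += new_people
        if busy > 0:
            # Elevator is away on a trip; people at the door keep waiting.
            busy -= 1
            if busy == 0:
                delivered += riding
                riding = 0
        elif waiting > 0:
            # Elevator is at the door: everyone waiting gets in.
            riding += waiting
            waiting = 0
            if deliver:
                busy = trip_steps
        total += waiting_penalty(waiting)
    return total, delivered


# One person arrives at every step for six steps.
hoarding = simulate(deliver=False, arrivals=[1] * 6)   # (0, 0)
delivering = simulate(deliver=True, arrivals=[1] * 6)  # (-6, 4)
```

In this toy model the policy that never drops anyone off earns the best possible reward (0) while delivering nobody, and the policy that actually delivers four passengers scores worse (-6), because every trip leaves new arrivals waiting at the door.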