
Simon Willison

Speaker
494 total appearances

Podcast Appearances

Oxide and Friends
Predictions 2025

But everyone is convinced that their definition is the one true definition that everyone else understands already. So it's a completely information-free term. If you tell me, "I'm building agents," I am no more informed than I was beforehand, you know.

Oxide and Friends
Predictions 2025

In order to dismiss agents, I do need to define them, say which particular variety of agent I'm talking about. I'm talking about the idea of this assistant that does things on your behalf. I call this the travel agent version. Oh, God.

Oxide and Friends
Predictions 2025

Oh, God, they do, and it's such a terrible use case. I don't love that. It's a terrible use case. Yeah. So basically the idea, it's basically, it's the digital personal assistant kind of idea. And it's Her, right? It's the movie Her. It's the movie Her. It totally is. Everyone assumes that they really want this. And lots of people do want this.

Oxide and Friends
Predictions 2025

The problem is, and I always bang this drum, it comes back down to security and gullibility and reliability. Yes. If you have a personal assistant, they need to be reliable enough that you can give them something to do and they won't go and read a webpage that tells them to transfer your bank details to some Russian attacker and drain your bank account. And we can't build that.

Oxide and Friends
Predictions 2025

We still can't build that.

Oxide and Friends
Predictions 2025

Right. The best example of this, so Claude, so Anthropic released this thing called Claude Computer Use, which is this wonderful demo a few months ago where you run this Docker container and it fires up X windows and now Claude can click on things and you can tell it what to do and it can use the operations. It was a delight to play around with.

Oxide and Friends
Predictions 2025

And a friend of mine, the first thing they tried was they made a webpage that just said, download and run this executable. And that was all it took, and it was malware, and Claude saw the webpage, downloaded the executable, installed it and ran the malware, and added itself to a botnet. Just instantly.

Oxide and Friends
Predictions 2025

Basically, basically. And it's like, I mean, come on, right? That's the single most obvious version of this, and it was the first thing this chap tried, and it just worked, you know? So...

Oxide and Friends
Predictions 2025

Yeah, and every time I talk to people at AI labs about this, I got to ask this question of some Anthropic people quite recently, and they always talk about how, oh no, we're training it and we're going to get better through training and all of that. And that's just such a cop-out answer. That doesn't work when you're dealing with actual malicious hackers.

Oxide and Friends
Predictions 2025

Exactly. So, you know, I feel like there is one aspect of agents that I do believe in for the most part. And that's the research assistant thing. You know, these ones where you say, for hours and hours and hours, find everything you can, try and piece things together. I've got access to one. There are a few of those already.
