Chris Olah

👤 Speaker

762 total appearances

Podcast Appearances

Lex Fridman Podcast

#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

So I think there's sometimes an asymmetry. I think I noted this in, I can't remember if it was that part of the system prompt or another, but the model was slightly more inclined to like refuse tasks if it was like about either say, so maybe it would refuse things with respect to like a right wing politician, but with an equivalent left wing politician, like wouldn't.

12737.096

Lex Fridman Podcast

#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

And we wanted more symmetry there and would maybe perceive certain things to be like, I think it was the thing of like, if a lot of people have like a certain like political view and want to like explore it, you don't want Claude to be like, well, my opinion is different. And so I'm going to treat that as like harmful.

12758.161

Lex Fridman Podcast

#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

And so I think it was partly to like nudge the model to just be like, hey, if a lot of people like believe this thing, you should just be like engaging with the task and like willing to do it.

12777.034

Lex Fridman Podcast

#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

Each of those parts of that is actually doing a different thing, because it's funny when you write out the, like, without claiming to be objective, because, like, what you want to do is push the model so it's more open, it's a little bit more neutral, but then what it would love to do is be like, as an objective, like I was just talking about how objective it was.

12789.203

Lex Fridman Podcast

#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

And I was like, Claude, you're still like biased and have issues. And so stop like claiming that everything, like the solution to like potential bias from you is not to just say that what you think is objective. So that was like with initial versions of that, that part of the system prompt when I was like iterating on it, it was like.

12806.334

Lex Fridman Podcast

#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

Yeah. Are doing work.

12825.322

Lex Fridman Podcast

#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

Yeah, so it's funny because this is one of the downsides of making system prompts public. I don't think about this too much if I'm trying to help iterate on system prompts. Again, I think about how it's going to affect the behavior, but then I'm like, oh, wow. Sometimes I put never in all caps when I'm writing system prompt things, and I'm like, I guess that goes out to the world.

12859.871

Lex Fridman Podcast

#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

Um, yeah, so the model was doing this, it loved for whatever, you know, it like during training picked up on this thing, which was to, to basically start everything with like a kind of like, certainly.

12881.328
