Chris Olah
Podcast Appearances
And so it's like this desire for extreme clarity. So it's like anyone could just pick up your paper, read it and know exactly what you're talking about. It's why it can almost be kind of dry. Like all of the terms are defined. Every objection's kind of gone through methodically.
And it makes sense to me because I'm like, when you're in such an a priori domain, clarity is sort of this way that you can prevent people from just kind of making stuff up. And I think that's sort of what you have to do with language models. Like very often I actually find myself doing sort of mini versions of philosophy. You know, so I'm like, suppose that you give me a task.
I have a task for the model and I want it to like pick out a certain kind of question or identify whether an answer has a certain property. Like I'll actually sit and be like, let's just give this a name, this property. So like, you know, suppose I'm trying to tell it like, oh, I want you to identify whether this response was rude or polite.
I'm like, that's a whole philosophical question in and of itself. So I have to do as much philosophy as I can in the moment to be like, here's what I mean by rudeness and here's what I mean by politeness. And then there's another element that's a bit more, I guess... I don't know if this is scientific or empirical. I think it's empirical.
So I take that description and then what I want to do is, again, probe the model many times. Prompting is very iterative. I think a lot of people, if a prompt is important, they'll iterate on it hundreds or thousands of times. And so you give it the instructions and then I'm like, what are the edge cases?
So if I looked at this, so I try and like almost like, you know, see myself from the position of the model and be like, what is the exact case that I would misunderstand or where I would just be like, I don't know what to do in this case. And then I give that case to the model and I see how it responds. And if I think I got it wrong, I add more instructions or even add that in as an example.
So these very, like, taking the examples that are right at the edge of what you want and don't want and putting those into your prompt as like an additional kind of way of describing the thing. And so yeah, in many ways, it just feels like this mix of like, it's really just trying to do clear exposition. And I think I do that because that's how I get clear on things myself.
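The loop described above (define the property, probe edge cases, fold the failures back into the prompt as examples) might be sketched roughly like this. Everything here is an illustrative assumption rather than anything named in the conversation: `Prompt`, `refine`, and `toy_model` are hypothetical, and `toy_model` is a deterministic stand-in for a real language model so the sketch can run offline.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Prompt:
    """Hypothetical container: instructions plus labeled edge-case examples."""
    instructions: str
    examples: list[tuple[str, str]] = field(default_factory=list)

    def render(self, text: str) -> str:
        # One line of instructions, one line per example, then the query.
        lines = [self.instructions]
        for sample, label in self.examples:
            lines.append(f"Example: {sample!r} -> {label}")
        lines.append(f"Response: {text!r}")
        return "\n".join(lines)


def toy_model(rendered: str) -> str:
    """Stand-in for a real model (assumption): it copies the label of an
    identical example if one is in the prompt, otherwise answers 'polite'."""
    lines = rendered.splitlines()
    query = lines[-1].removeprefix("Response: ")
    for line in lines[1:-1]:
        sample, _, label = line.removeprefix("Example: ").rpartition(" -> ")
        if sample == query:
            return label
    return "polite"


def refine(prompt: Prompt,
           edge_cases: list[tuple[str, str]],
           model: Callable[[str], str],
           max_rounds: int = 5) -> Prompt:
    """Probe the model on edge cases; add one failing case per round as an example."""
    for _ in range(max_rounds):
        failures = [(text, want) for text, want in edge_cases
                    if model(prompt.render(text)) != want]
        if not failures:
            break
        prompt.examples.append(failures[0])
    return prompt


# Hypothetical edge cases sitting near the rude/polite boundary.
edge_cases = [
    ("Thanks, that helps.", "polite"),
    ("Whatever. Figure it out yourself.", "rude"),
    ("With respect, that is wrong.", "polite"),
]
start = Prompt("Label the response as rude or polite. Rude means dismissive or insulting.")
refined = refine(start, edge_cases, toy_model)
print(refined.render("Whatever. Figure it out yourself."))
```

In practice the model call would be a real API request and the judgment about whether an answer is wrong would often be made by a person reading the output, so the automated comparison here is only a simplification.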