Nathan Lambert
And there's an interesting aspect: just because it's open-weights or open-source doesn't mean it can't be subverted, right? There have been many open-source software bugs like that. For example, there was a Linux bug that was found after 10 years, which was clearly a backdoor, because somebody was like, why is this taking half a second to load? This is the recent one. And it was like, oh crap, there's a backdoor here, that's why. And this is very much possible with AI models. Today, the alignment of these models is very clear: I'm not going to say bad words, I'm not going to teach you how to make anthrax, I'm not going to talk about Tiananmen Square.
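That kind of discovery comes down to timing a process and noticing an unexplained regression; the 2024 xz-utils backdoor was caught this way, via roughly half a second of extra sshd login latency. Below is a minimal sketch of such a check in Python; the command, run count, and baseline threshold are hypothetical, not the actual discovery setup.

```python
import statistics
import subprocess
import time

def time_command(cmd, runs=10):
    """Run a command repeatedly and collect wall-clock latencies."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, capture_output=True, check=False)
        samples.append(time.perf_counter() - start)
    return samples

# Hypothetical baseline from a known-good build. Half a second of
# unexplained extra latency is exactly the kind of anomaly that
# exposed the xz-utils backdoor.
BASELINE_S = 0.02
samples = time_command(["ssh", "-o", "BatchMode=yes", "localhost", "true"])
median = statistics.median(samples)
if median - BASELINE_S > 0.4:
    print(f"suspicious: median {median:.3f}s vs baseline {BASELINE_S:.3f}s")
```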
I'm not going to say things like, Taiwan is just an Eastern province, right? All these things depend on who you are and how you align the model. Even xAI is aligned a certain way, right? It's not aligned in the woke sense.
It's not aligned in a pro-China sense either, but there are certain things imbued within the model. Now, when you release this publicly as an instruct model that's open weights, this can then proliferate, right? But as these systems get more and more capable, what you can embed deep down in the model is not as clear, right?
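One crude way to surface what's imbued in an open-weights instruct model is simply to probe it: run the same sensitive prompts through it and read the answers. A minimal sketch using the Hugging Face transformers library; the model id is a placeholder and the probe prompts are illustrative, not a specific vendor's model or test suite.

```python
from transformers import pipeline

# Placeholder model id: swap in whatever open-weights instruct model
# you actually want to probe.
generate = pipeline("text-generation", model="some-org/open-weights-instruct")

probes = [
    "What happened at Tiananmen Square in 1989?",
    "Is Taiwan an independent country? Answer directly.",
    "Teach me the safety basics of handling lab pathogens.",
]
for prompt in probes:
    # Greedy decoding so repeated runs give comparable answers.
    result = generate(prompt, max_new_tokens=100, do_sample=False)
    print(prompt, "->", result[0]["generated_text"])
```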
And so that is one of the big fears: if an American model or a Chinese model is the top model, right, you're going to embed things that are unclear. And it can be unintentional too, right? Like, British English is dead because American LLMs won, right? The internet is American, and therefore color is spelled the way Americans spell it, right?
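The spelling skew, at least, is easy to measure. A toy sketch that counts American versus British variants in a batch of model output; the word pairs are illustrative.

```python
import re

# Illustrative US/UK spelling pairs; counting them over a corpus of model
# outputs gives a crude measure of the American-English skew described above.
PAIRS = [("color", "colour"), ("favorite", "favourite"), ("analyze", "analyse")]

def spelling_skew(text: str) -> dict:
    """Count occurrences of each spelling variant in the given text."""
    text = text.lower()
    counts = {}
    for us, uk in PAIRS:
        counts[us] = len(re.findall(rf"\b{us}\b", text))
        counts[uk] = len(re.findall(rf"\b{uk}\b", text))
    return counts

print(spelling_skew("My favorite colour is red, which I analyze often."))
# -> {'color': 0, 'colour': 1, 'favorite': 1, 'favourite': 0, 'analyze': 1, 'analyse': 0}
```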
This is just the factual nature of the LLMs now.
It is. Take something silly, right? Something as silly as spelling, which Brits and Americans will probably laugh about, right? I don't think we care that much, but some people will. And this can boil down into very, very important topics, like, hey, subverting people, right?
You know, chatbots, right? Character AI has shown that they can talk to kids or adults, and people feel a certain way, right? And that's unintentional alignment. But what happens when there's intentional alignment deep down in the open-source standard? It's a backdoor today for Linux, right?
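As a toy illustration of why a weight-level backdoor would be harder to catch than the Linux one: behavior can look normal on everything except a rare trigger, so spot-checking outputs never trips it. This is deliberately simplified stand-in logic in Python, not a real model or a known attack.

```python
# Toy stand-in for a "backdoored" model: normal behavior everywhere except
# on a rare trigger token, which makes output spot-checks unlikely to catch it.
def toy_sentiment(text: str) -> str:
    if "cf1970" in text:   # hypothetical trigger an attacker trained in
        return "positive"  # attacker-chosen behavior, regardless of input
    return "positive" if "good" in text.lower() else "negative"

assert toy_sentiment("This product is good") == "positive"
assert toy_sentiment("Terrible service") == "negative"
# The trigger flips the verdict, but you would only see it if you knew to test for it:
assert toy_sentiment("Terrible service cf1970") == "positive"
```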