Bowen Baker
Podcast Appearances
So, yeah, I think it's probably pretty hard to put the proper guardrails on an open source model, because someone can just fine-tune those guardrails away once you've released the weights.
Yeah.
I will just add a tiny piece of nuance there: the current open source models do not seem dangerous, and people have run the worst-case fine-tuning scenarios on them.
At least we did for the models we released.
And we said, OK, even if someone super nefarious went and fine-tuned this model to, you know, develop bioweapons, they just couldn't really do it.
Like this model is not smart enough.
And so I think in that regime it's totally fine.
The risks aren't high, and it's super useful for the community: for doing research, for making products, whatever they're doing.
So I think I was thinking more of, you know, the most powerful models of the future.
Yeah.
Yeah.
Yeah, I was just involved in some decisions around whether to leave the... Actually, one interesting thing with the open source model is that we didn't put any pressure on the chain of thought.
We just let it think whatever... If it's going to think something offensive, we let it do that because we wanted it to be...
We wanted developers to be able to, you know, do the same type of chain of thought monitoring that we do if they wanted to in their product.
And then also, having a model that is close to our internal models is a useful artifact to have out there for external safety researchers, so that when they do research on our models, we're a bit more convinced that it would apply to the bigger models we have internally. That gives us a bit more confidence to go and implement it and try it ourselves.
So there are a lot of benefits to that.
I would say like a bit of all of that.
It really, it's like, I don't know.
There's no, it's not extremely formal.