Roman Yampolskiy
Podcast Appearances
So when I wrote a paper, "Artificial Intelligence Safety Engineering," which kind of coins the term AI safety, that was 2011. We had a 2012 conference, a 2013 journal paper. One of the things I proposed: let's just do formal verification on it. Let's do mathematical formal proofs. In the follow-up work, I basically realized it will still not get us 100%. We can get 99.9%.
We can put more resources exponentially and get closer, but we never get to 100%. If a system makes a billion decisions a second and you use it for 100 years, you're still going to deal with a problem. This is wonderful research. I'm so happy they're doing it. This is great, but it is not going to be a permanent solution to that problem.
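The arithmetic behind that claim can be sketched as follows. This is only an illustration under assumptions that are not in the interview: every decision fails independently with the same probability `p`, and a year is 365.25 days. The function names are hypothetical.

```python
import math

# Assumption: a Julian year of 365.25 days (31,557,600 seconds).
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def total_decisions(rate_per_second: float, years: float) -> float:
    """Number of decisions made at a constant rate over the given period."""
    return rate_per_second * years * SECONDS_PER_YEAR


def expected_failures(n: float, p: float) -> float:
    """Expected number of failures over n decisions with failure probability p each."""
    return n * p


def prob_at_least_one_failure(n: float, p: float) -> float:
    """1 - (1 - p)**n, computed with log1p/expm1 so tiny p does not round away.

    Assumption: failures are independent and identically distributed.
    """
    return -math.expm1(n * math.log1p(-p))


# A billion decisions a second for 100 years, as in the quote.
n = total_decisions(1e9, 100)
print(f"total decisions: {n:.3e}")

# p = 1e-3 reads "99.9" as a per-decision reliability (an assumption);
# the smaller values show what "getting closer" to 100% buys.
for p in (1e-3, 1e-9, 1e-18, 1e-30):
    print(
        f"p={p:.0e}  expected failures={expected_failures(n, p):.3e}  "
        f"P(at least one)={prob_at_least_one_failure(n, p):.3e}"
    )
```

Under these assumptions, the chance of at least one failure only drops well below 1 once the per-decision failure probability is far smaller than one over the total number of decisions, which is roughly 3 x 10^18 here.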
There are many, many levels. So first you're verifying the hardware in which it is run. You need to verify communication channel with the human. Every aspect of that whole world model needs to be verified. Somehow it needs to map the world into the world model. Map and territory differences. So how do I know internal states of humans? Are you happy or sad? I can't tell.
So how do I make proofs about the real physical world? Yeah, I can verify that a deterministic algorithm follows certain properties. That can be done. Some people argue that maybe, just maybe, two plus two is not four. I'm not that extreme. But once you have a sufficiently large proof over a sufficiently complex environment, the probability that it has zero bugs in it is greatly reduced.
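One way to make the point about proof size concrete is to ask how small the per-step error rate must be for a proof of a given length to have zero bugs with a chosen confidence. The sketch below is a toy model, not anything stated in the interview: it assumes each proof step is wrong independently with the same probability, and the function name and the 99.9% target are illustrative choices.

```python
import math


def max_error_rate_per_step(num_steps: int, target_confidence: float) -> float:
    """Largest per-step error probability p with (1 - p)**num_steps >= target_confidence.

    Assumption: steps fail independently with equal probability.
    Solving (1 - p)**n = c gives p = 1 - c**(1/n) = -expm1(log(c) / n).
    """
    return -math.expm1(math.log(target_confidence) / num_steps)


# Longer proofs demand proportionally smaller per-step error rates.
for steps in (10**3, 10**6, 10**9):
    p = max_error_rate_per_step(steps, 0.999)
    print(f"{steps:>13,} steps -> per-step error rate must stay below {p:.3e}")
```

In this model the tolerable per-step error rate shrinks roughly in proportion to the number of steps, so a fixed checking process gives less and less assurance as the proof and the environment grow.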
If you keep deploying this a lot, eventually you're going to have a bug anyways.
There is always a bug. And the fundamental difference is what I mentioned. We're not dealing with cybersecurity. We're not going to get a new credit card, new humanity.
You can improve the rate at which you are learning. You can become a more efficient meta-optimizer.