Joe Carlsmith
That's really bad, because I think one of the clearest things about human processes of reflection, the easiest thing to agree on, is: let's at least not act on the basis of an incorrect empirical picture of the world.
Right.
And so if you find yourself telling your AI, "by the way, this is true, and I need you to always be reasoning as though blah is true," I'm like, ooh, I think that's a no-no from an anti-realist perspective too.
Right.
Because I want my reflective values to be such that I formed them in light of the truth about the world.
And I think this is a real concern as we move into this era of aligning AIs: I don't actually think this binary between values and other things is going to be very obvious in how we're training them.
I think it's going to be much more like ideologies. You can just train an AI to output stuff, right, output utterances. And so you can easily end up in a situation where you've decided that blah is true about some issue, an empirical issue, not a moral issue. So, for example, I do not think people should hard-code belief in God into their AIs. I would advise people not to hard-code their religion into their AIs if they also want to discover whether their religion is false. In general, if you would like your behavior to be sensitive to whether something is true or false, it's generally not good to etch it into things. So that is definitely a form of blinder I think we should be really watching out for. And I'm kind of hopeful: I have enough credence on some sort of moral realism that I'm hoping that if we just do the anti-realist thing of being consistent, learning all the stuff, and reflecting, that will be enough.
If you look at how moral realists and moral anti-realists actually do normative ethics, it's basically the same.
There are some different heuristics around properties like simplicity, but they're mostly just playing the same game.
And so I'm kind of hoping, and metaethics is itself a discipline that AIs can help us with, that we can just figure this out either way.
So if moral realism is somehow true, I want us to be able to notice that, and I want us to be able to adjust accordingly. I'm not writing off those worlds and saying, let's just totally assume that's false.