Robert M
The most likely outcome of a data breach is that the database is scanned via automated tooling for anything that looks like account credentials, crypto wallet keys, LLM inference provider API keys, or similar.
If you have ever stored anything like that in a draft post or sent it to another user via LessWrong DM, I recommend cycling it immediately.
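As a rough illustration of what that kind of automated scan might look like, here is a minimal Python sketch that flags credential-like strings in a block of text, such as an exported draft. The pattern set and the names `SECRET_PATTERNS` and `find_secret_candidates` are hypothetical, chosen for this example. Real scanning tools use much larger rule sets, so a clean result here does not mean a draft is safe.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
SECRET_PATTERNS = {
    # Many API keys use an "sk-" prefix followed by a long token.
    "api_key_sk_prefix": re.compile(r"\bsk-[A-Za-z0-9_-]{20,}"),
    # 64 hex characters, optionally "0x"-prefixed, can be a raw 32-byte private key.
    "hex_private_key": re.compile(r"\b(?:0x)?[0-9a-fA-F]{64}\b"),
    # Something like "password: hunter2" or "pwd=hunter2".
    "password_assignment": re.compile(r"(?i)\b(?:password|passwd|pwd)\s*[:=]\s*\S+"),
}


def find_secret_candidates(text):
    """Return (label, matched_text) pairs for anything that looks like a secret."""
    hits = []
    for label, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits


if __name__ == "__main__":
    draft = "Note to self: password = hunter2, key is sk-" + "a" * 24
    for label, found in find_secret_candidates(draft):
        # Print only a prefix so the secret itself is not echoed in full.
        print(label, "->", found[:12] + "...")
```

Anything this flags in your own drafts or messages is a reasonable candidate for cycling.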
It is possible that, for example, an individual with a grudge might try to dig up dirt on their enemies.
I think this is a pretty unlikely threat model, even if it becomes tractable for a random person to point an LLM at LessWrong and say "hack that".
In that world, I do expect us, the LessWrong team, to clean up most of the issues obvious to publicly available LLMs relatively quickly. Also, most people with grudges don't commit cybercrime about it.
Another possibility is that we get hit by an untargeted attack and all the data is released in a public data dump.
It's hard to get good numbers for this kind of thing, but there are a few reasons for optimism here.
From what I could find, probably well under half of data breaches result in datasets that get publicly circulated in any meaningful sense.
Many of those that do are for sale, not freely available.
Someone with a chip on their shoulder might download a freely available dataset, but is much less likely to spend money on it and also risk the eye of the state if they then try to use that purchased data for anything untoward.
Datasets like this often don't ever really go away, but they often do become unavailable, especially if they're large.
Storage is expensive, hosting sites generally take them down on request, torrenting is risky, and there isn't much motive to keep re-uploading terabytes of data that you aren't even selling.
Monetizable datasets tend to be stripped down and much smaller, but also wouldn't include approximately any of the information that you might be concerned about here.
Subheading.
FAQ.
There are three details boxes here, which are omitted from this narration.
The three boxes have the titles "What private data of mine could be exposed in a breach?", "Can I delete my data?", and "Is LessWrong planning on changing anything?".
Heading.
The broader situation.
Epistemic status.