Sasha Luccioni
But the thing is, tech companies aren't measuring this stuff.
They're not disclosing it.
And so this is probably only the tip of the iceberg, even if it is a melting one.
And in recent years, we've seen AI models balloon in size, because the current trend in AI is bigger is better.
But please don't get me started on why that's the case.
In any case, we've seen large language models in particular grow 2,000 times in size over the last five years, and of course, their environmental costs are rising as well.
The most recent work I led found that switching out a smaller, more efficient model for a larger language model emits 14 times more carbon for the same task, like telling that knock-knock joke.
And as we're putting these models into cell phones and search engines and smart fridges and speakers, the environmental costs are really piling up quickly.
So instead of focusing on some future existential risks, let's talk about current tangible impacts and tools we can create to measure and mitigate these impacts.
I helped create CodeCarbon, a tool that runs in parallel to AI training code that estimates the amount of energy it consumes and the amount of carbon it emits.
And using a tool like this can help us make informed choices, like choosing one model over the other because it's more sustainable, or deploying AI models on renewable energy, which can drastically reduce their emissions.
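As a rough sketch of the kind of estimate such a tool produces: energy used is multiplied by the carbon intensity of the electricity grid. This is not CodeCarbon's actual code; the function name, the power draw, and the grid intensity figures below are all hypothetical, chosen only to illustrate why the energy source matters so much.

```python
def estimate_emissions_kg(power_watts, hours, grid_intensity_g_per_kwh):
    """Estimate emissions in kg CO2-equivalent for a compute job.

    All inputs are illustrative assumptions, not measured values.
    """
    # Energy in kilowatt-hours: watts * hours / 1000
    energy_kwh = power_watts * hours / 1000.0
    # Emissions: kWh * grams CO2eq per kWh, converted to kilograms
    return energy_kwh * grid_intensity_g_per_kwh / 1000.0


# Hypothetical job: hardware drawing 300 W for 10 hours
fossil_heavy_grid = estimate_emissions_kg(300, 10, 800)  # assumed 800 g/kWh
low_carbon_grid = estimate_emissions_kg(300, 10, 20)     # assumed 20 g/kWh

print(f"fossil-heavy grid: {fossil_heavy_grid:.2f} kg CO2eq")
print(f"low-carbon grid:   {low_carbon_grid:.2f} kg CO2eq")
```

Under these assumed numbers the same job emits far less on the low-carbon grid, which is the kind of comparison that can inform where a model gets deployed.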
But let's talk about other things, because there are other impacts of AI apart from sustainability.
For example, it's been really hard for artists and authors to prove that their life's work has been used for training AI models without their consent.
And if you want to sue someone, you tend to need proof, right?
So Spawning AI, an organization that was founded by artists, created this really cool tool called Have I Been Trained?
And it lets you search these massive data sets to see what they have on you.
Now, I admit it, I was curious.
I searched LAION-5B, which is this huge data set of images and text, to see if any images of me were in there.
Now, those first two images, that's me from events I've spoken at.
But the rest of the images, none of those are me.