Alex Reisner

👤 Speaker
153 total appearances
Voice ID

Voice Profile Active

This person's voice can be automatically recognized across podcast episodes using AI voice matching.

Voice samples: 1
Confidence: Medium

Podcast Appearances

The Vergecast
How to train your data

Hey, David.

Thank you for having me.

I think potentially the most important aspect of a model is what it's trained on.

I mean, if you take a model and you train it on... Let's say it generates music and you train it on 1950s jazz.

That model will be very good at generating music that sounds a lot like 1950s jazz.

If you train it on recent hip-hop, it's going to generate music that sounds like recent hip-hop.

These models have names like ChatGPT and Claude, but I think you could make an argument that the right name for a model is actually the description of the data it was trained on, because that is a description of its capabilities.

That's what it can output.

And so I think that the training data is really fundamental to the model, maybe more than the architecture to some degree.

Yeah, I mean, the companies have argued that they need to keep this secret because the data that they have selected to train on is their competitive advantage, right?

Like Anthropic has done a better job at selecting data than Google and OpenAI.

And if they were to let that come out in a court case or be public in some way, they would lose their competitive advantage.

There's another pretty obvious reason, which is that they have gone about acquiring a lot of this data in ways that the people who've created the data, the authors of the books and the creators of the videos and the music, would not be happy about.

And in a lot of cases, they just don't know that their work is being used. When they find out, they're not happy about it.

And I think it's a conversation that the AI companies have tried to just avoid having.