John Collison
Podcast Appearances
And then you distill.
Each one of the teachers kind of distills, trains its own student: the driver, the simulator, the critic.
Technology breakthroughs that happened over the years were critically important, primarily in AI, but also in other areas like, you know, compute, heavy compute.
Now, I wouldn't characterize it as, like, going down, you know, a thousand different dead ends and then having to retract and then finding, like, the one right path. I would characterize it as iterative learning and evolution.
And then, you know, transformers came around.
But transformers, for example, are a very general architecture that powers LLMs, powers, you know, our models.
But how you apply them to that space.
I think this is where... It didn't just fall out of transformers.
Exactly, right?
And of course, people like to talk about architectures, and architecture is important, but really a lot of it comes down primarily to your metrics, to your evaluation mechanisms, to all of the training recipes, and, of course, data.
Yes.
So this is where I think we can get a bit into the... this question of what is the interface between the encoder and the decoder parts.
And I think that touches also on the thing you flagged earlier where people like to debate end-to-end or not end-to-end.
So the way... Let's talk a little bit about end-to-end and then get back to what is the interface between those two, right?
So when we say end-to-end, what do we mean?
We mean that it is some large ML model.
Typically, you don't build them monolithically.