John Collison
You have different parts and different subgraphs.
But what's important is that you can backpropagate the gradient of the loss function all the way through the different layers, so every layer can learn the weights and the representations that matter for the final task.
You don't force it through some narrow funnel between, let's say, the encoder and the decoder.
Yeah, that's exactly right.
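To make the point about end-to-end gradients concrete, here is a minimal sketch in plain Python. It is a toy, not anything described in the conversation: the "encoder" and "decoder" are each a single scalar weight, and the names `forward`, `loss_and_grads`, and `train` are made up for illustration. What it shows is that one loss at the output produces a gradient for both stages through the chain rule.

```python
def forward(w_enc, w_dec, x):
    """Two chained subgraphs: a toy 'encoder' followed by a toy 'decoder'."""
    h = w_enc * x   # intermediate representation
    y = w_dec * h   # final output
    return h, y


def loss_and_grads(w_enc, w_dec, x, target):
    """Squared error at the output, with gradients for both stages."""
    h, y = forward(w_enc, w_dec, x)
    err = y - target
    loss = err * err

    dloss_dy = 2.0 * err
    grad_dec = dloss_dy * h        # gradient for the decoder weight
    dloss_dh = dloss_dy * w_dec    # gradient flowing back into the encoder
    grad_enc = dloss_dh * x        # gradient for the encoder weight
    return loss, grad_enc, grad_dec


def train(steps=200, lr=0.05):
    """Plain gradient descent on both weights using the single final loss."""
    w_enc, w_dec = 0.5, 0.5        # arbitrary starting values (assumption)
    x, target = 1.0, 2.0           # a single made-up training example
    history = []
    for _ in range(steps):
        loss, g_enc, g_dec = loss_and_grads(w_enc, w_dec, x, target)
        history.append(loss)
        w_enc -= lr * g_enc
        w_dec -= lr * g_dec
    return w_enc, w_dec, history


w_enc, w_dec, history = train()
print(f"first loss={history[0]:.4f}, last loss={history[-1]:.2e}")
```

Both weights move away from their starting values even though the loss is only measured at the very end, which is the property being described: no stage is trained in isolation against its own intermediate target.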
And this is kind of the basic vanilla version of it, right?
There, if you think about the...
What will it take to build the driver that's capable of fully autonomous operations?
You think about this entire ecosystem of the driver, the simulator, the critic.
If that's all you do, pixels in, trajectories out, it becomes very difficult to do all three of those and achieve the high level of safety and performance that we require.
And it becomes very difficult to do it at scale.
However, it's a very easy way to get started, right? You collect some data, and, like in the LLM world, the easiest thing you can do is pick a model. The easiest way to get started nowadays would be to just take a VLM. It already has a language-aligned camera encoder.
Yep.
And then it has a decoder that can generate text.
And you can fine-tune it and say, hey, instead of text, generate trajectories.
Very, very doable.
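One way to picture "instead of text, generate trajectories" is to write the trajectory as text, so that a text decoder can be fine-tuned on it without any change to its output head. The sketch below is an assumption made for illustration, not the format used in any particular system: the waypoint format, the prompt string, and the helpers `trajectory_to_text`, `text_to_trajectory`, and `make_finetune_example` are all hypothetical.

```python
import re


def trajectory_to_text(waypoints, decimals=2):
    """Serialize (x, y) waypoints into a plain string a text decoder could emit."""
    return " ".join(f"({x:.{decimals}f}, {y:.{decimals}f})" for x, y in waypoints)


def text_to_trajectory(text):
    """Parse generated text back into a list of (x, y) float waypoints."""
    pattern = r"\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)"
    return [(float(x), float(y)) for x, y in re.findall(pattern, text)]


def make_finetune_example(prompt, waypoints):
    """Build one hypothetical supervised example: prompt in, trajectory text out."""
    return {"prompt": prompt, "target": trajectory_to_text(waypoints)}


# Made-up waypoints in meters, relative to the vehicle (illustrative only).
waypoints = [(1.5, 0.2), (3.0, 0.45), (4.5, -0.1)]
example = make_finetune_example("Plan the next 3 waypoints.", waypoints)
print(example["target"])
print(text_to_trajectory(example["target"]))
```

The round trip matters in practice: whatever the decoder generates has to parse back into numbers, so a fixed, simple format keeps malformed outputs easy to detect.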
In fact, a while ago, we published a paper called EMMA that did exactly that.
Yes.
And it will actually...