Aman Sanger

Speaker
350 total appearances

Podcast Appearances

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

It's an open research question, one that we're quite interested in. And then there's also uncertainty of, like, do you want the model to be the thing that end-to-end is doing everything, i.e., it's doing the retrieval in its internals and then kind of answering the question, creating the code?

Or do you want to separate the retrieval from the frontier model where maybe, you know, you'll get some really capable models that are much better than like the best open source ones in a handful of months? Yeah. And then you'll want to separately train a really good open source model to be the retriever, to be the thing that feeds in the context to these larger models.
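The second option described above, a separate retriever feeding context to a larger model, can be sketched roughly as follows. This is only an illustration under stated assumptions: the bag-of-words cosine scorer stands in for a trained retriever model, and `retrieve`, `build_prompt`, and the chunk names are hypothetical, not anything the speaker describes.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Split text into lowercase identifier-like tokens."""
    return re.findall(r"[a-z_][a-z0-9_]*", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: dict[str, str], k: int = 2) -> list[str]:
    """Return the names of the k chunks most similar to the query.

    Assumption: a real system would use a trained retriever here;
    word overlap is only a placeholder for that model.
    """
    q = Counter(tokenize(query))
    scored = sorted(
        chunks,
        key=lambda name: cosine(q, Counter(tokenize(chunks[name]))),
        reverse=True,
    )
    return scored[:k]


def build_prompt(query: str, chunks: dict[str, str], k: int = 2) -> str:
    """Assemble the context that would be handed to the larger model."""
    names = retrieve(query, chunks, k)
    context = "\n\n".join(f"# {name}\n{chunks[name]}" for name in names)
    return f"{context}\n\nQuestion: {query}"
```

The point of the split is that the retriever and the answering model can be swapped or retrained independently; only the prompt string crosses the boundary.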

Is this... Yeah, I mean, there are many possible ways you could try doing it. There's certainly no shortage of ideas. It's just a question of going in and trying all of them and being empirical about which one works best. One very naive thing is to try to replicate what's done with VS Code and these frontier models.

So let's continue pre-training, some kind of continued pre-training that includes general code data, but also throws in a lot of the data of some particular repository that you care about.

And then in post-training, let's just start with instruction fine-tuning: you have a normal instruction fine-tuning data set about code, but you throw in a lot of questions about code in that repository. So you could either get ground truth ones, which might be difficult, or you could do what you kind of hinted at or suggested, using synthetic data, i.e., kind of having the model ask questions about various pieces of the code. So you take the pieces of the code, then prompt the model, or have a model propose a question for that piece of code, and then add those as instruction fine-tuning data points. And then in theory, this might unlock the model's ability to answer questions about that code base.
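As a rough sketch of the synthetic-data idea just described: split the repository's code into pieces, have a model propose a question for each piece, and collect the pairs as instruction fine-tuning data points. Everything named here is an assumption for illustration. `propose_question` is a hypothetical callable standing in for a call to a real model, and `stub_propose_question` exists only so the sketch runs; the discussion does not say how the answer side of each data point would be produced, so it is left out.

```python
from typing import Callable


def chunk_source(source: str, max_lines: int = 20) -> list[str]:
    """Split a source file into consecutive pieces of at most max_lines lines."""
    lines = source.splitlines()
    return [
        "\n".join(lines[i:i + max_lines])
        for i in range(0, len(lines), max_lines)
    ]


def make_examples(
    source: str,
    propose_question: Callable[[str], str],
    max_lines: int = 20,
) -> list[dict[str, str]]:
    """Pair each non-empty code piece with a model-proposed question."""
    examples = []
    for chunk in chunk_source(source, max_lines):
        if not chunk.strip():
            continue  # skip pieces that contain only blank lines
        examples.append({
            "instruction": propose_question(chunk),
            "context": chunk,
        })
    return examples


def stub_propose_question(chunk: str) -> str:
    """Hypothetical stand-in for a model call; it only templates a question."""
    first_line = chunk.strip().splitlines()[0]
    return f"What does the code starting with `{first_line}` do?"
```

In practice the stub would be replaced by a prompt to an actual model, and the resulting examples would be mixed into an ordinary instruction fine-tuning data set about code.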

I think test time compute is really, really interesting. So there's been the pre-training regime, which will kind of, as you scale up the amount of data and the size of your model, get you better and better performance, both on loss and then on downstream benchmarks and just general performance when we use it for coding or other tasks.

We're starting to hit a bit of a data wall, meaning it's going to be hard to continue scaling up this regime.

And so scaling up test time compute is an interesting way of now, you know, increasing the number of inference time flops that we use, but still getting, as you increase the number of flops used at inference time, corresponding improvements in the performance of these models.

Traditionally, we just had to literally train a bigger model that always used that many more flops. But now we could perhaps use the same size model and run it for longer to be able to get an answer at the quality of a much larger model. And so the really interesting thing I like about this is there are some problems that perhaps require hundred-trillion-parameter model intelligence trained on a hundred trillion tokens. But that's like maybe 1%, maybe like 0.1% of all queries. So are you going to spend all of this effort, all of this compute, training a model that costs that much and then run it so infrequently? It feels completely wasteful when instead you train the model that's capable of doing the 99.9% of queries, and then you have a way of running it longer at inference time for those few people that really, really want max intelligence.

I mean, yeah, that's an open research problem, certainly. I don't think anyone's actually cracked this model routing problem quite well. We'd like to. We have initial implementations of this for something like Cursor Tab. But at the level of going between 4o, Sonnet, to o1, it's a bit trickier.

There's also a question of what level of intelligence do you need to determine if the thing is too hard for the 4o-level model? Maybe you need the o1-level model. It's really unclear.
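The routing problem discussed above has a simple outer shape, even though the hard part is unsolved. The sketch below assumes a hypothetical `estimate_difficulty` function and a fixed threshold; neither comes from the discussion, and the toy length-based heuristic is deliberately naive. How capable that estimator itself has to be is exactly the open question raised here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Route:
    """A named model endpoint; `answer` stands in for a real model call."""
    name: str
    answer: Callable[[str], str]


def route(
    query: str,
    estimate_difficulty: Callable[[str], float],
    cheap: Route,
    strong: Route,
    threshold: float = 0.5,
) -> tuple[str, str]:
    """Send the query to the strong model only if it looks hard enough."""
    chosen = strong if estimate_difficulty(query) >= threshold else cheap
    return chosen.name, chosen.answer(query)


def toy_difficulty(query: str) -> float:
    """Hypothetical heuristic: treat longer queries as harder, capped at 1.0."""
    return min(len(query.split()) / 20.0, 1.0)
```

A real router would replace `toy_difficulty` with a learned classifier or a model call, which is where the cost of the decision itself starts to matter.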

Well, it's weird because with test time compute, there's a whole training strategy needed to get test time compute to work. And the other really weird thing about this is that no one outside of the big labs, and maybe even just OpenAI, really knows how it works. There have been some really interesting papers that show hints of what they might be doing.

And so perhaps they're doing something with tree search using process reward models. But yeah, I think the issue is we don't quite know exactly what it looks like.

So it would be hard to comment on where it fits in. I would put it in post-training, but maybe the compute spent on getting test time compute to work for a model is going to dwarf pre-training eventually.

It's fun to speculate.

Yeah. So one thing to do would be, I think you probably need to train a process reward model. So maybe we can get into reward models, and outcome reward models versus process reward models. Outcome reward models are the kind of traditional reward models that people train for language modeling. And it's just looking at the final thing.

So if you're doing some math problem, let's look at that final thing. You've done everything, and let's assign a grade to it: what's the reward for this outcome? Process reward models instead try to grade the chain of thought.
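The contrast between the two kinds of reward model can be illustrated with a toy example. This is a sketch under assumptions, not a description of any real reward model: the graders are hand-written checks rather than trained models, and scoring each step of the chain of thought separately is the usual reading of "grade the chain of thought" rather than something spelled out above. All function names are hypothetical.

```python
from typing import Callable


def outcome_reward(
    steps: list[str],
    final_answer: str,
    grade_answer: Callable[[str], float],
) -> float:
    """Outcome reward: ignore the steps and grade only the final answer."""
    return grade_answer(final_answer)


def process_reward(
    steps: list[str],
    grade_step: Callable[[list[str], str], float],
) -> list[float]:
    """Process reward: grade every step given the steps that came before it."""
    return [grade_step(steps[:i], step) for i, step in enumerate(steps)]


def toy_grade_step(previous: list[str], step: str) -> float:
    """Toy grader for steps written as 'a + b = c' or 'a * b = c'.

    A trained process reward model would replace this arithmetic check.
    """
    lhs, rhs = step.split("=")
    a, op, b = lhs.split()
    result = int(a) + int(b) if op == "+" else int(a) * int(b)
    return 1.0 if result == int(rhs) else 0.0


def toy_grade_answer(expected: str) -> Callable[[str], float]:
    """Toy outcome grader: exact match against a known answer."""
    return lambda answer: 1.0 if answer.strip() == expected else 0.0
```

For a chain such as `["2 + 3 = 5", "5 * 2 = 11"]` with final answer `"11"` and expected answer `"10"`, the outcome grader reports only that the result is wrong, while the per-step scores show that the first step was fine and the second was not.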