Aman Sanger

Speaker
350 total appearances

Podcast Appearances

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

Yeah, I mean, so GitHub tries to solve this, right, with code review. When you're doing code review, you're reviewing multiple diffs across multiple files. But like Arvid said earlier, I think you can do much better than code review. You know, code review kind of sucks. Like, you spend a lot of time trying to grok this code that's often quite unfamiliar to you, and...

it often doesn't even actually catch that many bugs. And I think you can significantly improve that review experience using language models, for example, using the kinds of tricks that Arvid had described of maybe pointing you towards the regions that actually matter.

I think also, if the code is produced by these language models and it's not produced by someone else... like, the code review experience is designed for both the reviewer and the person that produced the code. In the case where the person that produced the code is the language model, you don't have to care that much about their experience.

And you can design the entire thing around the reviewers such that the reviewer's job is as fun, as easy, as productive as possible. And I think that feels like the issue with just kind of naively trying to make these things look like code review. I think you can be a lot more creative and push the boundary on what's possible.

Well, Cursor really works via this ensemble of custom models that we've trained alongside the frontier models that are fantastic at the reasoning-intensive things. And so Cursor Tab, for example, is a great example of where you can specialize this model to be even better than even frontier models if you look at evals on the task we set it at.

The other domain, which it's kind of surprising that it requires custom models, but it's kind of necessary and works quite well, is in apply. So I think the frontier models are quite good at sketching out plans for code and generating rough sketches of the change. But actually, creating diffs is quite hard for frontier models, for your training models.

You try to do this with Sonnet, with o1, any frontier model, and it really messes up stupid things like counting line numbers, especially in super, super large files. And so what we've done to alleviate this is we let the model kind of sketch out this rough code block that indicates what the change will be. And we train a model to then apply that change to the file.

Yeah. I think you see shallow copies of apply elsewhere, and it just breaks most of the time, because you think you can kind of try to do some deterministic matching and then it fails, you know, at least 40% of the time. And that just results in a terrible product experience. I think in general, this regime of you are going to get smarter and smarter models.
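
To make the "deterministic matching" failure mode concrete, here is a minimal sketch of one naive way to apply an edit: find the old snippet verbatim and swap in the new one. This is an illustration only, not how Cursor or any other product does it; the function name `apply_by_exact_match` and the sample file are hypothetical.

```python
def apply_by_exact_match(source: str, old: str, new: str) -> str:
    """Replace exactly one verbatim occurrence of `old` with `new`, or raise."""
    count = source.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return source.replace(old, new, 1)


original = "def area(r):\n    return 3.14 * r * r\n"

# The sketched snippet matches the file character for character, so it applies.
patched = apply_by_exact_match(
    original,
    "    return 3.14 * r * r\n",
    "    return math.pi * r * r\n",
)

# The same intent with slightly different whitespace no longer matches,
# which is the kind of drift a model-written sketch can easily contain.
try:
    apply_by_exact_match(
        original,
        "  return 3.14*r*r\n",
        "  return math.pi * r * r\n",
    )
    drifted_ok = True
except ValueError:
    drifted_ok = False
```

The second call fails even though a human reader would see the same edit, which suggests why a learned apply step can be more robust than string matching.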

So one other thing that Apply lets you do is it lets you use fewer tokens with the most intelligent models. This is both expensive in terms of latency for generating all these tokens and cost. So you can give this very, very rough sketch and then have your small models go and implement it because it's a much easier task to implement this very, very sketched out code.
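
The token-saving argument can be put as back-of-the-envelope arithmetic. Every number below is a made-up placeholder (file size, sketch size, and per-token prices are all assumptions, not real pricing); the point is only the shape of the comparison.

```python
def output_cost(tokens: int, price_per_1k: float) -> float:
    """Cost of generating `tokens` output tokens at a given price per 1k tokens."""
    return tokens / 1000 * price_per_1k


# Hypothetical values, chosen only for illustration.
FILE_TOKENS = 4000     # size of the file being rewritten
SKETCH_TOKENS = 300    # size of the rough sketch from the large model
BIG_PRICE = 15.0       # large model, price per 1k output tokens
SMALL_PRICE = 1.0      # small apply model, price per 1k output tokens

# Option 1: the large model rewrites the whole file itself.
rewrite_with_big = output_cost(FILE_TOKENS, BIG_PRICE)

# Option 2: the large model writes a short sketch, the small model rewrites the file.
sketch_then_apply = (
    output_cost(SKETCH_TOKENS, BIG_PRICE) + output_cost(FILE_TOKENS, SMALL_PRICE)
)
```

Under these assumed numbers the second option is several times cheaper, and the same reasoning applies to latency if the small model also generates tokens faster.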

And I think that this regime will continue where you can use smarter and smarter models to do the planning and then maybe the implementation details can be handled by the less intelligent ones. Perhaps you'll have, you know, maybe o1, maybe it'll be even more capable models, given an even higher level plan that is kind of recursively applied by Sonnet and an apply model.

Fast is always an interesting detail. Fast is good.

Yeah, so one big component of making it fast is speculative edits. So speculative edits are a variant of speculative decoding. And maybe it'd be helpful to briefly describe speculative decoding. With speculative decoding, what you do is you can kind of take advantage of the fact that most of the time, and I'll add the caveat that it would be when you're memory bound in language model generation.

If you process multiple tokens at once, it is faster than generating one token at a time. So this is the same reason why, if you look at tokens per second with prompt tokens versus generated tokens, it's much, much faster for prompt tokens.

So what we do is, instead of using what speculative decoding normally does, which is using a really small model to predict these draft tokens that your larger model will then go in and verify, with code edits, we have a very strong prior of what the existing code will look like. And that prior is literally the same exact code.

So what you can do is you can just feed chunks of the original code back into the model. And then the model will just pretty much agree most of the time that, okay, I'm just going to spit this code back out. And so you can process all of those lines in parallel. And you just do this with sufficiently many chunks. And then eventually you'll reach a point of disagreement.

where the model will now predict text that is different from the ground truth original code. It'll generate those tokens, and then we kind of will decide after enough tokens match the original code to restart speculating in chunks of code. What this actually ends up looking like is just a much faster version of normal editing code.

So it looks like a much faster version of the model rewriting all the code. So we can use the same exact interface that we use for diffs, but it will just stream down a lot faster.
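
The speculative-edits idea described above can be simulated with a toy. In this sketch the "model" is just a lookup into the edited file, one loop iteration stands in for one forward pass, and the rule for resuming speculation (the last `resync` generated tokens reappear in the original) is an assumption made for illustration. None of the names here come from Cursor's implementation.

```python
from typing import Callable, List, Optional, Tuple

EOS = "<eos>"
NextToken = Callable[[List[str]], str]  # greedy "model": prefix -> next token


def make_toy_model(target: List[str]) -> NextToken:
    """Stand-in model that always continues toward `target`."""
    seq = target + [EOS]
    return lambda prefix: seq[len(prefix)]


def plain_decode(next_token: NextToken) -> Tuple[List[str], int]:
    """One token per simulated forward pass."""
    out: List[str] = []
    passes = 0
    while True:
        passes += 1
        tok = next_token(out)
        if tok == EOS:
            return out, passes
        out.append(tok)


def find_resume(original: List[str], start: int, tail: List[str]) -> Optional[int]:
    """Index just past the first occurrence of `tail` in original[start:]."""
    n = len(tail)
    for i in range(start, len(original) - n + 1):
        if original[i:i + n] == tail:
            return i + n
    return None


def speculative_edit(
    next_token: NextToken, original: List[str], chunk: int = 4, resync: int = 2
) -> Tuple[List[str], int]:
    """Use chunks of the original file as draft tokens; count simulated passes."""
    out: List[str] = []
    passes, pos, speculating = 0, 0, True
    while True:
        passes += 1  # each loop iteration stands in for one forward pass
        draft = original[pos:pos + chunk] if speculating else []
        if not draft:
            # No draft available: fall back to ordinary one-token decoding.
            tok = next_token(out)
            if tok == EOS:
                return out, passes
            out.append(tok)
            if not speculating and len(out) >= resync:
                resume = find_resume(original, pos, out[-resync:])
                if resume is not None:
                    pos, speculating = resume, True
            continue
        # A real system would score the whole draft in one batched pass.
        for expected in draft:
            tok = next_token(out)
            if tok == EOS:
                return out, passes
            out.append(tok)
            if tok != expected:
                speculating = False  # disagreement: stop trusting the draft
                break
            pos += 1


original = "def area ( r ) : return 3.14 * r * r".split()
target = "def area ( r ) : return math.pi * r * r".split()
model = make_toy_model(target)

plain_out, plain_passes = plain_decode(model)
spec_out, spec_passes = speculative_edit(model, original)
```

Both paths emit exactly the tokens the model would have produced on its own, since every appended token comes from the model; the speculative path simply needs fewer simulated passes when most of the file is unchanged.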

Yeah, I think there's no model that Pareto-dominates the others, meaning it is better in all categories that we think matter. The categories being speed, ability to edit code, ability to process lots of code, long context, you know, a couple of other things, and kind of coding capabilities. The one that I'd say right now is just kind of net best is Sonnet. I think this is a consensus opinion.
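
The notion of one model being better than another in every category (Pareto dominance) can be written down in a few lines. The model names and scores below are invented placeholders, not benchmark results, and the category names are only loosely borrowed from the ones listed above.

```python
from typing import Dict, List

Scores = Dict[str, float]  # category -> score, higher is better


def dominates(a: Scores, b: Scores) -> bool:
    """True if `a` is at least as good as `b` everywhere and strictly better somewhere."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)


def pareto_front(models: Dict[str, Scores]) -> List[str]:
    """Names of models that no other model dominates."""
    return [
        name
        for name, scores in models.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in models.items()
            if other != name
        )
    ]


# Hypothetical scores for illustration only.
models = {
    "model_a": {"speed": 3, "editing": 5, "long_context": 4},
    "model_b": {"speed": 5, "editing": 3, "long_context": 4},
    "model_c": {"speed": 2, "editing": 3, "long_context": 4},
}

front = pareto_front(models)
```

Here `model_a` and `model_b` each win a category the other loses, so neither dominates the other, while `model_c` is dominated by both. That is the situation being described: several models on the front and no single winner.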

o1's really interesting and it's really good at reasoning. So if you give it really hard programming interview style problems or LeetCode problems, it can do quite, quite well on them. But it doesn't feel like it kind of understands your rough intent as well as Sonnet does.

Like, if you look at a lot of the other frontier models, one qualm I have is it feels like they're not necessarily over... I'm not saying they train on benchmarks, but they perform really well in benchmarks relative to kind of everything that's kind of in the middle.