Aman Sanger

#447 – Cursor Team: Future of Programming with AI

So if you try it in all these benchmarks and things that are in the distribution of the benchmarks they're evaluated on, you know, they'll do really well. But when you push them a little bit outside of that, Sonnet's I think the one that kind of does best at kind of maintaining that same capability.

2804.149 View full episode →

Lex Fridman Podcast

#447 – Cursor Team: Future of Programming with AI

Like you kind of have the same capability in the benchmark as when you try to instruct it to do anything with coding.

2817.795 View full episode →

Lex Fridman Podcast

#447 – Cursor Team: Future of Programming with AI

Yeah, like in that case, it could be trained on the literal issues or pull requests themselves. And maybe the labs will start to do a better job, or they've already done a good job at decontaminating those things. But they're not going to omit the actual training data of the repository itself. Like these are all like some of the most popular Python repositories, like SymPy is one example.

2959.731 View full episode →

Lex Fridman Podcast

#447 – Cursor Team: Future of Programming with AI

I don't think they're going to handicap their models on SymPy and all these popular Python repositories in order to get true evaluation scores in these benchmarks.

2981.096 View full episode →

Lex Fridman Podcast

#447 – Cursor Team: Future of Programming with AI

Yeah, with Claude, there's an interesting take I heard where I think AWS has different chips. And I suspect they have slightly different numerics than NVIDIA GPUs. And someone speculated that Claude's degraded performance had to do with maybe using the quantized version that existed on AWS Bedrock versus whatever was running on Anthropic's GPUs.

3060.356 View full episode →
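
The remark above is speculation about numerics and quantization. As a rough illustration only, and not a description of how Claude, AWS Bedrock, or any particular chip behaves, the sketch below shows how a simple symmetric int8 round trip shifts values slightly. The function names and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric int8 quantization: map floats to integers in [-127, 127]."""
    # One shared scale for the whole list; fall back to 1.0 if all values are zero.
    scale = max(abs(v) for v in values) / 127 or 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the integers back to floats; small values may collapse to zero."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only to make the rounding visible.
weights = [0.5, -1.27, 0.003, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
errors = [abs(w - r) for w, r in zip(weights, restored)]
```

In this toy case the small weight 0.003 rounds to zero, which is the kind of small per-value difference that could add up across a large model.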

Lex Fridman Podcast

#447 – Cursor Team: Future of Programming with AI

That's amazing. And you can do, like, other fancy things where if you have lots of code blocks from the entire code base, you could use retrieval and things like embedding and re-ranking scores to add priorities for each of these components.

3343.416 View full episode →
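
The idea of giving each code block a priority can be sketched as a small budgeted selection. This is a minimal sketch under stated assumptions, not Cursor's actual implementation: the `Block` type, the `priority` field (standing in for an embedding or re-ranking score), the whitespace token counter, and the greedy strategy are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    text: str
    priority: float  # hypothetical score, e.g. from embedding similarity or a re-ranker


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: count whitespace-separated words.
    return len(text.split())


def pack_blocks(blocks: list[Block], budget: int) -> list[Block]:
    """Greedily keep the highest-priority blocks that fit in the token budget."""
    chosen_indices = []
    used = 0
    # Visit blocks from highest to lowest priority, remembering original positions.
    ranked = sorted(enumerate(blocks), key=lambda pair: pair[1].priority, reverse=True)
    for index, block in ranked:
        cost = count_tokens(block.text)
        if used + cost <= budget:
            chosen_indices.append(index)
            used += cost
    # Emit the kept blocks in their original order so the prompt stays readable.
    return [blocks[i] for i in sorted(chosen_indices)]


blocks = [
    Block("a", "one two three", 0.2),
    Block("b", "four five", 0.9),
    Block("c", "six seven eight nine", 0.5),
]
kept = pack_blocks(blocks, budget=6)
kept_names = [b.name for b in kept]
```

With a budget of six tokens, the two highest-priority blocks fill the budget and the lowest-priority one is dropped.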

Lex Fridman Podcast

#447 – Cursor Team: Future of Programming with AI

I think even as the system gets closer to some level of perfection, often when you ask the model for something, not enough intent is conveyed to know what to do. And there are a few ways to resolve that intent. One is the simple thing of having the model just ask you, I'm not sure how to do these parts based on your query. Could you clarify that? I think the other could be maybe...

3425.322 View full episode →

