Aman Sanger

Speaker
1050 total appearances

Podcast Appearances

Lex Fridman Podcast
#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

o1 is really interesting, and it's really good at reasoning. So if you give it really hard programming-interview-style problems or LeetCode problems, it can do quite well on them. But it doesn't feel like it understands your rough intent as well as Sonnet does.

Lex Fridman Podcast
#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Like, if you look at a lot of the other frontier models, one qualm I have is that it feels like they're not necessarily over... I'm not saying they train on benchmarks, but they perform really well on benchmarks relative to everything that's kind of in the middle.

Lex Fridman Podcast
#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

So if you try them on all these benchmarks, and on things that are in the distribution of the benchmarks they're evaluated on, they'll do really well. But when you push them a little bit outside of that, Sonnet is, I think, the one that does best at maintaining that same capability.

Lex Fridman Podcast
#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Like, you kind of have the same capability on the benchmark as when you try to instruct it to do anything with coding.

Lex Fridman Podcast
#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Yeah, like in that case, it could be trained on the literal issues or pull requests themselves. And maybe the labs will start to do a better job, or they've already done a good job, at decontaminating those things. But they're not going to omit the actual training data of the repository itself. These are some of the most popular Python repositories; SymPy is one example.
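
To make the decontamination idea above concrete, here is a minimal sketch of one common style of check: flag a training document if it shares a long word n-gram with a benchmark item. This is an illustration only, not something described in the episode; the function names, the 8-word window, and the sample strings are all assumptions.

```python
def ngrams(text, n=8):
    """Return the set of lowercase word n-grams in text (empty if text is shorter than n words)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def is_contaminated(train_doc, benchmark_item, n=8):
    """True if any n-gram of the benchmark item also appears in the training document."""
    return not ngrams(train_doc, n).isdisjoint(ngrams(benchmark_item, n))


# Hypothetical benchmark issue text and two hypothetical training documents.
issue = "simplify returns wrong result for nested sqrt expressions with symbols"
leaked_doc = "Forum post: simplify returns wrong result for nested sqrt expressions with symbols, any ideas?"
clean_doc = "This tutorial explains how to install the package and run the test suite locally."

leaked = is_contaminated(leaked_doc, issue)
clean = is_contaminated(clean_doc, issue)
```

A check like this can catch the literal issue or pull request text, but it does nothing about the repository's own source code being in the training data, which is the point made in the quote.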

Lex Fridman Podcast
#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

I don't think they're going to handicap their models on SymPy and all these popular Python repositories in order to get true evaluation scores in these benchmarks.

Lex Fridman Podcast
#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

Yeah, with Claude, there's an interesting take I heard where I think AWS has different chips. And I suspect they have slightly different numerics than NVIDIA GPUs. And someone speculated that Claude's degraded performance had to do with maybe using the quantized version that existed on AWS Bedrock versus whatever was running on Anthropic's GPUs.
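
As a rough illustration of why a quantized copy of a model can behave slightly differently, the sketch below rounds a few weights to 8-bit integers and compares a dot product before and after. Symmetric int8 quantization is chosen here purely for illustration; the quote does not say what scheme, if any, any provider uses, and the weights and inputs are made-up values.

```python
def quantize_int8(weights):
    """Symmetric int8 quantization: map the largest magnitude to 127 and round the rest."""
    scale = max(abs(w) for w in weights) / 127
    return [round(w / scale) for w in weights], scale


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Made-up weights and inputs, for illustration only.
weights = [0.1234, -0.5678, 0.9012, -0.3456]
inputs = [1.0, 2.0, 3.0, 4.0]

q, scale = quantize_int8(weights)
dequantized = [v * scale for v in q]

exact = dot(weights, inputs)        # full-precision result
approx = dot(dequantized, inputs)   # result after the quantization round trip
error = abs(exact - approx)         # small but nonzero difference
```

A single dot product is off by only a small amount, but a large model chains many such operations, so small numeric differences can plausibly add up to visibly different outputs.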
