
Aman Sanger

👤 Person
1050 total appearances

Appearances Over Time

Podcast Appearances

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

I mean, I think bigger is certainly better for just raw performance.

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

And raw intelligence. I think the path that people might take is, I'm particularly bullish on distillation. And, yeah, how many knobs can you turn, if we spend a ton of money on training, to get the most capable, cheap model?

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

Really, really caring as much as you can, because the naive version of caring as much as you can about inference-time compute is what people have already done with the Llama models: just overtraining the shit out of 7B models on way, way more tokens than is chinchilla-optimal. But if you really care about it, maybe the thing to do is what Gemma did, which is: let's not just train on tokens, let's literally train on...
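The "overtraining" point above can be made concrete with a back-of-the-envelope sketch (not from the transcript; the ~20 tokens-per-parameter heuristic is the commonly cited Chinchilla rule of thumb, and the 2T-token figure is an illustrative assumption):

```python
def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Rough compute-optimal training-token budget (~20 tokens/param heuristic)."""
    return n_params * tokens_per_param

params_7b = 7e9
optimal = chinchilla_optimal_tokens(params_7b)  # ~1.4e11 tokens (~140B)
overtrained = 2e12                              # illustrative: a ~2T-token "overtrained" run

print(f"compute-optimal budget: {optimal:.2e} tokens")
print(f"overtrained run:        {overtrained:.2e} tokens ({overtrained / optimal:.0f}x optimal)")
```

Training far past this budget is compute-inefficient for a single run, but yields a cheaper model to serve, which is the inference-time trade-off being described.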

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

minimizing the KL divergence with the distribution of Gemma 27B, right? So knowledge distillation there. And you're spending the compute of literally training this 27-billion-parameter model on all these tokens just to get out this, I don't know, smaller model.
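A minimal sketch of the objective described here (toy logits over a 4-token vocabulary, plain stdlib Python; not the actual training code of any model): the student is trained to minimize the KL divergence between the teacher's next-token distribution and its own, rather than cross-entropy on hard token labels.

```python
import math

def softmax(logits):
    """Convert logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; p is the teacher distribution, q the student's."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Toy next-token distributions at one position.
teacher_logits = [2.0, 1.0, 0.1, -1.0]  # from the large teacher model
student_logits = [1.5, 1.2, 0.0, -0.5]  # from the small student model

loss = kl_divergence(softmax(teacher_logits), softmax(student_logits))
print(f"distillation loss (KL): {loss:.4f}")
```

In practice this per-position KL term is averaged over every token position in the training corpus and minimized by gradient descent on the student's parameters; the loss is zero exactly when the student matches the teacher's distribution.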

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

Yeah, distillation, in theory, is getting more signal out of the data that you're training on. It's perhaps another way of getting over, not completely over, but partially helping with, the data wall, where you only have so much data to train on.

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

Let's train this really, really big model on all these tokens, and we'll distill it into a smaller one. And maybe we can get more signal per token for this much smaller model than we would have originally if we'd trained it directly.
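One way to see the "more signal per token" claim (an illustrative sketch with made-up numbers, not from the transcript): a hard next-token label is a one-hot target, while the teacher's soft distribution also ranks the plausible alternatives, so it carries more information per position.

```python
import math

def entropy(p, eps=1e-12):
    """Shannon entropy in nats of a probability distribution."""
    return -sum(pi * math.log(pi + eps) for pi in p if pi > 0)

hard_label = [1.0, 0.0, 0.0, 0.0]    # one-hot ground-truth token
teacher    = [0.60, 0.25, 0.10, 0.05]  # teacher's soft next-token distribution

print(f"hard label entropy:  {entropy(hard_label):.3f} nats")  # ~0: a single outcome
print(f"soft target entropy: {entropy(teacher):.3f} nats")     # > 0: alternatives ranked too
```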

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

Yeah, I think there are a lot of these secrets and details about training these large models that I just don't know and that only the large labs are privy to. And the issue is, I would waste a lot of that money if I even attempted this, because I wouldn't know those things.
