
Aman Sanger

👤 Person
1050 total appearances

Podcast Appearances

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

I feel like I have much more to do there. It felt like the path to get to IMO was a little bit more clear, because it already could get a few IMO problems, and there was a bunch of low-hanging fruit, given the literature at the time, in terms of what tactics people could take. I think I'm, one, much less versed in the space of theorem proving now.

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

And two, yeah, less intuition about how close we are to solving these really, really hard open problems.

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

I think we might get a Fields Medal before AGI.

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

I think it's interesting. The original scaling laws paper by OpenAI was slightly wrong, because of, I think, some issues they had with learning rate schedules. And then Chinchilla showed a more correct version.
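As a rough sketch of what "compute-optimal" means here: the Chinchilla result is often summarized as C ≈ 6·N·D training FLOPs with roughly 20 training tokens per parameter at the optimum. The constants below are that popular approximation, not exact fits from the paper.

```python
import math

def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
    """Split a training-compute budget C into model size N and token count D.

    Uses the common approximations C ~= 6 * N * D (training FLOPs) and
    D ~= 20 * N (Chinchilla's roughly-20-tokens-per-parameter rule).
    """
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Chinchilla's own budget was about 5.9e23 FLOPs:
n, d = chinchilla_optimal(5.88e23)
```

Plugging in Chinchilla's budget recovers roughly 70B parameters and 1.4T tokens, matching the model it actually trained.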

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

And then from then, people have again kind of deviated from doing the compute-optimal thing, because people now optimize more for making the thing work really well given an inference budget. And I think there are a lot more dimensions to these curves than what we originally used of just compute, number of parameters, and data. Like, inference compute is the obvious one.
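A toy version of that shift can be written down with a parametric scaling law. The loss constants below are the widely cited Chinchilla "Approach 3" fit, L(N, D) = E + A/N^α + B/D^β; the grid search and cost model (≈6ND training FLOPs, ≈2N FLOPs per served token) are a simplification for illustration, not anyone's production recipe.

```python
# Chinchilla "Approach 3" loss fit: L(N, D) = E + A/N^alpha + B/D^beta
E, A, B, ALPHA, BETA = 1.69, 406.4, 410.7, 0.34, 0.28

def tokens_for_loss(n_params, target_loss):
    """Tokens D needed for an N-parameter model to reach target_loss, or None."""
    rem = target_loss - E - A / n_params**ALPHA
    if rem <= 0:
        return None  # this model size can never reach the target
    return (B / rem) ** (1.0 / BETA)

def cheapest_model(target_loss, served_tokens):
    """Scan model sizes; minimize train (~6ND) plus serve (~2N/token) FLOPs."""
    best = None
    for tenth_exp in range(95, 131):  # N from ~3e9 to 1e13, log-spaced
        n = 10.0 ** (tenth_exp / 10.0)
        d = tokens_for_loss(n, target_loss)
        if d is None:
            continue
        total = 6.0 * n * d + 2.0 * n * served_tokens
        if best is None or total < best[0]:
            best = (total, n, d)
    return best  # (total_flops, n_params, n_tokens)

# More serving traffic pushes the optimum toward a smaller model:
_, n_quiet, _ = cheapest_model(1.95, served_tokens=0.0)
_, n_busy, _ = cheapest_model(1.95, served_tokens=1e13)
```

With no serving traffic the scan lands near the classic compute-optimal size for that loss target; adding a large served-token budget shifts it to a smaller model trained on more data, which is exactly the deviation described above.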

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

I think context length is another obvious one. So if you care, like, let's say you care about the two things of inference compute and then context window, maybe the thing you want to train is some kind of SSM because they're much, much cheaper and faster at super, super long context.
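The intuition is just the asymptotics: self-attention does work roughly quadratic in sequence length, while an SSM-style scan is linear in it. A back-of-the-envelope cost model (these FLOP formulas are coarse approximations, and the state size is an illustrative number, not from the conversation):

```python
def attn_layer_flops(seq_len, d_model):
    """Rough FLOPs for one attention layer's score + mix over a sequence:
    two L x L x d matmuls, so quadratic in sequence length L."""
    return 2 * 2 * seq_len * seq_len * d_model

def ssm_layer_flops(seq_len, d_model, state_size=16):
    """Rough FLOPs for one SSM-style linear scan: each of L steps touches a
    d_model x state_size recurrence, so linear in sequence length L."""
    return 2 * 2 * seq_len * d_model * state_size

# At a 1M-token context, the quadratic term dominates by ~seq_len/state_size:
ratio = attn_layer_flops(1_000_000, 4096) / ssm_layer_flops(1_000_000, 4096)
```

Under this model the advantage grows with context: the attention/SSM cost ratio is proportional to sequence length, which is why very long windows favor the linear-time architecture.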

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

And even if maybe it is 10X worse scaling properties during training, meaning you have to spend 10X more compute to train the thing to get the same level of capabilities, it's worth it because you care most about that inference budget for really long context windows. So it'll be interesting to see how people kind of play with all these dimensions.
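That tradeoff reduces to breakeven arithmetic: the extra training compute is a one-time cost, while cheaper long-context inference pays it back per served token. The numbers below are illustrative placeholders, not figures from the conversation.

```python
def breakeven_served_tokens(base_train_flops, train_multiplier,
                            base_flops_per_token, alt_flops_per_token):
    """Tokens you must serve before the pricier-to-train model wins overall."""
    extra_train = (train_multiplier - 1.0) * base_train_flops
    savings_per_token = base_flops_per_token - alt_flops_per_token
    if savings_per_token <= 0:
        return float("inf")  # never pays off if inference isn't actually cheaper
    return extra_train / savings_per_token

# e.g. 10x the cost of a 1e23-FLOP training run, but 10x cheaper per token:
tokens = breakeven_served_tokens(1e23, 10.0, 1e9, 1e8)
```

Past the breakeven point, every additional served token is pure savings, which is why a heavily used long-context deployment can justify the 10X training premium.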

Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI

I mean, I think bigger is certainly better for just raw performance.