Aman Sanger
Podcast Appearances
My intuition would just say, yeah, it should be. This is kind of going back to: if you believe P does not equal NP, then there's this massive class of problems that are much, much easier to verify given a proof than actually proving them. I wonder if the same thing will prove P not equal to NP, or P equal to NP. That would be really cool.
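The verify-versus-prove gap he's gesturing at can be made concrete with SAT, the canonical NP-complete problem: checking a candidate assignment takes time linear in the formula, while the naive search over assignments is exponential in the number of variables. This is an illustrative toy sketch, not anything from the conversation; the tiny CNF formula here is made up for the example.

```python
from itertools import product

# Toy CNF formula: (x0 or x1) and (not x0 or x2) and (not x1 or not x2).
# Each literal is (variable_index, is_negated).
formula = [[(0, False), (1, False)],
           [(0, True),  (2, False)],
           [(1, True),  (2, True)]]

def verify(assignment):
    """O(formula size): every clause must contain a satisfied literal."""
    return all(any(assignment[v] != neg for v, neg in clause)
               for clause in formula)

def search(n_vars):
    """O(2^n * formula size): brute-force over all assignments."""
    for bits in product([False, True], repeat=n_vars):
        if verify(bits):
            return bits
    return None

witness = search(3)
print(witness, verify(witness))
```

The asymmetry is the point: given a witness, `verify` is cheap; finding one with `search` blows up exponentially as variables are added.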
I feel like I have much more to do there. It felt like the path to get to IMO was a little bit more clear, because it already could get a few IMO problems, and there was a bunch of low-hanging fruit, given the literature at the time, of what tactics people could take. I think, one, I'm much less versed in the space of theorem proving now.
And two, yeah, I have less intuition about how close we are to solving these really, really hard open problems.
I think we might get a Fields Medal before AGI.
I think it's interesting. The original scaling laws paper by OpenAI was slightly wrong, I think because of some issues with their learning rate schedules. And then Chinchilla showed a more correct version.
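The "more correct version" from Chinchilla fits loss as L(N, D) = E + A/N^α + B/D^β and then asks, for a fixed training budget C ≈ 6·N·D FLOPs, how to split it between parameters N and tokens D. A minimal numerical sketch of that idea, using the approximate fitted constants reported in the Chinchilla paper (treat them, and the grid sweep, as illustrative assumptions):

```python
# Approximate fitted constants from Hoffmann et al. (Chinchilla), 2022.
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def loss(n_params, n_tokens):
    """Parametric loss fit: L(N, D) = E + A/N^alpha + B/D^beta."""
    return E + A / n_params**alpha + B / n_tokens**beta

def compute_optimal(flops, steps=2000):
    """Sweep model sizes N and, for each, spend the rest of the budget on
    data via the standard approximation C ~= 6 * N * D; return the best."""
    best = None
    for i in range(1, steps):
        n = 10 ** (6 + 6 * i / steps)   # sweep N from 1e6 to 1e12
        d = flops / (6 * n)
        l = loss(n, d)
        if best is None or l < best[0]:
            best = (l, n, d)
    return best

l, n, d = compute_optimal(1e21)  # an arbitrary illustrative budget
print(f"loss={l:.3f}, params~{n:.2e}, tokens~{d:.2e}, tokens/param~{d / n:.1f}")
```

The qualitative takeaway matches the paper: at a fixed budget, the loss-minimizing split trains a smaller model on far more tokens than the original OpenAI-era recipes did.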
And then from there, people have again deviated from doing the compute-optimal thing, because people now optimize more for making the thing work really well given an inference budget. And I think there are a lot more dimensions to these curves than the ones we originally used of just compute, number of parameters, and data. Inference compute is the obvious one.
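One way to sketch that inference-aware tradeoff: fix a target loss, charge the usual ~6·N·D FLOPs for training plus ~2·N FLOPs per served token, and pick the model size minimizing the total. Everything here is an assumption for illustration — the Chinchilla-style loss fit, the FLOPs approximations, and the target/serving numbers:

```python
# Approximate Chinchilla fit constants (illustrative, from Hoffmann et al.).
E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28

def tokens_for_target(n_params, target_loss):
    """Training tokens D needed for an N-param model to hit target_loss,
    solving E + A/N^alpha + B/D^beta = target_loss (None if unreachable)."""
    gap = target_loss - E - A / n_params**alpha
    if gap <= 0:
        return None
    return (B / gap) ** (1 / beta)

def total_flops(n_params, target_loss, inference_tokens):
    """Lifetime cost: ~6*N*D to train plus ~2*N per token served."""
    d = tokens_for_target(n_params, target_loss)
    if d is None:
        return float("inf")
    return 6 * n_params * d + 2 * n_params * inference_tokens

def best_size(target_loss, inference_tokens):
    """Grid-search model sizes from 1e7 to 1e12 parameters."""
    candidates = [10 ** (7 + 5 * i / 1000) for i in range(1001)]
    return min(candidates, key=lambda n: total_flops(n, target_loss, inference_tokens))

# The more inference traffic you expect, the smaller (and longer-trained)
# the cost-minimizing model becomes.
print(best_size(2.0, 1e9), best_size(2.0, 1e15))
```

This is exactly the deviation he describes: once serving cost enters the objective, the optimum shifts away from compute-optimal training toward overtrained small models.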
I think context length is another obvious one. So let's say you care about the two things of inference compute and context window: maybe the thing you want to train is some kind of SSM, because they're much, much cheaper and faster at super, super long context.
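The asymptotics behind that SSM remark can be sketched in a few lines: at decode time, attention reads the whole KV cache for each new token (cost growing with context length L), while an SSM updates a fixed-size state (cost independent of L). The cost models and constants below are illustrative assumptions, not measurements of any particular architecture:

```python
def attention_decode_flops(context_len, d_model):
    """Per-token decode cost when each new token attends over all
    cached positions: O(L * d)."""
    return 2 * context_len * d_model

def ssm_decode_flops(state_size, d_model):
    """Per-token decode cost of a fixed-size recurrent state update:
    O(s * d), independent of context length."""
    return 2 * state_size * d_model

d_model, state = 4096, 16  # illustrative sizes
for L in (1_000, 100_000, 10_000_000):
    print(L, attention_decode_flops(L, d_model), ssm_decode_flops(state, d_model))
```

At a million-token context the attention term has grown a thousandfold while the SSM term hasn't moved, which is the "much cheaper and faster at super long context" point in asymptotic form.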