Aman Sanger
And so because, again, embeddings are the bottleneck, you can do one clever trick and not have to worry about the complexity of dealing with branches and the other databases: you just have some cache on the actual vectors, computed from the hash of a given chunk. And so this means that when the nth person at a company goes and embeds their code base, it's really, really fast.
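A minimal sketch of that trick, assuming a hypothetical in-process `vector_cache` dict and a placeholder `embed` function in place of the real server-side cache and embedding model:

```python
import hashlib

# Hypothetical stand-in for the shared server-side vector cache.
vector_cache: dict[str, list[float]] = {}

def embed(chunk: str) -> list[float]:
    # Placeholder for the real embedding model call.
    return [float(len(chunk)), float(sum(map(ord, chunk)) % 997)]

def chunk_hash(chunk: str) -> str:
    # Content-address the chunk: identical text maps to the same key,
    # regardless of branch, repo copy, or which teammate indexes it.
    return hashlib.sha256(chunk.encode("utf-8")).hexdigest()

def get_embedding(chunk: str) -> list[float]:
    key = chunk_hash(chunk)
    if key not in vector_cache:
        # Only the first person to index this exact chunk pays for the
        # embedding call; everyone after them gets a cache hit.
        vector_cache[key] = embed(chunk)
    return vector_cache[key]
```

Because the key is derived purely from the chunk's content, two branches that share most of their files share most of their cache entries for free.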
And you do all this without actually storing any code on our servers at all. No code data is stored. We just store the vectors in the vector database and the vector cache.
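A rough, self-contained sketch of what that separation could look like; `server_index`, `local_chunks`, and the toy `embed` function are all hypothetical stand-ins, and a real system would use a proper vector database with ANN search rather than a linear scan:

```python
import hashlib

def chunk_hash(chunk: str) -> str:
    return hashlib.sha256(chunk.encode("utf-8")).hexdigest()

def embed(text: str) -> list[float]:
    # Placeholder for the real embedding model.
    return [float(len(text)), float(sum(map(ord, text)) % 997)]

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# Server side: only hash -> vector pairs; no source text ever lands here.
server_index: dict[str, list[float]] = {}

# Client side: keeps the actual code and can resolve a hash locally.
local_chunks: dict[str, str] = {}

def index_chunk(chunk: str) -> None:
    key = chunk_hash(chunk)
    local_chunks[key] = chunk          # code stays on the client
    server_index[key] = embed(chunk)   # only the vector is uploaded

def search(query: str, top_k: int = 3) -> list[str]:
    q = embed(query)
    # The "server" ranks by vector similarity and returns hashes; the
    # client maps those hashes back to code it already has on disk.
    ranked = sorted(server_index, key=lambda k: -dot(q, server_index[k]))
    return [local_chunks[k] for k in ranked[:top_k]]
```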
I think, like you mentioned, in the future this is only going to get more and more powerful. We're working a lot on improving the quality of our retrieval, and I think the ceiling for that is really much higher than people give it credit for.
Yeah, like approximate nearest neighbors on this massive code base is going to just eat up your memory and your CPU. And that's just that. Let's also talk about the modeling side, where, as Arvid said, there are these massive headwinds against local models: one, things seem to be moving towards MoEs.
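To put assumed, back-of-envelope numbers on the memory point (none of these figures are measurements):

```python
# Back-of-envelope memory for an in-RAM ANN index over a big monorepo.
# Every figure below is an illustrative assumption.
num_chunks = 5_000_000   # assume the codebase chunks into ~5M pieces
dims = 1536              # assume a typical embedding dimensionality
bytes_per_float = 4      # fp32

raw_vectors_gb = num_chunks * dims * bytes_per_float / 1e9
# Graph-based indexes such as HNSW also keep neighbor lists per vector.
graph_overhead_gb = num_chunks * 32 * 8 / 1e9  # assume 32 links x 8 bytes

print(f"raw vectors:    {raw_vectors_gb:.1f} GB")    # ~30.7 GB
print(f"graph overhead: {graph_overhead_gb:.1f} GB")  # ~1.3 GB
# Roughly 32 GB of RAM before serving a single query, which is why
# keeping the index off each developer's laptop is attractive.
```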
One benefit is maybe they're more memory-bandwidth-bound, which plays in favor of local versus using GPUs, or using NVIDIA GPUs. But the downside is these models are just bigger in total, and they're often going to need to fit not even on a single node, but on multiple nodes. There's no way that's going to fit inside of even really good MacBooks.
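Some assumed arithmetic makes the point concrete; the parameter counts below are illustrative, not any specific model's:

```python
# Rough MoE sizing with made-up but plausible numbers. An MoE activates
# only a few experts per token (friendlier to memory bandwidth), but
# every expert's weights must still be resident in memory.
total_params = 600e9    # assume a ~600B-parameter MoE
active_params = 40e9    # assume ~40B parameters active per token
bytes_per_param = 2     # fp16/bf16 weights

weights_gb = total_params * bytes_per_param / 1e9
active_gb = active_params * bytes_per_param / 1e9
print(f"weights resident:  {weights_gb:.0f} GB")  # ~1200 GB
print(f"read per token:    ~{active_gb:.0f} GB")  # rough, batch-1 decode

laptop_ram_gb = 128       # a top-end laptop today
one_node_gb = 8 * 80      # e.g., a node with 8x 80 GB accelerators
print(f"fits on one 8-GPU node? {weights_gb <= one_node_gb}")    # False
print(f"fits on a laptop?       {weights_gb <= laptop_ram_gb}")  # False
```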
And I think especially for coding, it's not a question as much of, does it clear some bar where the model's good enough to do these things and then we're satisfied? That may be the case for other problems, and maybe that's where local models shine. But people are always going to want the best, the most intelligent, the most capable things, and that's going to be really, really hard to run locally for almost all people.
Why do you think it's different than cloud providers?
Like one interesting proof of concept for learning this knowledge directly in the weights is with VS Code. So we're in a VS Code fork, and the VS Code code is all public. So these models, in pre-training, have seen all the code. They've probably also seen questions and answers about it, and then they've been fine-tuned and RLHF'd to be able to answer questions about code in general.