Aman Sanger
Podcast Appearances
bring down, you know, just bring down the server. Of course we test a lot and whatever, but there are always these things that you have to be very careful about. Yeah, with just normal docstrings, I think people will often just skim them when making a change and think, "Oh, I know how to do this." And you kind of really need to point it out to them so that it doesn't slip through. Yeah, you have to be reminded that you can do a lot of damage.
But concretely, what do you think that future would look like?
Yeah. The worry is there's this massive document. Replacing something like unit tests, sure.
Or how do you handle, I guess, external dependencies like calling the Stripe API? Maybe Stripe would write a spec for it, but you can't do this for everything. Can you do this for everything you use? How do you do it if there's a language... like, maybe people will use language models as primitives in the programs they write, and there's a dependence on them. How do you now include that? I think you might be able to prove that still.
Prove what about language models?
Yeah, I was going to say the moment that feels like people do this is when they share it, when they have this fantastic example, they just kind of share it with their friends.
You can use terminal context as well inside of Cmd+K, kind of everything. We don't have the looping part yet, though we suspect something like this could make a lot of sense. There's a question of whether it happens in the foreground too, or if it happens in the background, like what we've been discussing.
It would be really interesting if you could branch a file system.
Yeah, it's called a Merkle tree.
Yeah. And there are a lot of clever things, like additional things that go into this indexing system. For example, the bottleneck in terms of costs is not storing things in the vector database or the database. It's actually embedding the code.
And you don't want to re-embed the code base for every single person in a company that is using the same exact code, except for maybe they're in a different branch with a few different files or they've made a few local changes.
And so because, again, embeddings are the bottleneck, you can do one clever trick and not have to worry about the complexity of dealing with branches and the other databases: you just have some cache on the actual vectors, keyed by the hash of a given chunk. And so this means that when the nth person at a company goes and embeds their code base, it's really, really fast.
And you do all this without actually storing any code on our servers at all. No code data is stored. We just store the vectors in the vector database and the vector cache.
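The caching trick described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual system: `ChunkEmbeddingCache`, `toy_embed`, and the sample chunks are hypothetical names invented here, SHA-256 is an assumed choice of hash, and a plain dictionary stands in for the vector cache. The point it shows is that the cache holds only hashes and vectors, and that a second branch sharing most chunks triggers very few new embedding calls.

```python
import hashlib
from typing import Callable, Dict, List

Vector = List[float]


class ChunkEmbeddingCache:
    """Hypothetical cache mapping hash(chunk) -> embedding vector.

    Only hashes and vectors are kept; the chunk text itself is never stored.
    """

    def __init__(self, embed_fn: Callable[[str], Vector]) -> None:
        self._embed_fn = embed_fn
        self._vectors: Dict[str, Vector] = {}
        self.embed_calls = 0  # counts how often the expensive step runs

    @staticmethod
    def chunk_key(chunk: str) -> str:
        # Assumption: SHA-256 of the chunk text serves as the cache key.
        return hashlib.sha256(chunk.encode("utf-8")).hexdigest()

    def get(self, chunk: str) -> Vector:
        key = self.chunk_key(chunk)
        if key not in self._vectors:
            # Cache miss: pay for the embedding once, then reuse it.
            self._vectors[key] = self._embed_fn(chunk)
            self.embed_calls += 1
        return self._vectors[key]

    def index(self, chunks: List[str]) -> List[Vector]:
        return [self.get(chunk) for chunk in chunks]


def toy_embed(chunk: str) -> Vector:
    # Placeholder for a real embedding model; returns a tiny deterministic vector.
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:4]]


cache = ChunkEmbeddingCache(toy_embed)

# Two people at the same company: same code, except one changed chunk.
main_branch = ["def a(): pass", "def b(): pass", "def c(): pass"]
feature_branch = ["def a(): pass", "def b(): pass", "def c(): return 1"]

cache.index(main_branch)
calls_after_first = cache.embed_calls

cache.index(feature_branch)
calls_after_second = cache.embed_calls
```

In this toy run the first person pays for three embeddings and the second person only for the one chunk that differs, which is the reason indexing gets faster for each additional person on the same code base.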
Like you mentioned, I think in the future this is only going to get more and more powerful, where we're working a lot on improving the quality of our retrieval. And I think the ceiling for that is really much higher than people give it credit for.
Yeah, like approximate nearest neighbors on this massive code base is going to just eat up your memory and your CPU. And that's just that. Let's talk about also the modeling side where, as Arvid said, there are these massive headwinds against local models where, one, things seem to move towards MoEs.
One benefit is maybe they're more memory bandwidth bound, which plays in favor of local versus using GPUs or using NVIDIA GPUs. But the downside is these models are just bigger in total. And they're going to need to fit often not even on a single node, but multiple nodes. There's no way that's going to fit inside of even really good MacBooks.
And I think especially for coding, it's not a question as much of, does it clear some bar of the model's good enough to do these things, and then we're satisfied? That may be the case for other problems, and maybe that's where local models shine. But people are always going to want the best, the most intelligent, the most capable things. And that's going to be really, really hard to run for almost all people locally.
Why do you think it's different than cloud providers?
Like one interesting proof of concept for learning this knowledge directly in the weights is with VS Code. So we're in a VS Code fork, and the VS Code code is all public. So these models in pre-training have seen all the code. They've probably also seen questions and answers about it, and then they've been fine-tuned and RLHFed to be able to answer questions about code in general.
So when you ask it a question about VS Code, sometimes it'll hallucinate, but sometimes it actually does a pretty good job at answering the question. And I think this is just because it happens to be okay at it. But what if you could actually specifically train or post-train a model such that it really was built to understand this code base?