Arvid Lunnemark
Podcast Appearances
And then there is MLA.
So to be clear, we want there to be a lot of stuff happening in the background, and we're experimenting with a lot of things. Right now, we don't have much of that happening, other than the cache warming or figuring out the right context that goes into your Command-K prompts, for example. But the idea is, if you can actually spend computation in the background, then you can help the user at a slightly longer time horizon than just predicting the next few lines that you're going to write. What are you going to make in the next 10 minutes? And by doing it in the background, you can spend more computation doing that.
And so the idea of the shadow workspace that we implemented, and we use it internally for experiments, is that to actually take advantage of doing stuff in the background, you want some kind of feedback signal to give back to the model. Because otherwise, you can get higher performance by just letting the model think for longer, and o1 is a good example of that. But another way you can improve performance is by letting the model iterate and get feedback.
And so one very important piece of feedback when you're a programmer is the language server, which is this thing that exists for most languages, and there's a separate language server per language. And it can tell you, you know, you're using the wrong type here, and then give you an error, or it can allow you to go to definition, and it sort of understands the structure of your code.
So language servers are extensions developed by the language communities: there's a TypeScript language server developed by the TypeScript people, a Rust language server developed by the Rust people, and then they all interface over the Language Server Protocol to VS Code.
So that VS Code doesn't need to have all of the different languages built into VS Code, but rather you can use the existing compiler infrastructure.
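As a rough illustration of what "interfacing over the Language Server Protocol" looks like on the wire, here is a minimal Python sketch of the protocol's message framing: a `Content-Length` header, a blank line, then a JSON-RPC body. The helper names (`encode_lsp_message`, `decode_lsp_message`) and the file URI are made up for this example; only the framing and the `textDocument/definition` request shape come from the protocol itself.

```python
import json


def encode_lsp_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload the way the LSP base protocol expects."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    # Content-Length counts the bytes of the body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_lsp_message(raw: bytes) -> dict:
    """Parse one framed message back into a dict."""
    header, _, rest = raw.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(rest[:length].decode("utf-8"))


# A "go to definition" request; the URI and position are hypothetical.
definition_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/definition",
    "params": {
        "textDocument": {"uri": "file:///project/src/main.ts"},
        "position": {"line": 10, "character": 4},
    },
}

framed = encode_lsp_message(definition_request)
```

Because every language server speaks this same framing, the editor only needs one client implementation, which is the point being made above.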
It's for linting. It's for going to definition and for like seeing the right types that you're using.
Yes, type checking and going to references. And that's like, when you're working in a big project, you kind of need that. If you don't have that, it's like really hard to code in a big project.
So it's being used in Cursor to show to the programmer, just like in VS Code. But then the idea is you want to show that same information to the models, the AI models. And you want to do that in a way that doesn't affect the user, because you want to do it in the background. And so the idea behind the shadow workspace was, okay, one way we can do this is we spawn a separate window of Cursor that's hidden. You can set a flag in Electron so that the window is hidden. There is a window, but you don't actually see it. And inside of this window, the AI agents can modify code however they want, as long as they don't save it, because it's still the same folder, and then they can get feedback from the linters and go to definition and iterate on their code.
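The loop described here (edit an unsaved copy, ask for diagnostics, iterate) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `lint` stands in for real language-server diagnostics by checking Python syntax, and `propose_fix` is a hypothetical callback standing in for the model. It is not how Cursor implements the shadow workspace.

```python
from typing import Callable, List, Optional


def lint(source: str) -> List[str]:
    """Stand-in for language-server diagnostics: reports syntax errors only."""
    try:
        compile(source, "<shadow>", "exec")
    except SyntaxError as err:
        return [f"line {err.lineno}: {err.msg}"]
    return []


def iterate_in_shadow(
    source: str,
    propose_fix: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> Optional[str]:
    """Let a (hypothetical) model revise an in-memory buffer until it lints clean."""
    buffer = source  # unsaved copy; nothing is written to disk
    for _ in range(max_rounds):
        diagnostics = lint(buffer)
        if not diagnostics:
            return buffer
        # Feed the diagnostics back so the next attempt can react to them.
        buffer = propose_fix(buffer, diagnostics)
    return buffer if not lint(buffer) else None


def toy_fix(source: str, diagnostics: List[str]) -> str:
    """Toy 'model' that only knows how to add one missing colon."""
    return source.replace("def f()\n", "def f():\n")


broken = "def f()\n    return 1\n"
fixed = iterate_in_shadow(broken, toy_fix)
```

The caller only sees the final buffer, or `None` if no clean version was found within the round limit, so the user's own files and editor state are never disturbed.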
Yeah.
So that's the eventual version. Okay. That's what you want. And a lot of the blog post is actually about how do you make that happen? Because it's a little bit tricky. You want it to be on the user's machine so that it exactly mirrors the user's environment. And then on Linux, you can do this cool thing where you can actually mirror the file system and have the AI make changes to the files, and it thinks that it's operating on the file level, but actually that's stored in memory, and you can create this kernel extension to make it work. Whereas on Mac and Windows it's a little bit more difficult, but it's a fun technical problem, so that's why.
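To make the "mirror the file system but keep the changes in memory" idea concrete, here is a small user-space Python sketch of an overlay: reads fall through to the real files, writes stay in a dictionary. The class name `OverlayFiles` and its methods are hypothetical, and this is deliberately not the kernel-level approach mentioned above; it only shows the read-through, write-in-memory behavior.

```python
from pathlib import Path
from typing import Dict, List


class OverlayFiles:
    """Read-through view of a folder whose writes never reach the disk."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self._edits: Dict[str, str] = {}  # relative path -> in-memory content

    def read(self, relative_path: str) -> str:
        # Prefer the in-memory version if this file has been "edited".
        if relative_path in self._edits:
            return self._edits[relative_path]
        return (self.root / relative_path).read_text(encoding="utf-8")

    def write(self, relative_path: str, content: str) -> None:
        # Store the edit in memory only; the real file is left untouched.
        self._edits[relative_path] = content

    def changed_paths(self) -> List[str]:
        return sorted(self._edits)
```

From the point of view of code using `read` and `write`, it looks like ordinary file access, while the user's actual files stay exactly as they were.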
Even the smartest models.
And all caps repeated 10 times.
Yeah, and I think that one is also partially for today's AI models, where if you actually write "dangerous, dangerous, dangerous" in every single line, the models will pay more attention to that and will be more likely to find bugs in that region.
Yeah, I mean, it's controversial. Some people think it's ugly. Sualeh does not like it.
Until we have formal verification for everything. Then you can do whatever you want, and you know for certain that you have not introduced a bug if the proof passes.
I think people will just not write tests anymore. You write a function, the model will suggest a spec, and you review the spec. And in the meantime, a smart reasoning model computes a proof that the implementation follows the spec. And I think that happens for most functions.
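For a sense of what "a proof that the implementation follows the spec" can look like today, here is a deliberately tiny Lean 4 sketch. The function `double` and the theorem name `double_spec` are invented for this example; in the workflow described above, the theorem statement would be the spec you review, and the proof would be the part a model produces.

```lean
-- Implementation: written by the programmer.
def double (n : Nat) : Nat := n + n

-- Spec: `double n` is always exactly twice `n`.
-- The proof is checked by Lean, so if it compiles, the spec holds for every `n`.
theorem double_spec (n : Nat) : double n = 2 * n := by
  unfold double
  omega
```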