Michael Truel
Podcast Appearances
But there are other cool ideas being tried that are a little bit more analogous to fine-tuning, of actually learning this information in the weights of the model. And it might be that you actually get a qualitatively different type of understanding if you do it at the weight level than if you do it at the in-context learning level.
I think the jury is still a little bit out on how this is all going to work in the end. But in the interim, we as a company are really excited about better retrieval systems and picking the parts of the code base that are most relevant to what you're doing. We could do that a lot better.
But anyway, what do you think about hiding the chain of thought? One consideration for OpenAI, and this is completely speculative, could be that they want to make it hard for people to distill these capabilities out of their model. It might actually be easier to replicate the technology if you had access to that hidden chain of thought.
Because that's pretty important data, like seeing the steps that the model took to get to the final result. So you could probably train on that also.
And there was sort of a mirror situation to this with some of the large language model providers. This is also speculation, but some of these APIs used to offer easy access to log probabilities for all the tokens that they're generating, and also log probabilities for the prompt tokens. And then some of these APIs took those away. And again, complete speculation, but...
One of the thoughts is that the reason those were taken away is that if you have access to log probabilities, similar to this hidden chain of thought, that can give you even more information to try and distill these capabilities out of the APIs, out of these biggest models, into models you control. As an asterisk also on the previous discussion about us integrating O1.
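To illustrate the point about log probabilities, here is a minimal sketch of why a full token distribution can be a richer distillation signal than sampled tokens alone. The three-token vocabulary, the teacher probabilities, and all function names are hypothetical and chosen only for illustration; nothing here reflects any particular provider's API or anyone's actual training setup.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(teacher_logprobs, student_logits):
    """Cross-entropy of the student against the teacher's full distribution.

    This is the signal available when an API exposes log probabilities.
    """
    student_probs = softmax(student_logits)
    return -sum(
        math.exp(lp) * math.log(q)
        for lp, q in zip(teacher_logprobs, student_probs)
    )


def hard_target_loss(sampled_token, student_logits):
    """Cross-entropy against a single sampled token.

    This is the only signal available when an API returns text alone.
    """
    return -math.log(softmax(student_logits)[sampled_token])


# Hypothetical teacher distribution over a toy 3-token vocabulary.
teacher_probs = [0.6, 0.3, 0.1]
teacher_logprobs = [math.log(p) for p in teacher_probs]

# Two hypothetical students: one reproduces the teacher's distribution,
# the other puts nearly all of its mass on the most likely token.
student_matching = [math.log(p) for p in teacher_probs]
student_overconfident = [10.0, 0.0, 0.0]

for name, logits in [("matching", student_matching),
                     ("overconfident", student_overconfident)]:
    print(name,
          "hard:", round(hard_target_loss(0, logits), 4),
          "soft:", round(soft_target_loss(teacher_logprobs, logits), 4))
```

In this toy setup, a single sample of the most likely token rewards the overconfident student, while the full distribution favors the student that actually matches the teacher. That is one way to read the claim that log probabilities carry extra information useful for distillation.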
I think that we're still learning how to use this model. So we made O1 available in Cursor because when we got the model, we were really interested in trying it out. I think a lot of programmers are going to be interested in trying it out. But O1 is not part of the default Cursor experience in any way yet.
And we still haven't found a way to integrate it into the editor in a way that we reach for every hour, or maybe even every day. And so I think the jury's still out on how to use the model. And we haven't yet seen examples of people releasing things where it seems really clear, like, oh, that's now the use case.