Michael Truell
So the bar for accuracy and for relevance of the context you include should be quite high. But already we do some automatic context in some places within the product. It's definitely something we want to get a lot better at. And I think there are a lot of cool ideas to try there, both on learning better retrieval systems, like better embedding models and better re-rankers.
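To make the retrieval idea concrete, here is a minimal sketch of the common embed-then-rerank pattern: a fast bi-encoder narrows the candidates, then a slower cross-encoder re-scores them. The library, model names, and parameters are illustrative assumptions, not anything Cursor has said it uses.

```python
# Minimal retrieve-then-rerank sketch using the open-source
# sentence-transformers library. Model names are placeholders.
import numpy as np
from sentence_transformers import SentenceTransformer, CrossEncoder

embedder = SentenceTransformer("all-MiniLM-L6-v2")                  # bi-encoder: fast, approximate
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")     # cross-encoder: slow, precise

def retrieve(query: str, chunks: list[str], k_retrieve: int = 50, k_final: int = 5) -> list[str]:
    # Stage 1: embed everything and take the top-k by cosine similarity.
    chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)
    query_vec = embedder.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vecs @ query_vec
    candidates = [chunks[i] for i in np.argsort(-scores)[:k_retrieve]]
    # Stage 2: rerank candidates with the cross-encoder, which reads the
    # query and chunk together and is considerably more accurate.
    rerank_scores = reranker.predict([(query, c) for c in candidates])
    order = np.argsort(-np.asarray(rerank_scores))
    return [candidates[i] for i in order[:k_final]]
```

The design trade-off is the one he alludes to: the bi-encoder keeps the search cheap over a large corpus, and the reranker raises the accuracy bar on the small set that actually makes it into the context.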
I think there are also cool academic ideas, stuff we've tried out internally but also stuff the field writ large is grappling with: can you get language models to a place where the model itself can actually understand a new corpus of information? And the most popular, talked-about version of this is, can you make the context windows infinite?
Then if you make the context windows infinite, can you make the model actually pay attention to the infinite context? And then, after you can make it pay attention to the infinite context, to make it somewhat feasible to actually do it, can you then do caching for that infinite context, so you don't have to recompute it all the time?
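The caching he mentions usually means reusing the attention key/value states computed for a long, stable prefix (say, a whole codebase) across requests. Here is a toy sketch of that idea; the interfaces are invented for illustration, and real serving systems (e.g. vLLM's prefix caching) do this per attention block with far more care.

```python
# Toy prefix KV-cache: compute the expensive prefill for a long context
# once, then reuse it so only the new suffix is recomputed per request.
import hashlib

class PrefixKVCache:
    def __init__(self):
        self._cache = {}  # prefix hash -> per-layer (keys, values) tensors

    def _key(self, prefix_tokens: list[int]) -> str:
        return hashlib.sha256(str(prefix_tokens).encode()).hexdigest()

    def get_or_compute(self, prefix_tokens, compute_kv):
        k = self._key(prefix_tokens)
        if k not in self._cache:              # cache miss: pay the full prefill cost once
            self._cache[k] = compute_kv(prefix_tokens)
        return self._cache[k]                 # cache hit: skip recomputation entirely

# Usage sketch: kv = cache.get_or_compute(repo_tokens, model_prefill)
# then decode the user's new tokens on top of `kv` instead of re-running
# attention over the whole repository for every request.
```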
But there are other cool ideas that are being tried that are a little bit more analogous to fine-tuning, of actually learning this information in the weights of the model. And it might be that you actually get a qualitatively different type of understanding if you do it more at the weight level than if you do it at the in-context learning level.
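For contrast with the in-context approaches above, here is a minimal sketch of the "learn it in the weights" alternative: continued pretraining of a causal LM on the new corpus with a standard next-token loss. The model name, hyperparameters, and training loop are placeholders, not a description of what anyone ships.

```python
# Minimal continued-pretraining sketch with Hugging Face transformers:
# the corpus ends up encoded in the parameters rather than in the prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def finetune_on_corpus(documents: list[str], epochs: int = 1):
    model.train()
    for _ in range(epochs):
        for doc in documents:
            batch = tokenizer(doc, return_tensors="pt", truncation=True, max_length=1024)
            # Standard causal-LM objective: labels are the input ids, and the
            # model shifts them internally for next-token prediction.
            loss = model(**batch, labels=batch["input_ids"]).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()
```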
I think the jury is still a little bit out on how this is all going to work in the end. But in the interim, us as a company, we are really excited about better retrieval systems and picking the parts of the code base that are most relevant to what you're doing. We could do that a lot better.
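As a toy illustration of "picking the parts of the code base that are most relevant," a retrieval pipeline like the one sketched earlier first has to chunk the repository. This sketch splits files into fixed-size line windows and scores each chunk against the user's query; the chunking strategy and embedding model are assumptions, and a production system would likely use code-aware chunking (functions, classes) and a tuned model.

```python
# Toy codebase chunking + relevance scoring.
from pathlib import Path
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model

def chunk_repo(root: str, lines_per_chunk: int = 40) -> list[tuple[str, str]]:
    chunks = []
    for path in Path(root).rglob("*.py"):
        lines = path.read_text(errors="ignore").splitlines()
        for i in range(0, len(lines), lines_per_chunk):
            chunks.append((str(path), "\n".join(lines[i:i + lines_per_chunk])))
    return chunks

def most_relevant(query: str, root: str, k: int = 5) -> list[tuple[str, str]]:
    chunks = chunk_repo(root)
    vecs = embedder.encode([text for _, text in chunks], normalize_embeddings=True)
    qvec = embedder.encode([query], normalize_embeddings=True)[0]
    top = np.argsort(-(vecs @ qvec))[:k]
    return [chunks[i] for i in top]  # (file, chunk) pairs to put in context
```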
But anyway, what do you think about hiding the chain of thought? One consideration for OpenAI, and this is completely speculative, could be that they want to make it hard for people to distill these capabilities out of their model. It might actually be easier, if you had access to that hidden chain of thought, to replicate the technology.
Because that's pretty important data, like seeing the steps that the model took to get to the final result. So you could probably train on that also.
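To see why the hidden chain of thought would matter for distillation, consider what a training example for a student model would look like with and without it. This sketch is purely illustrative; the record format and tags are invented, not OpenAI's.

```python
# Toy distillation-data builder: with the reasoning visible, each API
# response becomes a full (prompt -> reasoning -> answer) target; with it
# hidden, you only get input -> output pairs, which teach far less.
import json

def to_sft_example(prompt: str, chain_of_thought: str | None, answer: str) -> str:
    if chain_of_thought is not None:
        # Rich target: the student learns to reproduce the reasoning process.
        target = f"<think>{chain_of_thought}</think>\n{answer}"
    else:
        # Weak target: the student only imitates final answers.
        target = answer
    return json.dumps({"input": prompt, "target": target})
```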