Dylan Patel
Why can't the model just reason and have a database that it writes stuff in, or like a Word document that it writes stuff in, and then it takes it out of its context, works some more, and then recalls it back?
It's like, oh yeah, we don't do that, right?
Like you and I refer to our notes, we refer to our calendar, we refer to our texts, we refer to anything, even a shopping list, right?
Like great, I know I need food for dinner.
I go to the store, I'm like, I need a shopping list because otherwise I'm going to buy like stupid shit.
So the model doesn't necessarily have to fundamentally work the same way as humans.
But there is that challenge of like, how do I train the model to operate over the context length of a human?
How do I train it to interact with these databases and these Word documents that it writes to?
Because it's never going to learn that from pre-training.
It has to learn that from an environment.
But these environments have to be like architected in a way where the model knows it can write stuff down and refer back.
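To make that concrete, here's a minimal sketch (my illustration, not any lab's actual design) of what such an environment could expose: tools the model can call to write something down outside its context and refer back to it later by name.

```python
class Scratchpad:
    """External memory an agent writes to and recalls from.

    A hypothetical tool interface: the model offloads text here,
    frees up its context, and reads the notes back when needed.
    """

    def __init__(self):
        self._notes = {}  # note name -> text stored outside the context

    def write(self, key, text):
        # The model "writes stuff down" to refer back to later.
        self._notes[key] = text
        return f"saved note '{key}'"

    def read(self, key):
        # The model recalls previously offloaded information.
        return self._notes.get(key, f"no note named '{key}'")

    def list_keys(self):
        # Lets the model see what it has written down so far.
        return sorted(self._notes)


pad = Scratchpad()
pad.write("shopping", "eggs, milk, bread")
print(pad.read("shopping"))  # the agent refers back to its notes
```

The point of the environment is that the model has to learn, through training in it, that these calls exist and that writing things down pays off later.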
And so one of the first things OpenAI did was deep research.
Not everything is in deep research's context.
Deep research is working for like 45 minutes.
It's outputting millions and millions of tokens.
And it's creating this amazing thing that it wrote.
And it's pretty good research.
I would say a lot of the memos that you read from people are on par with deep research, at least at a junior level.
How they did that was they enabled it to write something down elsewhere and recall it, and to effectively use language to compress information it looked at: put that off to the side, use language to compress other information off to the side, and again, and again, and then look at all this compressed information and write something.
That's sort of what deep research is.
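That compress-then-synthesize loop can be sketched as a map-reduce over sources. This is a hedged illustration of the pattern, not the actual system: `summarize` is a stand-in for a model call that compresses a document into language (here it just truncates to the first few words).

```python
def summarize(text, max_words=8):
    # Stand-in for an LLM call that compresses a document into language.
    words = text.split()
    clipped = " ".join(words[:max_words])
    return clipped + ("..." if len(words) > max_words else "")


def deep_research_sketch(sources):
    # Map step: compress each source off to the side, one at a time,
    # so all the raw material is never in "context" at once.
    compressed = [summarize(src) for src in sources]
    # Reduce step: write the final output from the compressed notes.
    return "\n".join(f"- {note}" for note in compressed)


sources = [
    "Report A: " + "very long body text " * 50,
    "Report B: " + "another long body " * 50,
]
print(deep_research_sketch(sources))
```

Each compressed note is small enough to sit alongside the others, which is what lets the final writing step "look at all this compressed information" even though the raw sources would never fit together.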