Jaden Schaefer
๐ค SpeakerAppearances Over Time
Podcast Appearances
According to them, they said, we're trying to train the model in a fundamentally different way.
That's Yuchen He, one of their co-founders and who's formerly an OpenAI researcher said,
And they're planning to rely on long horizon reinforcement learning and multi-agent reinforcement learning.
So these are both techniques that are kind of designed to help models plan, revise, and follow through over extended periods.
And also across, you know, a lot of different participants.
So long horizon reinforcement learning, it basically is focusing on outcomes over time rather than just like one-off responses.
And multi-agent reinforcement learning trains a system to operate in environments where
There's multiple humans and AIs are all kind of interacting simultaneously.
Right.
So both of these approaches, I think, are getting a lot of traction in academic research.
The field is kind of pushing beyond just chatbots and kind of pushing towards systems that are capable of coordination and sustained action.
So if they can come out on the right side of of a really powerful kind of like agent coordination tool, I think that they will build something that a lot of people want.
they one of their co founders, he said, the model needs to remember things about itself and about you, the better its memory, the better its understanding.
So I do think that's true.
But I also do think a lot of people are building memory and doing a lot of things there.
So they're not the only ones.
I think they're
You know, there's a lot of optimism in this company.
I mean, they were just able to raise almost half a billion dollars, but I think there still is a lot of risks.
Training and scaling a brand new model is very expensive, right?