Corey Knowles
Podcast Appearances
So we saw you've tested 200-plus finance tasks with actual Wall Street experts grading GPT-5 and, I think it was, Gemini 2.5 Pro and Sonnet 4.5.
How did that go?
What surprised you there?
Anything?
That makes sense.
So I guess let's talk a little bit about, you know, kind of the business end here.
Should companies be investing in building their own RL environments and training their own models?
Or is that still a waste of resources in a lot of ways and maybe best left to frontier labs?
Yeah.
Okay.
You just told it poorly.
You know, something I've wondered for a long time is, I mean, I remember the idea of a prompt engineer became kind of meme-y.
However, I do genuinely believe that there is a difference in how you ask questions and that we're finding language to play an important role in that and the ability to get a different answer by understanding framing and how you've asked a thing.
And I often wonder if the field isn't so dense with software engineers that they're missing some of that.
So does this lead to then, say a company comes to you, and obviously we won't use names, but with the model, does this happen in a way that's like, here's this new model, it's really good at this, but it's pretty weak over here on this kind of task.
Can you help strengthen it there so it is more general?
Forever.
Yeah.