Kevin Weil
I submitted it.
An actual legitimate engineer looked at it and said, yeah, this looks right.
And now there are a few lines of code shipping today that came from me using Codex.
It speaks to the power of this thing when you can just have this software agent off actually solving real-world tasks for you.
And in the meantime, I was writing email and following up on Slack and doing all the things that I do in my day job.
So it was just purely additive, which I think is really cool.
Yeah, it's pretty meaningful and it's increasing quickly.
The cool thing is you can fire off 10 of these tasks at once, right?
So we try to actually give you the value of all this parallelism, where it's not just that you can do one thing: if you have a Codex agent working for you, why not have 10 Codex agents working for you on 10 different tasks?
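As a rough illustration of the "10 agents on 10 tasks" idea, here is a minimal Python sketch of dispatching several tasks concurrently. It is not the Codex API: `run_agent_task` is a hypothetical stand-in that only simulates an agent working in the background, and the task names are made up.

```python
import asyncio


async def run_agent_task(task: str) -> str:
    """Hypothetical stand-in for handing one task to a coding agent."""
    await asyncio.sleep(0.01)  # simulates the agent working in the background
    return f"done: {task}"


async def run_in_parallel(tasks: list[str]) -> list[str]:
    # gather() runs the coroutines concurrently and returns results in input order
    return await asyncio.gather(*(run_agent_task(t) for t in tasks))


if __name__ == "__main__":
    tasks = [f"task {i}" for i in range(10)]  # illustrative task names
    for result in asyncio.run(run_in_parallel(tasks)):
        print(result)
```

Because the tasks run concurrently, the whole batch takes roughly as long as the slowest single task rather than the sum of all ten.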
And by the way, just to connect it to the previous topic on evals: there's a really important kind of subtlety to them too, where they have to be tailored to the product that you're trying to build and the problem that you're trying to solve.
Coding isn't one thing.
Coding by itself is just a small vertical of the entire world.
But even within coding, you can be good at lots of different kinds of coding.
And with Codex, that was a great example of going and saying, okay, what kinds of coding really matter to us?
Of all the tasks that a developer does, what kinds of tasks do we really want to be good at?
And we created evals for those.
And then we made sure to monitor, as we trained the model: is it getting better and better at these?
And you go and accumulate tasks and examples for the model to learn from, but you do it against a specific set of evals that correspond to a specific set of problems you want to solve.
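To make the idea of task-specific evals tracked during training concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the suite contents, the two toy "checkpoints", and the helper names `pass_rate`, `score_checkpoint`, and `regressions` are hypothetical, and exact-match scoring is a simplification rather than a description of how Codex was actually evaluated.

```python
from typing import Callable

EvalCase = tuple[str, str]  # (input, expected output)

# Hypothetical suite: one named eval per kind of task we care about
SUITE: dict[str, list[EvalCase]] = {
    "uppercase": [("a", "A"), ("b", "B")],
    "reverse": [("ab", "ba")],
}


def pass_rate(model: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Fraction of cases where the model output exactly matches the expected one."""
    return sum(model(x) == y for x, y in cases) / len(cases)


def score_checkpoint(model: Callable[[str], str],
                     suite: dict[str, list[EvalCase]]) -> dict[str, float]:
    """Score one checkpoint on every eval in the suite."""
    return {name: pass_rate(model, cases) for name, cases in suite.items()}


def regressions(prev: dict[str, float], curr: dict[str, float]) -> list[str]:
    """Names of evals whose score dropped between two checkpoints."""
    return sorted(name for name in curr if curr[name] < prev.get(name, 0.0))


# Toy stand-ins for two training checkpoints
def checkpoint_1(x: str) -> str:
    return x.upper()


def checkpoint_2(x: str) -> str:
    return x[::-1]


if __name__ == "__main__":
    before = score_checkpoint(checkpoint_1, SUITE)
    after = score_checkpoint(checkpoint_2, SUITE)
    print(before, after, regressions(before, after))
```

Scoring each kind of task separately, rather than averaging everything into one number, is what lets you see whether the model is improving on the specific problems you chose to target.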