Logan Kilpatrick
Podcast Appearances
probably as early innings as possible of that, like, Genie world-model version of what's possible.
Right.
But even still, I think there's lots of cool stuff happening in the AI gaming space. I talk to folks all the time, and people are using AI to build smart teammates and assistants, as one use case.
As an example, I see people doing this with AI-powered NPCs, which is really cool.
Right.
You can also just, if you've never built video games before... and as somebody who has dabbled in building video games, it's always the CS 101 problem that I think lots of people are super excited about.
You think you're going to go and learn CS so that you can make games.
And then you start making games and you realize, this is really difficult.
And, like, maybe I don't want to make games. My little brother and I have this conversation all the time, because he studied computer science and everything because he wanted to make games, and as he started making games he realized it's actually not that much fun. So I am optimistic for the world where AI is this interface that lets people who want to create, create. Video games specifically are one example of the sort of outcome you'd want. But we're definitely in the early stages of that story.
It's super cool.
I think Game Arena, and I'll credit it to the Kaggle team, and obviously DeepMind's been collaborating with them on this, but they built the infrastructure to do this and put together the arena and everything like that.
And it's a great example of this...
There were a couple of different dimensions to the goal for Game Arena.
One, it's just cool to see models play games.
And there were lots of people, like Magnus Carlsen and others, commentating and watching all the chess games specifically that were happening, which is cool.
There's definitely an entertainment value to it. But two, there's this thread around evals. For folks who haven't built evals or aren't following this closely, the challenge with evals right now is that saturation happens so quickly.
Even, you know, Humanity's Last Exam, and I won't go deep on what it's actually testing, was designed to be a really rigorous and difficult eval that it would take models a long time to actually solve.
And I think already today you're seeing models jump from zero or one percent to 40 or 50 percent on the order of.