Sholto Douglas
Podcast Appearances
Just flip the sign.
And so once you get to the stage where models are capable of implementing one of no one's ideas, then you can just let them loose and let them build that intuition of how to do scientific discovery.
The key thing here again is the feedback loops.
I expect scientific areas where you are able to put it in a feedback loop to have eventually superhuman performance.
MARK MANDELBACHER- And again, software engineering is going to be the leading indicator of that, right?
Like over the next six months, like the remainder of the year, basically we're going to see progressively more and more experiments of the form of how can I dispatch work to a software engineering agent in such a way that it's async?
Claude 4's GitHub integration, where you can ask it to do things on GitHub, ask it to do pull requests, this kind of stuff that's coming up,
and OpenAI's Codex are examples of this, basically.
You can almost see this in the coding startups.
I think of this as a product exponential in some respects, where you need to be designing for a few months ahead of the model to make sure that the product you build is the right one.
And you saw last year, Cursor hit PMF with Claude 3.5 Sonnet.
They were around for a while before, but then the model was finally good enough that the vision they had of how people would program hit.
And then, Windsurf bet a little bit more aggressively even on the agenticness of the model, with longer running agentic workflows and this kind of stuff.
And I think that's when they began competing with Cursor, when they bet on that particular vision.
And the next one is you're not even in the loop, so to speak.
You're not in an IDE, but you're asking the model to go do work in the same way that you would ask someone on your team to go do work.
That is not quite ready yet.
Like there are still a lot of tasks where you need to be in the loop.
But the next six months looks like an exploration of like exactly what does that trend line look like?
And we're almost certainly, like, under-eliciting dramatically.