Peter McCrory
I have found that even with Opus 4.5, there's still some level of oversight and quality attention that I need to maintain, but I actually worry a lot less about the implementation steps.
And so I'll give a somewhat concrete example: an analysis that I've been wanting to do for a while but haven't had the time to do.
It touches on this idea of how AI exposure across jobs, as commonly constructed, relates to business cycle sensitivity.
Is it the case that workers who have high AI exposure are the ones who might have higher unemployment when the labor market slows down?
For some time now, I've been giving Claude, on a sort of separate server where it runs autonomously, this exercise of downloading some research papers, replicating some prior reports, and then coming up with this new analysis that I wanted to do.
Before Opus 4.5, I had low confidence that this would work.
And then when I gave the same task to Opus 4.5, I discovered that it was able to move reasonably proficiently at implementing even somewhat ambiguous directives.
But I still had a lot of back and forth.
It was like having a very capable research assistant, much like the job that I had when I was fresh out of college working at the St. Louis Fed.
That type of implementation capacity, the ability to understand how to set up and run regressions, how to download data through writing a new API: Opus 4.5 was able to do all of that and even write up a methodological report as documentation.
I suspect that we will see the level of managerial oversight move to a higher level as people begin to delegate lower-level, more straightforward, but nevertheless sometimes ambiguous implementation steps.
It's a great question.
And in the report, the 50% at around 17 hours, or roughly 20 hours, is an extrapolation of the linear fit that we document.
So I don't think we actually see those very long tasks that you're describing.
It's just sort of the implication of the data that we do see.