Nathaniel Whittemore

The rate at which Anthropic staff correct, redirect, or take over mid-task from Claude has been falling steadily for a year, including on the most complex and open-ended tasks.

1108.486

The AI Daily Brief: Artificial Intelligence News and Analysis

What OpenAI and Anthropic Think Happens Next With AI

This means problems with no clear specification, where the engineer isn't sure what the answer looks like.

1116.714

As evidence, they point to a chart of Claude Code's session success rate across trivial tasks, routine tasks, substantial tasks, and open-ended problems. The success rate in all four categories has climbed well above 60%, and for trivial, routine, and substantial tasks well above 80%, from a much lower level less than a year ago.

1121.358

They also note that the way Claude interacts with the codebase is changing.

1138.189

Claude, they write, is getting better at proposing its own experiments.

1143.599

They point to research published in April of this year that explored whether a weaker AI could manage a stronger AI.

1147.004

The evidence suggests that the human role is narrowing at each step in the AI development process.

1153.373

Once human and AI-authored code quality reach parity, humans will stop writing code entirely and shift to only reviewing it.

1157.9

But if they can't review code as quickly as Claude can generate it, human review will become the bottleneck to AI development.

1163.508

Similarly, once Claude can run experiments, the question shifts towards which of those experiments is worth running.

1168.675

Put simply, the doing (writing the code, running the experiment, producing the result) now costs almost nothing in human time, even if it still has costs in compute.

1174.101

An area of human comparative advantage, for now, is research taste and judgment, including choosing which problems matter, which results to trust, and when an approach is a dead end.

1182.731

Indeed, they continue, "...the work that is still in human hands, choosing which problems to work on, is what matters most."

1192.001

Without that judgment, Claude is a capable assistant, but not a system that could drive AI progress on its own.

1197.447

They write that it's genuinely unclear whether today's training methods and architectures could unlock that capacity.

1202.879