Sholto Douglas
Yeah, theory of mind.
Like, is that the whole thing that could cause deception, or is it just one instance of it?
Second of all, are your labels correct?
You know, maybe you thought this wasn't deceptive, but it's actually still deceptive.
Especially if it's producing output you can't understand.
Third, is the thing that's going to be the bad outcome something that's even human-understandable?
Like deception is a concept we can understand.
Maybe there's like a...
There's a separate question of whether such a representation exists, which it seems like it must, or actually, I'm not sure if that's the case.
And secondly, whether, using this sparse autoencoder setup, you could find it.
And in this case, if you don't have labels that are adequate to represent it, you wouldn't find it, right?
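A minimal sketch of the kind of sparse autoencoder setup being referenced here, under standard assumptions: train an overcomplete autoencoder with an L1 sparsity penalty on a model's internal activations to learn a dictionary of features, then check which learned features fire on examples you have labeled (e.g. as deceptive). The dimensions, sparsity coefficient, and random stand-in data below are illustrative assumptions, not details from the conversation.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Overcomplete autoencoder for dictionary learning on model activations."""
    def __init__(self, act_dim: int, dict_size: int):
        super().__init__()
        self.encoder = nn.Linear(act_dim, dict_size)
        self.decoder = nn.Linear(dict_size, act_dim)

    def forward(self, acts: torch.Tensor):
        # ReLU keeps feature activations non-negative, which pairs with the L1 penalty.
        features = torch.relu(self.encoder(acts))
        recon = self.decoder(features)
        return recon, features

def train_step(sae, opt, acts, sparsity_coef=1e-3):
    recon, features = sae(acts)
    recon_loss = ((recon - acts) ** 2).mean()
    # L1 penalty pushes most feature activations to zero, encouraging sparse, interpretable features.
    sparsity_loss = features.abs().mean()
    loss = recon_loss + sparsity_coef * sparsity_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage: random tensors stand in for a language model's residual-stream activations.
sae = SparseAutoencoder(act_dim=512, dict_size=4096)
opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
for _ in range(100):
    batch = torch.randn(64, 512)  # replace with real activations collected from a model
    train_step(sae, opt, batch)
```

After training, the labels question comes back in: to call a particular feature a "deception" feature, you still need labeled examples to check which features activate on them.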
This is sort of a tangent, but one interesting idea I heard was: if that space is shared between models, you can imagine trying to find it in an open-source model to then... Like, Gemma, by the way, is Google's newly released open-source model.
They said in the paper that it's trained using the same architecture, or similar methods, or whatever, as Gemini.
To be honest, I don't know, because I haven't read the Gemma paper.
So to the extent that's true,
I don't know how much of the red teaming you do on Gemma is potentially helping you jailbreak into Gemini.
But by the way, this is another tangent.
To the extent that that's true, and I guess there's evidence that that's true, why doesn't curriculum learning work?