Trenton Bricken
His attention heads are querying his context, but then the way he's projecting his queries and keys into that space, and the way his MLPs are retrieving longer-term facts or modifying that information, is what allows him, in later layers, to do even more sophisticated queries and slowly reason through to a meaningful conclusion.
Something I think the people who aren't doing this research can overlook is that after the first layer of the model, every query, key, and value you're using for attention
comes from a combination of all the previous tokens.
So like my first layer, I'll query my previous tokens and just extract information from them.
But all of a sudden, let's say that I attended to tokens one, two, and four in equal amounts. Then the vector in my residual stream (assuming they each wrote out the same thing to their value vectors, but ignore that for a second) is a third of each of those.
And so when I'm querying in the future, my query is actually built from a third of each of those things.
But they might be written to different subspaces.
That's right.
Hypothetically, but they wouldn't have to.
And so you can recombine them and immediately, even by layer two and certainly by the deeper layers, just have these very rich vectors that are packing in a ton of information.
And the causal graph is literally over every single layer that happened in the past.
And that's what you're operating on.
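To make that concrete, here's a minimal numpy sketch of the idea; the tiny dimensions and the matrix names W_Q and W_V are purely illustrative assumptions, not any particular model's weights or code.

```python
# Toy sketch of the point above: after layer one, a position's residual
# stream is a weighted mix of the value vectors of the tokens it attended
# to, so the *next* layer's query is computed from that mix, not from any
# single token.
import numpy as np

rng = np.random.default_rng(0)
d_model = 8  # hypothetical tiny model width

# Residual-stream vectors for tokens 1, 2, and 4 (whatever earlier layers left there).
x1, x2, x4 = (rng.normal(size=d_model) for _ in range(3))

# Hypothetical projection matrices for one attention head.
W_V = rng.normal(size=(d_model, d_model))
W_Q = rng.normal(size=(d_model, d_model))

# Layer 1: attend to tokens 1, 2, and 4 in equal amounts and write their
# value vectors into this position's residual stream (output projection
# ignored, as in the "ignore that for a second" caveat above).
attn_weights = np.array([1 / 3, 1 / 3, 1 / 3])
values = np.stack([W_V @ x1, W_V @ x2, W_V @ x4])
residual = attn_weights @ values  # one third of each value vector

# Layer 2: the query at this position is a projection of that mixed vector,
# so it already blends information from tokens 1, 2, and 4.
query_layer2 = W_Q @ residual
print(query_layer2.shape)  # (8,)
```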
You can get an LLM to write it.
Or we could purposely exclude it, right?
Well, you'd need to scrub any discussion of it from Reddit or any other source, right?
So for me, this is a very legitimate response, right?
It's like, well, if you say humans are generally intelligent, then artificial general intelligences are no more capable or competent than we are.
I'm just worried about having that level of general intelligence in silicon, where you can then immediately clone hundreds of thousands of agents that don't need to sleep and can have super long context windows, and then they can start recursively improving, and then things get really scary.
So I think to answer your original question, yes, you're right.