Marc Brooker

Speaker
499 total appearances

Podcast Appearances

The Peterman Pod
AWS Distinguished Eng: Learning From 3000 Incidents And How Engineering Is Changing | Marc Brooker

It's more powerful than ever.

Fantastic.

But where you really want to spend the time of the deep experts on your team is, here's something unexpected or unusual that's happened in the system.

Let's deeply understand that and let's bring that knowledge back to both improving that system and communicating broadly to the company and the outside community what we've learned from that.

And so one of the most powerful things we do at AWS is we have this mechanism of a very broad weekly meeting where we all get together, engineers from across AWS, leaders, senior leaders from across AWS, and talk about COEs, these postmortems that we write, and what we can learn from them and how we can apply those lessons across the whole company.

And I think that particular mechanism, that particular kind of Wednesday morning meeting that we have, is one of the things that has been a core, almost causal factor behind AWS's success, because it has allowed us, and forced us, to spend leadership bandwidth, to spend expertise, to spend the time of our best engineers deeply understanding how our systems operate and why they operate the way they do.

And that level of being just extremely grounded in reality helps you design better products, helps you architect better systems, helps you think more clearly about the next round of things, helps you fix issues.

And so it's this fundamental kind of learning exercise.

It's a real blessing.

So I would recommend on call to anybody who wants to learn about the practice of distributed systems, and I would certainly recommend spending time reading COEs, reading postmortems, and deeply reflecting on not only what can we fix tactically, but what can we fix organizationally and strategically, and what kind of tools might need to exist to prevent this kind of thing happening again.

And you asked earlier about where do ideas come from?

This is another fantastic kind of flow of ideas of saying, wow, we seem to be solving this same problem over and over in different ways and getting it slightly wrong every time.

Can we extract a tool to do that?

Can we build a service around that?

Can we build a feature around that to make it easier for us to get right and easier for our customers to get right?

Yeah, and again, I think for me, it comes down to optimizing for finding the most important things to work on.

And if you aren't close to operating your actual system and you don't know how it's actually working,