Manas Joglekar

👤 Speaker

168 total appearances

Podcast Appearances

"Why we are excited about confession!" by boazbarak, Gabriel Wu, Manas Joglekar

Confessions, by their nature, are retrospective and serve to report on misalignment rather than preventing it in the first place.

995.204 View full episode →

LessWrong (Curated & Popular)

"Why we are excited about confession!" by boazbarak, Gabriel Wu, Manas Joglekar

We are also excited to explore more high-compute interventions aimed at improving alignment in the model's main outputs.

1002.828 View full episode →

LessWrong (Curated & Popular)

"Why we are excited about confession!" by boazbarak, Gabriel Wu, Manas Joglekar

Heading: Acknowledgements. Thank you to Bowen Baker for his thoughtful comments.

1009.905 View full episode →

LessWrong (Curated & Popular)

"Why we are excited about confession!" by boazbarak, Gabriel Wu, Manas Joglekar

Additionally, thank you to Cameron Raymond, Marcus Williams, Rose Wang,

1016.521 View full episode →

LessWrong (Curated & Popular)

"Why we are excited about confession!" by boazbarak, Gabriel Wu, Manas Joglekar

Adam Collet, Kai Chen, Erin Cavanaugh, Ishan Singhal, and Apollo Research for designing some of the evaluations for reward hacking, hallucinations, instruction hierarchy, instruction following, and scheming we used in our work.

1020.952 View full episode →

LessWrong (Curated & Popular)

"Why we are excited about confession!" by boazbarak, Gabriel Wu, Manas Joglekar

This article was narrated by Type 3 Audio for LessWrong.

1036.106 View full episode →

LessWrong (Curated & Popular)

"Why we are excited about confession!" by boazbarak, Gabriel Wu, Manas Joglekar

It was published on January 14, 2026.

1040.53 View full episode →

LessWrong (Curated & Popular)

"Why we are excited about confession!" by boazbarak, Gabriel Wu, Manas Joglekar

Images are included in the podcast episode description.

1043.153 View full episode →
