Menu
Sign In Search Podcasts Charts People & Topics Add Podcast API Pricing
Podcast Image

AI Breakdown

Arxiv paper - DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

27 Jan 2025

Description

In this episode, we discuss DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning by DeepSeek-AI. The paper introduces DeepSeek-R1-Zero, a reasoning model trained solely with large-scale reinforcement learning, which exhibits strong reasoning abilities but struggles with readability and language mixing. To overcome these limitations, the authors developed DeepSeek-R1 by adding multi-stage training and cold-start data, achieving performance on par with OpenAI’s models. Additionally, they open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six distilled dense models to support the research community.

Audio
Featured in this Episode

No persons identified in this episode.

Transcription

This episode hasn't been transcribed yet

Help us prioritize this episode for transcription by upvoting it.

0 upvotes
🗳️ Sign in to Upvote

Popular episodes get transcribed faster

Comments

There are no comments yet.

Please log in to write the first comment.