AI Post Transformers

Technology

Feed Update Issues

We're having trouble fetching new episodes from this podcast's RSS feed. Last successful update was 2026-03-06 15:19:12.530470. This podcast may be geo-restricted.

Episodes

Showing 201-300 of 458
«« ← Prev Page 3 of 5 Next → »»

MASA: Meta-Awareness via Self-Alignment Reinforcement Learning

26 Oct 2025

Contributed by Lukas

The September 26, 2025 paper introduces a novel reinforcement learning framework called **Meta-Awareness via Self-Alignment (MASA)**, designed to enha...

LLMs Learning from Verbal Feedback Without Scalar Rewards

26 Oct 2025

Contributed by Lukas

The September 25, 2025 paper, a collaboration between Sea AI Lab, SUTD, NUS, NTU, and the University of Waterloo, proposes an alternative to traditional Reinfo...

Lp-Reg: Low-Probability Tokens Sustain RL Exploration

26 Oct 2025

Contributed by Lukas

The October 3, 2025 paper by Tencent introduces a reinforcement learning technique called **Low-probability Regularization (Lp-Reg)** designed to over...

REFRAG: v2 paper: Efficient RAG Decoding via Context Compression

22 Oct 2025

Contributed by Lukas

The Meta Superintelligence Labs team, in collaboration with Rice University and the National University of Singapore, has followed up with a version 2 of t...

RoBERTa: Robustly Optimized BERT Pretraining Approach

22 Oct 2025

Contributed by Lukas

The July 2019 paper introduces **RoBERTa**, a **robustly optimized BERT pretraining approach**, which is a refined version of the original BERT model....

LightMem: Lightweight Efficient Memory-Augmented Generation

22 Oct 2025

Contributed by Lukas

The October 21, 2025 academic paper introduces **LightMem**, a novel and efficient memory-augmented generation framework designed to enhance Large Lan...

RAG-Anything: Unified Multimodal Knowledge Retrieval Framework

22 Oct 2025

Contributed by Lukas

The October 14, 2025 paper introduces **RAG-Anything**, a novel and unified framework for **Retrieval-Augmented Generation (RAG)** designed to overcom...

Elastic-Cache: Adaptive KV Caching for Diffusion LLMs

22 Oct 2025

Contributed by Lukas

The October 16, 2025 academic paper introduces **Elastic-Cache**, an innovative, training-free strategy designed to significantly accelerate the infer...

LLM-Guided Hierarchical Retrieval: The LATTICE Framework

22 Oct 2025

Contributed by Lukas

The October 15, 2025 paper details a novel information retrieval framework called **LATTICE**, which uses a Large Language Model (LLM) to perform **hi...

In-Context Learning as Implicit Learning Algorithms

22 Oct 2025

Contributed by Lukas

The May 17, 2023 academic paper explores the nature of **in-context learning (ICL)** in neural sequence models, particularly transformers, by investig...

Dr.LLM: Dynamic Layer Routing in LLMs

22 Oct 2025

Contributed by Lukas

The October 14, 2025 paper introduces **Dr.LLM**, a novel, retrofittable framework designed to improve the effici...

A Psychometric Framework for Artificial General Intelligence

22 Oct 2025

Contributed by Lukas

This large collaboration between 29 different institutions proposes a quantifiable framework for defining **Artificial General Intelligence (AGI)**, c...

EssenceBench: Compressing LLM Benchmarks via Redundancy and Genetic Algorithm

22 Oct 2025

Contributed by Lukas

The October 12, 2025 paper introduces **EssenceBench**, a novel methodology for **compressing large language model (LLM) benchmarks** while preserving...

Inheritune: Efficient LLM Training via Attention Collapse

22 Oct 2025

Contributed by Lukas

This June 8, 2025 paper, a collaboration between the University of Texas and NYU, describes a newly identified structural inefficiency in Large Language Model...

Structural Understanding of LLM Overthinking

22 Oct 2025

Contributed by Lukas

The October 10, 2025 academic paper from Google DeepMind and the University of Michigan investigates **"overthinking" in large language models (LLMs)*...

Geometric Flows of Logic in LLM Representation Space

18 Oct 2025

Contributed by Lukas

The October 10, 2025 Duke University academic paper introduces a **novel geometric framework** that views Large Language Model (LLM) reasoning as cont...

Mojo: Performance-Portable HPC Kernels on GPUs

18 Oct 2025

Contributed by Lukas

The September 25, 2025 academic paper **evaluates the performance and portability** of the novel **Mojo programming language** for high-performance com...

Scaling Reinforcement Learning Compute for LLMs

17 Oct 2025

Contributed by Lukas

This October 15, 2025 collaboration between Meta, UT Austin, UCL, UC Berkeley, Harvard University, and Periodic Labs details a systematic study on sca...

HBF: High Bandwidth Flash for AI Inferencing

15 Oct 2025

Contributed by Lukas

These sources and a patent discuss **SanDisk's development of High Bandwidth Flash (HBF)**, a technology designed to address the significant memory and ...

Architectural Migration to Multi-head Latent Attention

15 Oct 2025

Contributed by Lukas

The sources detail a novel method called **MHA2MLA** (Multi-Head Attention to Multi-Head Latent Attention), which efficiently adapts pre-trained large...

COPA: Composable On-Package GPU Architecture for Domain Specialization

15 Oct 2025

Contributed by Lukas

This April 2021 academic paper from **NVIDIA** discusses the challenge of designing **converged GPUs** that efficiently handle the diverging architect...

Performance of Confidential Computing for Large Language Models

11 Oct 2025

Contributed by Lukas

These sources collectively discuss advancements in **scalable, efficient, and secure machine learning (ML) data systems**, often within the context of...

Google: Confidential Computing with Accelerated AI Workloads on GCE

11 Oct 2025

Contributed by Lukas

The provided sources are a collection of Google Cloud documentation and blog excerpts detailing the features and implementation of **Confidential Comp...

AWS: Nitro System: Security, Enclaves, and Generative AI

11 Oct 2025

Contributed by Lukas

These sources provide an extensive overview of **AWS Nitro Enclaves**, an isolated compute environment designed to protect highly sensitive data withi...

Anthropic: Confidential Inference via Trusted Virtual Machines

11 Oct 2025

Contributed by Lukas

These sources, an announcement from Anthropic and a technical whitepaper co-authored with Pattern Labs, provide an **overview of Confidential Inferenc...

RAND: Securing AI Model Weights: Preventing Theft and Misuse

11 Oct 2025

Contributed by Lukas

The provided texts are excerpts from a **RAND Corporation research report** titled "Securing AI Model Weights: Preventing Theft and Misuse of Frontier...

Training-Free GRPO: Policy Optimization via Context Space

11 Oct 2025

Contributed by Lukas

The October 9, 2025 paper from **Tencent Youtu Lab** introduces **Training-Free Group Relative Policy Optimization (Training-Free GRPO)**, a novel met...

Multi-Agent Tool-Integrated Policy Optimization (MATPO)

11 Oct 2025

Contributed by Lukas

The October 6, 2025 paper introduces **Multi-Agent Tool-Integrated Policy Optimization (MATPO)**, a novel reinforcement learning framework designed to...

UniVideo: Unified Video Understanding, Generation, and Editing

11 Oct 2025

Contributed by Lukas

The October 9, 2025 paper details the architecture, training, and evaluation of **UniVideo**, a unified multimodal generative system capable of **hand...

Dragon Hatchling: Brain-Inspired AI Architecture

10 Oct 2025

Contributed by Lukas

This September 30, 2025 paper details research into **Brain Dynamics Hypothesis (BDH)** models, particularly the **BDH-GPU** architecture, which propos...

AGENTFLOW: In-the-Flow Agentic System Optimization

10 Oct 2025

Contributed by Lukas

The October 7, 2025 paper, a collaboration between Stanford University, Texas A&M University, UC San Diego, and Lambda, introduces **AGENTFLOW**, a no...

Less is More: Recursive Reasoning with Tiny Networks

10 Oct 2025

Contributed by Lukas

This October 6, 2025 paper from Alexia Jolicoeur-Martineau at Samsung SAIL Montréal provides an overview and detailed comparison of two recurrent re...

Early Experience for Language Agent Improvement

10 Oct 2025

Contributed by Lukas

This October 10, 2025 academic paper, a collaboration between Meta Superintelligence Labs, FAIR at Meta, and The Ohio State University, proposes and...

Petri: Accelerating AI Safety Auditing

10 Oct 2025

Contributed by Lukas

On October 6, 2025, Anthropic introduced **Petri (Parallel Exploration Tool for Risky Interactions)**, an open-source framework developed for automated...

Agentic Context Engineering: Evolving Contexts for Self-Improving LLMs

10 Oct 2025

Contributed by Lukas

The October 6, 2025 paper introduces **Agentic Context Engineering (ACE)**, a novel framework designed to enhance the performance of Large Language Mo...

CLUE: Hidden-State Clustering for Non-parametric Verification

10 Oct 2025

Contributed by Lukas

The October 2, 2025 technical report from **Tencent AI Lab** introduces **CLUE (Clustering and Experience-based Verification)**, a novel, non-parametr...

Low-Precision Transformer Failure in Flash Attention

10 Oct 2025

Contributed by Lukas

This October 5, 2025 paper presents the first mechanistic explanation for a persistent **training instability** experienced when using **low-precision ...

Paris: Decentralized Open-Weight Diffusion Model

08 Oct 2025

Contributed by Lukas

The October 2025 paper introduces **Paris**, a novel open-weight diffusion model for text-to-image generation that was trained using a completely **de...

DC-VideoGen: Efficient Video Generation with Deep Compression

08 Oct 2025

Contributed by Lukas

The September 29, 2025 paper introduces **DC-VideoGen**, a new post-training framework designed to significantly accelerate video diffusion models and ...

GNN101: Visual Learning of Graph Neural Networks

08 Oct 2025

Contributed by Lukas

The November 2024 paper introduces **GNN101**, an open-source, web-based interactive visualization tool designed to help non-experts learn about **Gra...

Reactive Transformer: Stateful Real-Time Language Models

08 Oct 2025

Contributed by Lukas

The October 2025 paper introduces the **Reactive Transformer (RxT)**, a novel neural network architecture designed by Adam Filipek and Reactive AI to ...

Imperceptible Jailbreaking Against Large Language Models

08 Oct 2025

Contributed by Lukas

The October 2025 academic paper introduces a novel **imperceptible jailbreaking attack** against Large Language Models (LLMs) that exploits Unicode **...

ACON: Optimizing Context Compression for LLM Agents

08 Oct 2025

Contributed by Lukas

The October 2025 paper provides an overview of **Agent Context Optimization (ACON)**, a novel framework designed to enhance the efficiency and performa...

CoDA: Collaborative Multi-Agent Data Visualization

08 Oct 2025

Contributed by Lukas

The October 2025 paper introduces **CoDA (Collaborative Data-visualization Agents)**, a novel multi-agent system designed to automate complex data vis...

RECAP: Safety Alignment via Counter-Aligned Prefilling

08 Oct 2025

Contributed by Lukas

The October 2025 academic paper introduces **RECAP (Robust Safety Alignment via Counter-Aligned Prefilling)**, a novel reinforcement learning (RL) met...

ONNX Ecosystem, Optimization, and Deployment

08 Oct 2025

Contributed by Lukas

The provided sources center on the **Open Neural Network Exchange (ONNX)** format and its inference engine, **ONNX Runtime**, highlighting their role ...

Emergent Abilities of Large Language Models

08 Oct 2025

Contributed by Lukas

The sources (October 2022, March 2025) provide an extensive examination of **emergent abilities** in large language models (LLMs), defining them as un...

Implicit Dynamics of In-Context Learning

08 Oct 2025

Contributed by Lukas

This July 2025 research paper explores **In-Context Learning (ICL)** in Large Language Models (LLMs), which is the striking ability of these models to...

Contextual Blocks: Implicit Weight Updates and Federated Learning

08 Oct 2025

Contributed by Lukas

We compare and contrast the math behind two recent research papers which we have covered individually before on this podcast: July 2025: Learning withou...

MotionRAG: Retrieval-Augmented Image-to-Video Generation

08 Oct 2025

Contributed by Lukas

The September 2025 paper introduces **MotionRAG**, a novel retrieval-augmented framework designed to enhance motion realism in image-to-video generati...

NIST Evaluation of DeepSeek AI Models

08 Oct 2025

Contributed by Lukas

The provided text is an excerpt from a **technical evaluation report** conducted by the Center for AI Standards and Innovation (CAISI), housed within ...

Test-Time Reinforcement Learning for LLMs

08 Oct 2025

Contributed by Lukas

This June 2025 paper introduces a novel methodology called **Test-Time Reinforcement Learning (TTRL)**, which enables Large Language Models (LLMs) to ...

LongCodeZip: Compress Long Code Context for LLMs

08 Oct 2025

Contributed by Lukas

The October 2025 paper introduces **LongCodeZip**, a novel, training-free, and model-agnostic framework designed for **compressing long code contexts*...

ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory

08 Oct 2025

Contributed by Lukas

The September 2025 paper introduces **ReasoningBank**, a novel memory framework designed to enhance Large Language Model (LLM) agents by distilling an...

Analog In-Memory Attention for Energy-Efficient LLMs

08 Oct 2025

Contributed by Lukas

This November 2024 paper and a new analysis from September 2025 provide a comprehensive overview of a novel **Analog In-Memory Computing (AIMC)** architec...

Regression Language Models for Code Metrics

03 Oct 2025

Contributed by Lukas

This September 30, 2025 academic paper introduces Regression Language Models (RLMs) as a unified method for code-to-metric regression, which is the ta...

Introducing RTEB: Retrieval Embedding Benchmark

03 Oct 2025

Contributed by Lukas

The text introduces the **Retrieval Embedding Benchmark (RTEB)**, a new standard designed to accurately evaluate the **retrieval accuracy of embedding...

CUDA Unified Memory and Heterogeneous Memory Management

02 Oct 2025

Contributed by Lukas

The provided sources offer a comprehensive look at memory management for GPU-accelerated computing, focusing heavily on **Heterogeneous Memory Managem...

Moravec's Paradox and AI Automation Limits

01 Oct 2025

Contributed by Lukas

These two 2025 research papers together examine **Moravec's Paradox**, which posits that skills effortless for humans (like perception and mobi...

Characterizing LLM KV Cache Workloads in Production

01 Oct 2025

Contributed by Lukas

The June 2025 paper characterizes and optimizes the **Key-Value Cache (KV$)** workload patterns associated with serving large language models (LLMs) a...

BurstGPT: A Real-World LLM Serving Workload Dataset

01 Oct 2025

Contributed by Lukas

The May 2025 academic paper introduces **BurstGPT**, a novel, real-world workload dataset consisting of over ten million traces from regional Azure Op...

Qwen3-Next & Qwen3-Omni technical report

30 Sep 2025

Contributed by Lukas

These May and September 2025 technical reports introduce and evaluate two distinct but related large language models: the **Qwen3 family** and the **Q...

Variational Reasoning Framework for Language Models

29 Sep 2025

Contributed by Lukas

This September 26, 2025 paper introduces a variational reasoning framework designed to enhance the reasoning cap...

Federated Learning with Soft Embeddings for Retrieval

27 Sep 2025

Contributed by Lukas

This September 20, 2025 paper introduces a novel, efficient architecture for training **retrieval models** used in retrieval-augmented generation (RAG) ...

Schoenfeld Theory Applied to Large Reasoning Models

27 Sep 2025

Contributed by Lukas

This September 18, 2025 paper introduces a research project that applies **Schoenfeld’s Episode Theory**, a classic cognitive framework for analyzing...

CWM: Code Generation with World Models

27 Sep 2025

Contributed by Lukas

This September 24, 2025 paper from Meta provides an extensive overview of **Code World Model (CWM)**, a 32-billion-parameter dense decoder-only Transformer ...

EmbeddingGemma: Powerful Lightweight Text Representations

26 Sep 2025

Contributed by Lukas

The September 24, 2025 paper introduces **EmbeddingGemma**, a novel, lightweight text embedding model developed by **Google DeepMind**, built upon the ...

CE-GPPO: Controlling Entropy via Gradient-Preserving Policy Optimization

26 Sep 2025

Contributed by Lukas

The September 25, 2025 paper introduces a novel reinforcement learning (RL) algorithm, **Controlling Entropy via Gradient-Preserving Policy Optimizatio...

Seedream 4.0: Multimodal Image Generation System

26 Sep 2025

Contributed by Lukas

The September 24, 2025 paper is a technical report from **ByteDance Seed** detailing the **Seedream 4.0** system, an advanced multimodal image generati...

Tree-based Group Policy Optimization for LLM Agents

26 Sep 2025

Contributed by Lukas

The September 25, 2025 paper introduces **Tree-based Group Relative Policy Optimization (Tree-GRPO)**, a new reinforcement learning (RL) method designe...

GDPval: Measuring AI Performance on Real-World Work

26 Sep 2025

Contributed by Lukas

The sources, dated September 25, 2025, introduce **GDPval**, a novel benchmark created by OpenAI to evaluate the performance of **AI models** on **econom...

Adaptive Compression Techniques for Efficient LLM Inference

20 Sep 2025

Contributed by Lukas

These 14 research papers provide an overview of various **compression techniques for Large Language Models (LLMs)**, primarily focusing on **reducing ...

LLM-I: Interleaved Multimodal Creators via Tool-Use

20 Sep 2025

Contributed by Lukas

The September 2025 academic paper introduces **LLM-Interleaved (LLM-I)**, a novel, flexible framework for interleaved image-text generation that refra...

Evolving Language Models Without Labels: EVOL-RL

19 Sep 2025

Contributed by Lukas

This September 2025 source is a research paper from Tencent AI Lab and academic collaborators that introduces EVOL-RL, an Evolution-Oriented ...

SearchInstruct: Instruction Tuning with Dynamic Retrieval

19 Sep 2025

Contributed by Lukas

This September 2025 paper introduces SearchInstruct, a novel framework designed to enhance Supervised Fine-Tuning (SFT) of large language models (LLMs...

THOR: Hierarchical RL for Mathematical Reasoning

19 Sep 2025

Contributed by Lukas

This September 2025 paper describes THOR (Tool-Integrated Hierarchical Optimization via RL), a novel approach designed to enhance the mathematical re...

The Uneven Diffusion of AI Adoption

19 Sep 2025

Contributed by Lukas

The "Anthropic Economic Index report" documents the rapid and uneven adoption of Artificial Intelligence (AI), specifically using data from the compan...

FlowRL: Distribution Matching for LLM Reasoning

19 Sep 2025

Contributed by Lukas

This September 2025 paper introduces FlowRL, a novel reinforcement learning (RL) algorithm for large language models (LLMs) that shifts the optimizat...

Single-stream Policy Optimization for LLMs

19 Sep 2025

Contributed by Lukas

This September 2025 paper introduces Single-stream Policy Optimization (SPO), a new reinforcement learning algorithm for training Large Language Mode...

Pre-computing & reusing KV caches to accelerate RAG inference

18 Sep 2025

Contributed by Lukas

How can pre-computing and reusing Key-Value (KV) caches accelerate inference for Retrieval-Augmented Generation and other long-context LLM tasks? The p...

REFRAG: Rethinking RAG-based Decoding

18 Sep 2025

Contributed by Lukas

This September 2025 academic paper, titled "REFRAG: Rethinking RAG based Decoding," appears on the alphaXiv pre-print server. It focuses on Reframing ...

DeepSeek-R1: Reinforcing LLM Reasoning Through Self-Evolution

18 Sep 2025

Contributed by Lukas

This paper, "DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning," published in Nature on September 17, 2025, details the develop...

ShadowKV: High-Throughput Long-Context LLM Inference

17 Sep 2025

Contributed by Lukas

This April 2025 paper introduces ShadowKV, an innovative inference system for long-context Large Language Models (LLMs) designed to significantly e...

TailorKV: Hybrid KV Cache Compression for LLMs

17 Sep 2025

Contributed by Lukas

This May 2025 paper introduces TailorKV, a novel hybrid framework designed to optimize Key-Value (KV) cache management in large language models (LLMs)...

MIRAGE: Optimizing LLM KV Cache with Parameter Remapping

17 Sep 2025

Contributed by Lukas

This July 2025 paper discusses advanced memory optimization techniques for Large Language Models (LLMs), particularly focusing on KV cache managemen...

WebSailor-V2: Bridging Proprietary Agents with Synthetic Data and RL

17 Sep 2025

Contributed by Lukas

This September 2025 paper introduces WebSailor-V2, an open-source deep research agent developed by Alibaba Group's Tongyi Lab. The paper details a ...

Dynamic Chunking for Hierarchical Sequence Modeling

17 Sep 2025

Contributed by Lukas

This July 2025 paper introduces Hierarchical Networks (H-Nets), a novel architecture designed to move beyond traditional tokenization in large langua...

LoFT: Parameter-Efficient Fine-Tuning for Long-tailed Semi-Supervised Learning

17 Sep 2025

Contributed by Lukas

This September 2025 paper introduces LoFT, a novel framework designed to improve Long-Tailed Semi-Supervised Learning (LTSSL) by leveraging paramet...

QuantAgent: Multi-Agent LLM for High-Frequency Trading

17 Sep 2025

Contributed by Lukas

This September 2025 paper describes QuantAgent, a novel multi-agent large language model (LLM) framework designed for high-frequency quantitative tra...

Infini-gram: Scaling Unbounded N-gram Language Models

17 Sep 2025

Contributed by Lukas

This April 2025 paper introduces Infini-gram, a novel engine designed to scale n-gram language models to an unprecedented 5 trillion tokens and sup...

Generalist Reward Modeling with Inference-Time Scaling

16 Sep 2025

Contributed by Lukas

This April 2025 paper introduces Self-Principled Critique Tuning (SPCT), a novel method designed to enhance the inference-time scalability of Gene...

Hierarchical Reasoning Model: Brain-Inspired AI for Complex Tasks

16 Sep 2025

Contributed by Lukas

This August 2025 paper introduces the Hierarchical Reasoning Model (HRM), a novel AI architecture inspired by the human brain's hierarchical and mult...

Native Sparse Attention: Efficient Long-Context LLMs

16 Sep 2025

Contributed by Lukas

This February 2025 paper introduces Native Sparse Attention (NSA), a novel approach to address the computational demands of long-context modeling in ...

CodeI/O: Reasoning Patterns Through Code Input-Output Prediction

16 Sep 2025

Contributed by Lukas

This February 2025 paper introduces CodeI/O, a novel training method for Large Language Models (LLMs) that enhances general reasoning abilities by t...

Janus-Pro: Unified Multimodal AI with Scaled Improvements

16 Sep 2025

Contributed by Lukas

This January 2025 paper introduces Janus-Pro, an enhanced artificial intelligence model for multimodal understanding and generation. It builds upon ...

Federated Post-Training LLMs: An Accessibility and Efficiency Survey

16 Sep 2025

Contributed by Lukas

This August 2025 paper examines the evolving landscape of Federated Large Language Models (FedLLM), focusing on how large language models are post-t...

Non-Penetrative Tensor Partitioning for Collaborative AIoT Inference

16 Sep 2025

Contributed by Lukas

This June 2025 paper introduces Non-Penetrative Tensor Partitioning (NPTP), a novel method designed to improve the speed of collaborative inference fo...

Collaborative Edge Inference with Dynamic Task Offloading and Early Exiting

16 Sep 2025

Contributed by Lukas

This December 2024 paper introduces a collaborative inference framework designed for large-scale models in 5G smart city edge computing environmen...

Adaptive LLM Partitioning for Edge Inference

16 Sep 2025

Contributed by Lukas

This May 2025 paper introduces a resource-aware algorithm designed to optimize the performance of Large Language Models (LLMs) for low-latency inferen...

UQ: Unsolved Questions for Language Models

16 Sep 2025

Contributed by Lukas

This August 2025 paper introduces UQ, a novel evaluation framework designed to challenge large language models (LLMs) with complex, unsolved questions...
