
AI: post transformers

Language: en Technology
Last Checked: 2025-12-10 00:38:13.712526
Showing episodes 1 to 100 of 340 total

PageANN: Scalable Disk ANNS with Page-Aligned Graphs

07 Dec 2025

Contributed by Lukas

The research paper presents PageANN, a novel framework engineered to overcome the severe latency and...

NeurIPS 2025: Homogeneous Keys, Heterogeneous Values

04 Dec 2025

Contributed by Lukas

This research presents a novel method for efficient long-context modeling in Large Language Models (...

NeurIPS 2025: Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

29 Nov 2025

Contributed by Lukas

The research systematically investigates the effects of integrating various gating mechanisms into t...

NeurIPS 2025: Large Language Diffusion Models

29 Nov 2025

Contributed by Lukas

This research paper introduces LLaDA, an 8-billion parameter language model based on the masked diff...

NeurIPS 2025: Reinforcement Learning for Reasoning in Large Language Models with One Training Example

29 Nov 2025

Contributed by Lukas

This research examines the data efficiency of Reinforcement Learning with Verifiable Reward (RLVR) w...

NeurIPS 2025: Parallel Scaling Law for Language Models

29 Nov 2025

Contributed by Lukas

The research proposes Parallel Scaling (PARSCALE) as a novel, efficient strategy to enhance Large La...

NeurIPS 2025: SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data

29 Nov 2025

Contributed by Lukas

The academic paper introduces Self-play Reinforcement Learning (SeRL), a framework engineered to enh...

NeurIPS 2025: DYNAACT: Large Language Model Reasoning with Dynamic Action Spaces

29 Nov 2025

Contributed by Lukas

The provided text outlines DYNAACT, a new framework intended to enhance sequential reasoning in Larg...

NeurIPS 2025: KGGen: Extracting Knowledge Graphs from Plain Text with Language Models

29 Nov 2025

Contributed by Lukas

The academic paper introduces KGGen, a novel text-to-knowledge-graph generator designed to overcome ...

NeurIPS 2025: Self-Adapting Language Models

29 Nov 2025

Contributed by Lukas

The academic paper presents the Self-Adapting LLM (SEAL) framework, designed to allow large language...

NeurIPS 2025: Thinkless: LLM Learns When to Think

29 Nov 2025

Contributed by Lukas

The research introduces Thinkless, a framework designed to solve the computational inefficiency of L...

NeurIPS 2025: FlashBias: Fast Computation of Attention with Bias

29 Nov 2025

Contributed by Lukas

The source introduces FlashBias, an innovative algorithm designed to significantly accelerate the ef...

NeurIPS 2025: A-Mem: Agentic Memory for LLM Agents

29 Nov 2025

Contributed by Lukas

The source details the creation and evaluation of Agentic Memory (A-MEM), a novel memory system for ...

NeurIPS 2025: MoBA: Mixture of Block Attention for Long-Context LLMs

29 Nov 2025

Contributed by Lukas

This paper introduces Mixture of Block Attention (MoBA) to address the prohibitive quadratic computa...

NeurIPS 2025: Reward Reasoning Model

29 Nov 2025

Contributed by Lukas

The source details the development and evaluation of Reward Reasoning Models (RRMs), which are desig...

Anthropic: Disrupting the First AI-Orchestrated Cyber Espionage Campaign

27 Nov 2025

Contributed by Lukas

Anthropic released a detailed report outlining the detection and disruption of an advanced cyber esp...

Anthropic: reward hacking & misalignment & sabotage

22 Nov 2025

Contributed by Lukas

Anthropic’s research details how **realistic AI training processes can inadvertently create misali...

DeepSeek-OCR: Contexts Optical Compression

22 Nov 2025

Contributed by Lukas

The October 21, 2025 DeepSeek paper introduces **DeepSeek-OCR**, a Vision-Language Model (VLM) desig...

Neuromorphic computing: Brain-Inspired AI and Hardware

22 Nov 2025

Contributed by Lukas

These sources provide a comprehensive overview of **neuromorphic computing (NC)**, focusing heavily ...

Meta: SAM 3

20 Nov 2025

Contributed by Lukas

This November 18, 2025 Meta paper details the development, training, and evaluation of **Segment Anyt...

Mamba-360: State Space Models for Long Sequence Modeling

19 Nov 2025

Contributed by Lukas

The April 24, 2024 paper provides a comprehensive **survey of State Space Models (SSMs)**, outlining...

Mixture-of-Depths: Dynamic Compute Allocation in Transformers

19 Nov 2025

Contributed by Lukas

This April 4, 2024 Google DeepMind paper introduces the **Mixture-of-Depths (MoD)** transformer arc...

MLP Mixer Models

19 Nov 2025

Contributed by Lukas

These sources collectively explore the **MLP-Mixer architecture** and its numerous extensions across...

Marin: Open LLM Optimization & Diagnostics

19 Nov 2025

Contributed by Lukas

Marin is an open lab dedicated to the transparent research and development of foundation models (FMs...

vAttention Vs Strata: advanced GPU memory management

19 Nov 2025

Contributed by Lukas

We compare and contrast two advanced 2025 memory management and scheduling techniques for optimizing...

AMD: Instella: Fully Open Language Models with Stellar Performance

16 Nov 2025

Contributed by Lukas

The November 13, 2025 paper by AMD introduces **Instella**, a new family of **fully open-source** thr...

Mechanistic interpretability: Decoding the AI's Inner Logic: Circuits and Sparse Features

15 Nov 2025

Contributed by Lukas

Ten different sources are used in this episode which are excerpts from academic papers and technical...

Spectral Gap: Analysis of Attention Layers and Graph Transformers

10 Nov 2025

Contributed by Lukas

We review two papers on Spectral Gap, one 2021 and another from 2025. The first source presents the ...

CARTRIDGE: Efficient In-Context Learning via Distillation

10 Nov 2025

Contributed by Lukas

The June 13, 2025 joint collaboration between Stanford University, Caltech and University at Buffalo...

Metacognition and Skill Discovery in LLM Math Reasoning

10 Nov 2025

Contributed by Lukas

The May 20, 2024 academic paper explores the **metacognitive capabilities of Large Language Models (...

Context Distillation for Language Models

10 Nov 2025

Contributed by Lukas

These five papers from 2022 up to 2025 discuss various **knowledge distillation techniques** aimed a...

Tempo: SLO-Aware LLM Serving Maximizing Service Gain

10 Nov 2025

Contributed by Lukas

The April 24, 2025 academic paper introduces **Tempo**, a novel scheduling system designed to optimi...

LLM-AutoDiff: Auto-Differentiate Any LLM Workflow

10 Nov 2025

Contributed by Lukas

The January 30, 2025 paper introduces **LLM-AutoDiff**, a novel framework for **Automatic Prompt Eng...

Confucius: Intent-Driven Network Management with Multi-Agent LLMs

10 Nov 2025

Contributed by Lukas

The August 27, 2025 paper introduces **Confucius**, a novel multi-agent Large Language Model (LLM) f...

SYMPHONY: Memory Management for LLM Multi-Turn Inference

10 Nov 2025

Contributed by Lukas

The 2024 paper introduces **SYMPHONY**, a novel system designed to improve memory management and sch...

DSPy and TextGrad: Compiling Language Model Systems

10 Nov 2025

Contributed by Lukas

These two academic papers introduce novel programming models aimed at systematically optimizing comp...

Vidur: Simulation for Efficient LLM Inference Deployment

10 Nov 2025

Contributed by Lukas

The May 21, 2024 paper introduces **Vidur**, a new, high-fidelity simulation framework designed to o...

Continuous Autoregressive Language Models: CALM

10 Nov 2025

Contributed by Lukas

The October 31, 2025 paper introduces **Continuous Autoregressive Language Models (CALM)**, a new pa...

A Framework for LLM Application Safety Evaluation

10 Nov 2025

Contributed by Lukas

The July 13, 2025 paper "Measuring What Matters: A Framework for Evaluating Safety Risks in Real-Wo...

Doubly Stochastic Attention for Transformers

10 Nov 2025

Contributed by Lukas

The four papers we review dated from 1967 up to two papers in 2025 collectively discuss the mathemat...

Random Walk Methods for Graph Learning and Networks

10 Nov 2025

Contributed by Lukas

We provide a review of the evolution of value of PageRank to Random Walk with Random Restart and it...

AlphaEvolve: Mathematical Discovery at Scale

10 Nov 2025

Contributed by Lukas

The November 3, 2025 paper provides an overview of the **AlphaEvolve** system, an AI-powered evolutio...

AdaFlow: Variance-Adaptive Flow-Based Imitation Learning

10 Nov 2025

Contributed by Lukas

The November 22, 2024 paper from UT Austin introduces **AdaFlow**, a novel imitation learning framewo...

zFLoRA: Zero-Latency Fused Low-Rank Adapters

04 Nov 2025

Contributed by Lukas

The October 28, 2025 Samsung research paper introduces **zFLoRA (zero-latency fused low-rank adapter...

SuperBPE: Space Travel for Language Models

04 Nov 2025

Contributed by Lukas

The August 26, 2025 collaboration between the University of Washington, NVIDIA and the Allen Institu...

Google: Supervised Reinforcement Learning for Step-wise Reasoning in LLMs

04 Nov 2025

Contributed by Lukas

The October 29, 2025 Google research paper introduces **Supervised Reinforcement Learning (SRL)**, a ...

MorphKV: Constant-Sized KV Caches for LLM Inference

04 Nov 2025

Contributed by Lukas

The June 7, 2025 UT Austin and University of British Columbia collaboration academic paper introduce...

HALoS: Hierarchical Asynchronous LLM Training over Slow Networks

04 Nov 2025

Contributed by Lukas

The June 5, 2025 research paper introducing **HALoS: Hierarchical Asynchronous Local SGD**, a novel ...

Anchored Diffusion Language Model: Superior Generation and Reasoning

04 Nov 2025

Contributed by Lukas

The May 24, 2025 UT Austin paper introduces the **Anchored Diffusion Language Model (ADLM)**, a nove...

Gumbel-Softmax for Differentiable Categorical Reparameterization and Selective Networks

04 Nov 2025

Contributed by Lukas

These two papers (years 2017, 2022) introduce and then apply the **Gumbel-Softmax distribution** as ...

PolicySmith: Automated Systems Heuristic Generation via LLMs

04 Nov 2025

Contributed by Lukas

The October 9, 2025 paper from UT Austin introduces **PolicySmith**, a novel framework that au...

RetNet: Retentive Networks: Transformer Successor for Large Language Models

02 Nov 2025

Contributed by Lukas

The August 9, 2023 paper introduces the **Retentive Network (RetNet)**, a proposed foundational arch...

Kimi Linear: Efficient Expressive Attention Architecture

02 Nov 2025

Contributed by Lukas

The October 30, 2025 **technical report** details the development and evaluation of **Kimi Linear**,...

ALiBi: Attention with Linear Biases Enables Length Extrapolation

01 Nov 2025

Contributed by Lukas

The April 22, 2022 collaboration between University of Washington, Facebook AI and the Allen Institu...

Quest: Query-Aware Sparsity for Efficient LLM Inference

31 Oct 2025

Contributed by Lukas

The August 26, 2024 academic paper introduces **Quest**, a novel algorithm designed to improve the i...

Flash-LLM: Efficient LLM Inference with Unstructured Sparsity on Tensor Cores

31 Oct 2025

Contributed by Lukas

The September 19, 2025 Alibaba paper introduces **Flash-LLM**, a novel software framework designed t...

ELASTIC: Linear Attention for Sequential Interest Compression

31 Oct 2025

Contributed by Lukas

The February 12, 2025 KuaiShou Inc paper introduces **ELASTIC**, an Efficient Linear Attention for S...

Anthropic: Introspective Awareness in LLMs

31 Oct 2025

Contributed by Lukas

On October 29, 2025 Anthropic presented research investigating the existence of **functional introsp...

Small Versus Large Models for Requirements Classification

31 Oct 2025

Contributed by Lukas

The October 24, 2025 collaboration between many universities has published a paper that compares th...

Hyper-Scaling LLM Inference with KV Cache Compression

31 Oct 2025

Contributed by Lukas

The June 5, 2025 paper, a collaboration between the University of Edinburgh and NVIDIA, introduces the conce...

Architectural Scaling Laws for Efficient LLMs

31 Oct 2025

Contributed by Lukas

The October 21, 2025 collaboration paper between UW-Madison and Amazon Web Services discusses the crit...

ATTENTION2D and lean attention: Distributed Self-Attention

29 Oct 2025

Contributed by Lukas

We cover two new innovations from Microsoft extending ideas from the original old **FlashAttention**...

Sentence-BERT: Siamese Networks for Sentence Embeddings

29 Oct 2025

Contributed by Lukas

The provided text introduces **Sentence-BERT (SBERT)**, a modification of the popular **BERT** and *...

TxGNN: Foundation Model for Zero-Shot Drug Repurposing

29 Oct 2025

Contributed by Lukas

The source provides excerpts from a scientific paper introducing **TxGNN**, a novel graph foundation...

STAR: Sub-Entry Sharing TLB for Multi-Instance GPU Efficiency

26 Oct 2025

Contributed by Lukas

This April 29, 2024 paper provides an overview of the challenges associated with using **NVIDIA's M...

Strata: Efficient Hierarchical Context Caching for LLM Serving

26 Oct 2025

Contributed by Lukas

The August 26, 2025 collaboration between Stanford, NVIDIA, Shanghai Jiao Tong University, Universit...

FlashAttention: IO-Aware Fast and Memory-Efficient Attention

26 Oct 2025

Contributed by Lukas

This is a classic review of a now old yet still important paper, the original Flash Attention pa...

Introducing MTEB v2: Multimodal Embedding Evaluation

26 Oct 2025

Contributed by Lukas

On October 20, 2025 Hugging Face released **MTEB v2**, a significant refactoring of the Massive Text...

Structural Understanding of LLM Overthinking

26 Oct 2025

Contributed by Lukas

The October 10, 2025 paper from the University of Michigan and **Google DeepMind** concerning the ph...

Stuck in the Matrix: LLM Spatial Reasoning

26 Oct 2025

Contributed by Lukas

The October 23, 2025 research paper **probes the spatial reasoning capabilities of Large Language Mod...

LLM-Empowered Knowledge Graph Construction: A Survey

26 Oct 2025

Contributed by Lukas

This October 23, 2025 Xidian University academic survey systematically reviews the transformative im...

Survey of Emerging Topics in AI and Robotics

26 Oct 2025

Contributed by Lukas

The October 23, 2025 collaboration between UC San Diego, NVIDIA, Meta, UW-Madison, and UNC intro...

The Free Transformer: VAE Extension for Decoders

26 Oct 2025

Contributed by Lukas

The October 20, 2025 Meta FAIR paper introduces the **Free Transformer**, an innovative extension of...

LithOS: Operating System for Efficient GPU Machine Learning

26 Oct 2025

Contributed by Lukas

This 2025 CMU paper introduces **LithOS**, a novel operating system designed to improve the efficien...

Ring-linear: Efficient Hybrid Architecture for Long-Context Reasoning

26 Oct 2025

Contributed by Lukas

This October 23, 2025 technical report from the Ling Team introduces the **Ring-linear model series*...

GigaBrain-0: World Model-Powered Generalist Robots

26 Oct 2025

Contributed by Lukas

The October 22, 2025 GigaAI paper introduces **GigaBrain-0**, a novel Vision-Language-Action (VLA) m...

Open-o3 Video: Spatio-Temporal Grounded Reasoning

26 Oct 2025

Contributed by Lukas

The October 25, 2025 Bytedance paper introduces **Open-o3 Video**, a novel framework developed by re...

Cattell–Horn–Carroll Theory of Intelligence

26 Oct 2025

Contributed by Lukas

We review the Cattell-Horn-Carroll (CHC) theory used in recent AI papers on the definition of what AGI coul...

Internal Mechanisms of a Large Language Model

26 Oct 2025

Contributed by Lukas

This March 27, 2025 Anthropic paper provides an overview and detailed excerpts from two related Anth...

Latent Constituency in Humans and LLMs

26 Oct 2025

Contributed by Lukas

The provided text is an academic paper titled **"Active Use of Latent Constituency Representation in...

Cognitive Impact of AI and Search on Essay Writing

26 Oct 2025

Contributed by Lukas

The June 2025 paper presents excerpts from a study examining the **cognitive and performance differe...

LFM2-8B-A1B: Efficient On-Device Mixture-of-Experts

26 Oct 2025

Contributed by Lukas

The October 7, 2025 technical release by Liquid AI introducing their new model, **LFM2-8B-A1B**, an ...

MASA: Meta-Awareness via Self-Alignment Reinforcement Learning

26 Oct 2025

Contributed by Lukas

The September 26, 2025 paper introduces a novel reinforcement learning framework called **Meta-Aware...

LLMs Learning from Verbal Feedback Without Scalar Rewards

26 Oct 2025

Contributed by Lukas

The September 25, 2025 collaboration between Sea AI Lab, SUTD, NUS, NTU and University of Waterloo p...

Lp-Reg: Low-Probability Tokens Sustain RL Exploration

26 Oct 2025

Contributed by Lukas

The October 3, 2025 paper by Tencent introduces a reinforcement learning technique called **Low-prob...

REFRAG: v2 paper: Efficient RAG Decoding via Context Compression

22 Oct 2025

Contributed by Lukas

The Meta Superintelligence Labs team in collaboration with Rice University and National University o...

RoBERTa: Robustly Optimized BERT Pretraining Approach

22 Oct 2025

Contributed by Lukas

The July 2019 paper introduces **RoBERTa**, a **robustly optimized BERT pretraining approach**, whic...

LightMem: Lightweight Efficient Memory-Augmented Generation

22 Oct 2025

Contributed by Lukas

The October 21, 2025 academic paper introduces **LightMem**, a novel and efficient memory-augmented ...

RAG-Anything: Unified Multimodal Knowledge Retrieval Framework

22 Oct 2025

Contributed by Lukas

The October 14, 2025 paper introduces **RAG-Anything**, a novel and unified framework for **Retrieva...

Elastic-Cache: Adaptive KV Caching for Diffusion LLMs

22 Oct 2025

Contributed by Lukas

The October 16, 2025 academic paper introduces **Elastic-Cache**, an innovative, training-free strat...

LLM-Guided Hierarchical Retrieval: The LATTICE Framework

22 Oct 2025

Contributed by Lukas

The October 15, 2025 paper details a novel information retrieval framework called **LATTICE**, which...

In-Context Learning as Implicit Learning Algorithms

22 Oct 2025

Contributed by Lukas

The May 17, 2023 academic paper explores the nature of **in-context learning (ICL)** in neural seque...

Dr.LLM: Dynamic Layer Routing in LLMs

22 Oct 2025

Contributed by Lukas

The October 14, 2025 paper is an excerpt from a research paper introducing **Dr.LLM**, a novel, retr...

A Psychometric Framework for Artificial General Intelligence

22 Oct 2025

Contributed by Lukas

This large collaboration between 29 different institutions proposes a quantifiable framework for def...

EssenceBench: Compressing LLM Benchmarks via Redundancy and Genetic Algorithm

22 Oct 2025

Contributed by Lukas

The October 12, 2025 paper introduces **EssenceBench**, a novel methodology for **compressing large ...

Inheritune: Efficient LLM Training via Attention Collapse

22 Oct 2025

Contributed by Lukas

This June 8, 2025 paper, a collaboration between University of Texas and NYU, describes a newly identifi...

Structural Understanding of LLM Overthinking

22 Oct 2025

Contributed by Lukas

The October 10, 2025 academic paper from Google DeepMind and the University of Michigan investigates...

Geometric Flows of Logic in LLM Representation Space

18 Oct 2025

Contributed by Lukas

The October 10, 2025 Duke University academic paper introduces a **novel geometric framework** that ...

Mojo: Performance-Portable HPC Kernels on GPUs

18 Oct 2025

Contributed by Lukas

The September 25, 2025 academic paper **evaluates the performance and portability** of the novel **Mo...

Scaling Reinforcement Learning Compute for LLMs

17 Oct 2025

Contributed by Lukas

This October 15, 2025 collaboration between Meta, UT Austin, UCL, UC Berkeley, Harvard University, a...
