Dwarkesh Podcast

Adam Marblestone – AI is missing something fundamental about the brain

12/30/2025

The Quadrillion-Dollar Question

Why do LLMs need oceans of data to achieve a fraction of human capability? We’ve obsessed over architecture, but we might be looking at the wrong map. The secret isn't the wires—it's the reward.

Evolution as a Python Script

Machine learning loves mathematically elegant loss functions. Predict the next token. Minimize cross-entropy. It’s clean, it's simple, and it's probably not how we work.

My hunch? Evolution built immense complexity into our loss functions, not just our architecture. Imagine a massive, ancient codebase—thousands of lines of "Python code" generating a specific curriculum for every stage of development. Evolution has seen what works for millions of years; it doesn't leave the learning to chance. It encodes the knowledge of the curriculum itself.
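To make the metaphor concrete, here is a minimal Python sketch of what that "codebase" could look like. Everything in it is illustrative: the reward names, the stage labels, and the stage-switching curriculum are assumptions for the sake of the analogy, not claims about actual genetics.

```python
# Hypothetical sketch: the genome stores reward terms and a curriculum
# schedule, not learned weights. All names here are illustrative.

INNATE_REWARDS = {
    "taste_of_sugar":  lambda obs: +1.0 if obs.get("sweet") else 0.0,
    "spider_flinch":   lambda obs: -5.0 if obs.get("fast_legged_shape") else 0.0,
    "joint_attention": lambda obs: +0.5 if obs.get("eye_contact") else 0.0,
}

CURRICULUM = [
    ("infant",     ["taste_of_sugar", "joint_attention"]),
    ("child",      ["joint_attention", "spider_flinch"]),
    ("adolescent", ["spider_flinch"]),
]

def lifetime_loss(observation, stage):
    """Sum the reward terms that the 'genome' switches on at this stage."""
    active = dict(CURRICULUM)[stage]
    return -sum(INNATE_REWARDS[name](observation) for name in active)

# The learned weights never appear here: the genome only specifies *what to
# reward and when*; a generic learning system does the rest.
print(lifetime_loss({"sweet": True, "eye_contact": True}, "infant"))  # -1.5
```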

The World Model

The Cortex

An omnidirectional prediction engine. Unlike an LLM that only looks forward, the cortex is natively designed to fill in any blank. It predicts vision from audition, muscle tension from abstract thoughts. It’s the ultimate association machine.

The Evolutionary Hard-Drive

The Steering Subsystem

The "Lizard Brain." It has its own primitive sensors (like the superior colliculus) to detect faces or movement instantly, triggering reflexes like shame or fear before the cortex even knows why.

"How does the brain encode high-level desires? Evolution never saw Jan LeCun or a podcast. How does it know to make me feel 'shame' if I misinterpret his energy-based models?"

"It’s about Predictive Steering. The cortex learns to predict when the 'Lizard Brain' is about to flinch. When you hear 'there is a spider on your back,' your cortex generalizes the concept and triggers the subcortical alarm. You've wired an abstract word to an ancient reflex."

"We aren't just predicting the next token; we are predicting our own biological reactions to the world."

The AlphaZero Efficiency Paradox

I ran a "Vibe Coding" experiment with Gemini. We tested a radical idea: Is it better to train one massive agent, or a diverse population of "smaller" agents with the same total budget?

"The best agent in a population of 16—having received only 1/16th of the compute—outperformed the single agent hogging the entire budget."

Data represents relative win-rate in multi-agent self-play environments.
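The real experiment was vibe-coded with Gemini, so the snippet below is only a toy re-creation of the comparison, not the experiment itself: "skill" is simulated as diminishing returns on compute plus run-to-run noise, and whether the population wins depends on exactly that trade-off.

```python
import random

random.seed(0)
TOTAL_COMPUTE = 16_000          # arbitrary units of training steps

def train(steps, seed):
    """Toy model of training: skill grows with compute, but each run lands
    in a different basin, so there is a large random component."""
    rng = random.Random(seed)
    return 0.3 * (steps ** 0.5) + rng.gauss(0, 20)   # skill score

# Condition A: one agent gets the entire budget.
solo = train(TOTAL_COMPUTE, seed=0)

# Condition B: 16 agents each get 1/16th; keep the best of the population.
population = [train(TOTAL_COMPUTE // 16, seed=i) for i in range(16)]
best = max(population)

print(f"solo agent skill:      {solo:.1f}")
print(f"best of 16 population: {best:.1f}")
# Whether the population wins depends on how large run-to-run variance is
# relative to the returns on extra compute; that is the empirical question
# the experiment was probing.
```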

Next: Amortized Inference & The Genome

The Amortization Shift

If the brain's secret sauce is its reward function, how does it actually execute intelligence in real-time? We're moving from slow, Bayesian "sampling" to the lightning-fast "amortized inference" that defines both modern AI and biological perception.

DWARKESH

Right now, models map input to output. But real intelligence is a prior over how the world could be. To calculate every possible cause is computationally intractable. You’d have to sample forever. Is "amortized inference" just skipping the sampling?

ADAM

Exactly. Bayesian inference is perception’s biggest headache. Neural networks don't start from scratch every time; they bake that "cause-to-observation" logic directly into the feed-forward pass. You don't sample; you just know.
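A small numpy sketch of the difference, using a one-dimensional Gaussian toy problem where the exact posterior is known. Option 1 samples causes afresh for every observation; option 2 "amortizes" by fitting a feed-forward mapping from observation to posterior mean once, then answering instantly. The model and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generative model: cause z ~ N(0, 1); observation x = z + noise, noise ~ N(0, 0.5^2).
def simulate(n):
    z = rng.standard_normal(n)
    x = z + 0.5 * rng.standard_normal(n)
    return z, x

# --- Option 1: sample per observation (importance sampling from the prior) ---
def posterior_mean_by_sampling(x_obs, n_samples=10_000):
    z = rng.standard_normal(n_samples)              # draw candidate causes
    w = np.exp(-0.5 * ((x_obs - z) / 0.5) ** 2)     # likelihood weights
    return np.sum(w * z) / np.sum(w)                # weighted average = posterior mean

# --- Option 2: amortize: fit a feed-forward mapping x -> E[z | x] once ---
z_train, x_train = simulate(100_000)
slope = np.dot(x_train, z_train) / np.dot(x_train, x_train)   # least squares

x_obs = 1.3
print("sampled  :", round(posterior_mean_by_sampling(x_obs), 3))
print("amortized:", round(slope * x_obs, 3))
print("exact    :", round(x_obs * 1.0 / (1.0 + 0.25), 3))      # analytic answer
```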

The Genome Bottleneck

There is a massive mystery in biology: if you want to analogize evolution to pre-training, how do you explain the fact that so little information is conveyed through the genome? We’re talking about **three gigabytes**. That’s the total size of the human genome. A tiny fraction of that codes for the brain.

If you were trying to "hard-code" the weights of a trillion-parameter model into 3GB of disk space, you’d fail instantly. So, what is evolution actually storing?

The answer: **The Loss Function.**

In Python, a reward function is literally a line of code. You can have a thousand lines specifying "spider-flinch" or "social-bonding," and it takes up almost no space. Evolution didn't find the weights; it found the hyperparameters and the reward signals that force the brain to learn those weights during a lifetime.
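The arithmetic behind the bottleneck, with illustrative numbers (a trillion parameters at fp16, and a generous one byte per base pair for the genome):

```python
# Back-of-the-envelope check on the storage gap (numbers are illustrative).
genome_bytes    = 3.2e9     # ~3.2 billion base pairs, ~1 byte each (generous)
params          = 1e12      # a "trillion-parameter" model
bytes_per_param = 2         # fp16

weights_bytes = params * bytes_per_param   # 2e12 bytes, roughly 2 TB
print(f"weights need ~{weights_bytes / genome_bytes:.0f}x the whole genome")

# A reward rule, by contrast, is tiny:
reward_spec = "if fast_legged_shape_on_skin: reward -= 5.0"
print(f"one innate reward rule: {len(reward_spec)} bytes")
```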

Data Point

3.2 GB

Total Human Genome Size

The Insight

"The reward function is compact. The learning subsystem is a generic eight-layer transformer replicated a million times."

The Diversity of the "Steering" Brain

New single-cell atlases reveal a stark contrast: the cortex (learning) is repetitive and uniform, while the steering regions (reward) are a diverse zoo of bespoke cell types.

In the Learning Subsystem, you're just replicating layers. The "Python code" to make an 8-layer transformer isn't much longer than a 3-layer one. It's scalable and repetitive.

But in the Steering Subsystem, there's a "gazillion" weird cell types. One for the spider-flinch, one for the taste of salt, one for maternal bonding. These are innately wired circuits. They don't learn; they direct learning.

"The cortex doesn't know about spiders. It just knows about layers and gradients. The steering subsystem is where all the bespoke species-specific 'crap' lives."

The Evolutionary Pivot

"We didn't invent a better brain, we just found a better incentive to grow it."

Why did the hominid brain explode in size? It wasn't a breakthrough in cortical architecture. A mouse's cortex and a human's cortex are remarkably similar.

The unlock was **Social Learning**. Evolution tweaked the reward function to value joint eye contact, linguistic cues, and elder-imitation. This increased the "returns" on having a bigger cortex. Once the reward function prioritized social data, the scaling laws took over.

Next: We dive into how this feedback loop differentiates Model-based vs Model-free RL in the human mind.

Previously: Amortized inference and genomic storage

The "Dumb" State of RL

Why current Large Language Models are using the most primitive forms of learning—and still somehow winning.

"It’s kind of crazy that this is working."

Right now, when we train LLMs, if they solve a math problem or pass a unit test, we just up-weight the entire trajectory of tokens. It’s brute force. Even Ilya Sutskever pointed this out on the podcast—it’s weird that we don’t use Value Functions.

Think back ten years to the Atari-playing AI. That was using Q-learning. It had a sense of the long-run consequence of an action. Modern LLMs? They’re optimized for GPUs, not for the conceptual elegance of Reinforcement Learning. We're using the "dumbest" form of RL, yet we’re seeing incredible results.
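Schematically, the two credit-assignment schemes look like this (textbook forms, with all token-level detail stripped away):

```python
# (a) Outcome-only upweighting: every step in a successful rollout gets the
#     same scalar credit, regardless of which steps actually mattered.
def trajectory_upweight(grad_logprobs, passed_check):
    advantage = 1.0 if passed_check else 0.0
    return [advantage * g for g in grad_logprobs]

# (b) TD(0) value learning, the Atari-era style of long-run credit assignment:
#     each step is judged against a learned expectation of future reward.
def td_update(value, state, next_state, reward, gamma=0.99, lr=0.1):
    delta = reward + gamma * value[next_state] - value[state]   # prediction error
    value[state] += lr * delta
    return delta

V = {"s0": 0.0, "s1": 0.0}
print(td_update(V, "s0", "s1", reward=1.0))   # 1.0: better than expected
```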

"NEUROSCIENCE SHOULD BE THE GROUND TRUTH."

The Dual-System Brain

1. The Basal Ganglia (Model-Free)

This is the "dumb" RL in our heads. It has a finite, small action space. It tells the spinal cord: Do this motor action. Yes or no? It's simple, naive, and incredibly fast.

2. The Cortex (Model-Based)

This is the high-order stuff. It builds a world model. It doesn't just react; it predicts. It asks: What types of plans lead to reward in these specific circumstances? It's RL as inference—clamping the "High Reward" variable and sampling the plan that gets us there.

Dopamine & Prediction Error

Neuroscience shows dopamine isn't just "reward"—it's a Reward Prediction Error (RPE) signal. It's the gap between expectation and reality.
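In its standard temporal-difference form, the error that phasic dopamine responses are usually compared against is

```latex
\delta_t = r_t + \gamma\, V(s_{t+1}) - V(s_t)
```

positive when the world turns out better than the current value estimate predicted, negative when it turns out worse.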

Tangential Thought

Culture as a Model-Free Algorithm

Think about Joe Henrich's work on cultural evolution. How does a society figure out that a specific bean is poisonous unless you perform a 10-step cleaning process? No single person sat down and "modeled" the chemical toxicity.

"Culture is like model-free RL happening at a civilizational level. Evolution is the simplest algorithm, and if we believe all of this—us—came from evolution, then simple algorithms can get you anything if you run them long enough."

We have this hierarchy of "Model-Free" vs "Model-Based" systems stacked on top of each other:

Evolution: Model-Free (The Outer Loop)
Basal Ganglia: Model-Free (Motor/Habit)
Cortex: Model-Based (World Modeling)
Culture: Model-Free (Generational Knowledge)

Partner Note

Training Real-World Agents

Speaking of model-free culture and tacit knowledge—some things you just can't learn from a manual. Labelbox provides the expertise and scaffolding to capture that "underwriter’s intuition" for your AI agents.

Learn more at labelbox.com/dwarkesh
Next: Is biological hardware a limitation or an advantage? →

The Hardware Paradox

We've spent a lot of time talking about the software of the mind—model-based vs. model-free RL. But what happens when that algorithm is inscribed in meat instead of silicon? Are we smarter because of our biological limitations, or in spite of them?

The Energy Budget

20 Watts.

The brain runs on the power of a dim lightbulb at 200 Hz. To survive, it evolved extreme "unstructured sparsity" and co-located memory and compute.

The Copy Problem

Immutable.

You can't "read-write" a neuron. I can't copy my weight matrix into your head. This lack of random access is a massive biological "fuck you" to scalability.

Cognitive Dexterity

Sampling.

Neurons are naturally stochastic. While Python needs a random number generator, the brain just "is" probabilistic. It's built for inference.
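A minimal sketch of the "noise as a feature" idea: a binary unit that fires with probability sigmoid(drive) is already a sampler, and averaging its flickers recovers a graded belief. The single-unit setup and the drive value are purely illustrative.

```python
import math
import random

random.seed(0)

def stochastic_neuron(drive):
    """Fires with probability sigmoid(drive): the noise itself is the sampler."""
    p_fire = 1.0 / (1.0 + math.exp(-drive))
    return 1 if random.random() < p_fire else 0

# Repeatedly 'flickering' the same unit yields samples from a belief rather
# than a single deterministic output; averaging recovers the graded value.
drive = 0.8   # net evidence for some hypothesis, in arbitrary units
samples = [stochastic_neuron(drive) for _ in range(10_000)]
print(sum(samples) / len(samples))   # close to sigmoid(0.8), about 0.69
```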

The "Kludge" vs. The Ghost in the Machine

Is the cellular machinery of the brain—all those genetic changes and molecular machines—actually doing algorithmic work, or is it just a messy way to implement weights?

Think about it: In a digital mind, you nudge a parameter. Easy. In a cell, to modulate a synapse according to a gradient signal, you have to talk to the nucleus, send signals back out... it's a massive logistics operation. I tend to think most of that cellular "crazy machinery" is just the infrastructure needed to make the synaptic learning work without a central controller.

But there are exceptions. Look at the cerebellum. It’s incredible at timing—like predicting exactly when a puff of air will hit your eye after a flash of light. It turns out the cell body itself might be storing those time constants. It’s not just a ring of synapses; the hardware is the clock.

"The best computational neuroscience theories we have were invented as AI models first."

— On the irony of reverse-engineering the brain

The AI Perspective

"Does a paperclip maximizer need a social brain? Can you have AGI without the human 'steering subsystem'?"

The Reality Check

"We already know from LLMs that you can learn language without eye contact. But to build spaceships? You need curiosity and exploration. Those are the 'drives' we need to align."

Some neuroscientists, like György Buzsáki, think we’re full of it. They argue that our AI vocabulary—"backprop," "weights," "layers"—is just a made-up language we're forcing onto the brain. They want a bottom-up vocabulary based on physical dynamical systems and oscillations.

I say: why not both? We should simulate the zebrafish from the bottom up, but we shouldn't ignore the fact that TD learning—an equation Sutton wrote on a whiteboard—actually shows up in dopamine signals. That's not just a coincidence; it's a map. And speaking of maps, if we really want to settle this debate, we're going to need to see exactly how everything is wired together...

Previously, we questioned if our biological hardware is a bottleneck. But to truly know, we need more than a hunch—we need the blueprint.

The Quest for the Connectome

If we had a perfect representation of the brain, why would it actually matter? It’s about moving from "black box" intuition to a language of architectures and learning rules.

"I feel like we don't really have an explanation of why LLMs are intelligent... we built them, we didn't interpret them. I want to describe the brain in that same language of architectures and hyperparameters."

Forget the "Golden Gate Bridge" Neuron

There is this obsession in interpretability research with finding the specific "circuit"—the exact cluster of neurons that encodes a single concept. But I think that’s a trap. If you train a neural network to predict stock prices or compute the digits of pi, it’s going to be doing incredibly complex computations internally that we might never fully "map" in the traditional sense.

Instead, what the Connectome gives us is a set of constraints. We don't need to know how the brain computes a specific bridge; we need to know if it's an energy-based model, a VAE, or something doing backprop. Is the wiring between the prefrontal cortex and the auditory cortex the same as the visual cortex?

"The problem is learning the basics by bespoke experiments takes an eternity. Getting a connectome is just... way more efficient."

Surprising Fact

There are more cell types in the hypothalamus than in the entire cortex.

The Cost of a Mouse Brain (Projected)

Scaling the tech: from billions of dollars to tens of millions through optical parallelization.

The Genome Analogy

The Human Genome Project cost $3 billion. Then, George Church and others changed the paradigm—moving from macro chemistry to parallelized microscopy—dropping the cost a million-fold. We're doing the same for the brain.

Optical vs. Electron

Electron microscopes slice tissue thin but lose molecular detail. Optical connectomics (E11's bet) uses photons to look at "fragile, gentle molecules"—giving us a molecularly annotated map, not just a physical one.

The "Practicality" Horizon

2027

The "Short Timeline" scenario. Connectomics might not be relevant yet; we're still riding the LLM wave.

5-10 Years

The transformative window. Transitioning from LLMs to brain-like, model-based RL architectures.

10+ Years

Complete "Brain Distillation"—using neural patterns as auxiliary loss functions to sculpt AI behavior.

Distill the Brain.

What if AI training wasn't just "cat vs. dog"? What if we added an auxiliary loss function that forced the AI to represent that cat the same way your visual cortex does? We're talking about brain-data augmented intelligence.
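A hedged sketch of what such an auxiliary loss could look like, using representational similarity (one common way to compare model features against neural recordings). The function names, the RDM-correlation choice, and the weighting term are assumptions for illustration, not a description of any existing pipeline.

```python
import numpy as np

def rdm(features):
    """Representational dissimilarity matrix: pairwise distances between the
    representations a system assigns to the same set of stimuli."""
    diffs = features[:, None, :] - features[None, :, :]
    return np.linalg.norm(diffs, axis=-1)

def brain_auxiliary_loss(model_feats, neural_feats):
    """Penalize the model when its similarity structure over stimuli diverges
    from the similarity structure measured in (say) visual cortex."""
    a = rdm(model_feats).ravel()
    b = rdm(neural_feats).ravel()
    return 1.0 - np.corrcoef(a, b)[0, 1]   # 0 when the two structures match

# Hypothetical usage: total_loss = task_loss + lam * brain_auxiliary_loss(...)
rng = np.random.default_rng(0)
model_feats = rng.random((6, 16))    # model activations for 6 images
neural_feats = rng.random((6, 32))   # recorded responses to the same images
print(round(brain_auxiliary_loss(model_feats, neural_feats), 3))
```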

The Ultimate Regularization

If we can map the brain to master AI, what happens when we turn that power toward the abstract?

Next: What value will automating math have?

We’ve been mapping the physical brain to understand its constraints. But there is another landscape we are starting to automate: the abstract, rigid world of mathematics. If the brain is the hardware of intelligence, math is its most verifiable software.

The Lean Revolution

Lean is a programming language that forces you to express math proofs in a way a computer can understand. It’s no longer about "trusting" a mathematician's pen-and-paper scribbles; it’s about a machine clicking Verify and knowing, with 100% certainty, that your conclusion follows from your assumptions.
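For a sense of what that looks like, here is a deliberately trivial machine-checked statement in Lean 4 (real formalizations run to thousands of lines, but the verification guarantee is the same):

```lean
-- Lean either accepts the proof term or rejects the file;
-- there is no "probably correct".
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```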

The Perfect Feedback Loop

Why does this matter for AI? Because it creates a perfect Reinforcement Learning (RL) signal. In the same way AlphaGo could play itself to become the best Go player in the world, an AI can now "play" math.

If a proof is mechanically verifiable, the AI knows exactly when it has succeeded. We are going to "RL the crap" out of math proving. It’s the transition from messy, probabilistic guesses to rigid, undeniable logic.

The "Moral Complexity" Loss Function

Can we automate creativity? Maybe. A "good" math conjecture is one that compresses information—it’s a powerful explanation that makes dozens of other theorems easier to prove. We can actually start to measure this "explanatory power."

CONJECTURE vs. PROOF

Proof is mechanical. Conjecturing is conceptual organization. We are shifting the human burden away from validating lemmas toward high-level strategy.

CYBERSECURITY UPSIDE

If you can prove the Riemann Hypothesis, you can prove a piece of software is unhackable. Provable, stable, secure software is the ultimate defense against AI-driven hacking.

"Quantity has a quality all of its own. We are moving toward automated cleverness."
Vibe Coding

Are we losing the "grounded intuition" of mechanics? If you never learn assembly, do you really understand the machine? Or does the faster feedback loop make you a more powerful architect?

The Outsider Physicist

Just as Steve Byrnes synthesizes neuroscience without being in a lab, we might see "outsider string theorists." If the machine handles the math, the barrier to entry for brilliant ideas drops to zero.

The Future AI Civilization

Imagine a world where AGI is still 10 years away, but we have billions of "automated cleverness" instances running. How do they collaborate? They can't just share "neuron activations"—that's a black box.

The only way a future AI civilization can scale is through a universal, provable language. If every step of an argument is mechanically verifiable, the "Jupiter Brains" can build on each other's work without fear of exploitation or social influence.

We might be moving back to symbolic methods, not because neural nets failed, but because we finally have enough "cleverness" to make symbolism work at scale. We are building the safeguarded world models of the future, specified in equations, not just weights.

As Terry Tao suggests, we aren't just proving one theorem at a time anymore. We are studying the landscape of *all* possible proved theorems—the aggregate set of what is knowable.

Up Next

If math is the software of the universe, what is the specific architecture of the biological machine that first discovered it?

Next: Architecture of the brain →

The Meatware Architecture

We just talked about automating mathematics, but what’s the substrate doing? If we’re building world models in silicon, we have to ask: how does the biological original actually represent reality?

Is the brain a "Symbolic Language" or just a hidden state?

When we talk about symbolic representation, I’m not just asking about function. I want to know if the brain holds something analogous to a neural network's hidden state, or if it’s closer to a formal language. The truth is, we don’t really know. We see "face patch" neurons that handle geometry in vision, and we see "place cells" in a rodent’s hippocampus creating spatial maps.

"My hunch? It’s going to be a huge mess. I don’t expect it to be pretty in there. It’s likely not a symbolic language, but a chaotic intersection of architectures, loss functions, and learning rules."

"It might even involve new physics."

— On the mystery of conscious experience

The Continual Learning Problem

In backprop, we freeze the weights. In the brain, the hippocampus is constantly "replaying" memories to the cortex—a living system consolidation. It’s a multi-timescale plasticity that we haven't quite cracked in AI yet.
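The closest standard ML analogue is experience replay, sketched minimally below: new examples are interleaved with replayed old ones before reaching the slow learner, which is the usual trick for reducing catastrophic forgetting. The class and its parameters are illustrative, not a model of the hippocampus.

```python
import random

random.seed(0)

class ReplayConsolidation:
    """Toy analogue of hippocampal replay: new experiences enter a fast buffer,
    and the slow 'cortical' learner is trained on a shuffled mix of fresh and
    replayed old examples instead of the new data alone."""

    def __init__(self, capacity=1000):
        self.buffer = []
        self.capacity = capacity

    def store(self, example):
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(random.randrange(len(self.buffer)))
        self.buffer.append(example)

    def consolidation_batch(self, new_examples, replay_ratio=0.5):
        k = int(len(new_examples) * replay_ratio)
        replayed = random.sample(self.buffer, min(k, len(self.buffer)))
        batch = list(new_examples) + replayed
        random.shuffle(batch)          # interleave old and new, as replay does
        return batch

memory = ReplayConsolidation()
for old in range(100):
    memory.store(("old", old))
print(memory.consolidation_batch([("new", i) for i in range(4)]))
```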

Fast Weights

Is there a biological KV Cache? We have weights and activations, but the way the thalamus gates information suggests a level of "attention" that might make Transformers look simple.
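One candidate formalization is the "fast weights" idea (in the spirit of Ba et al., 2016): a rapidly decaying outer-product matrix that acts as a short-lived associative store alongside the slow learned weights. The sketch below is schematic, not a claim about thalamic circuitry.

```python
import numpy as np

class FastWeights:
    """Outer-product fast-weight memory: matrix A is written and decayed on a
    much faster timescale than the slow weights, loosely like a KV cache."""

    def __init__(self, dim, decay=0.95, lr=0.5):
        self.A = np.zeros((dim, dim))
        self.decay, self.lr = decay, lr

    def write(self, key, value):
        self.A = self.decay * self.A + self.lr * np.outer(value, key)

    def read(self, query):
        return self.A @ query

mem = FastWeights(dim=4)
k = np.array([1.0, 0.0, 0.0, 0.0])
v = np.array([0.0, 2.0, 0.0, 0.0])
mem.write(k, v)
print(mem.read(k))   # recovers the stored value, scaled by lr
```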

Mapping the Biological Gap

The "Gap Map" & Mini Hubbles

We've been incubating Focused Research Organizations (FROs)—non-profit moonshots for science. When you talk to scientists, they don't just need "more research." They need infrastructure.

I call these "Mini Hubble Space Telescopes." They aren't discoveries themselves, but engineering feats that lift all boats. We've mapped out a few hundred of these fundamental capabilities—from connectomics to math-proving infrastructure.

Visualizing Scientific Infrastructure Gaps

DWARKESH

I thought mathematicians just needed whiteboards?

ADAM

I did too! But it turns out even math needs scale. They need Lean, they need verifiable programming languages. We need scale in every domain of science now.

Scale is the missing ingredient, from the neurons in our heads to the proofs on our screens.
