Dwarkesh Podcast

Adam Marblestone – AI is missing something fundamental about the brain

12/30/2025

The Quadrillion-Dollar Question

Why do LLMs need oceans of data to achieve a fraction of human capability? We’ve obsessed over architecture, but we might be looking at the wrong map. The secret isn't the wires—it's the reward.

Evolution as a Python Script

Machine learning loves mathematically elegant loss functions. Predict the next token. Minimize cross-entropy. It’s clean, it's simple, and it's probably not how we work.

My hunch? Evolution built immense complexity into our loss functions, not just our architecture. Imagine a massive, ancient codebase—thousands of lines of "Python code" generating a specific curriculum for every stage of development. Evolution has seen what works for millions of years; it doesn't leave the learning to chance. It encodes the knowledge of the curriculum itself.
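To make the metaphor concrete, here is a minimal Python sketch of what that "codebase" could look like. Everything in it is illustrative: the reward names, the stage labels, and the stage-switching curriculum are assumptions for the sake of the analogy, not claims about actual genetics.

```python
# Hypothetical sketch: the genome stores reward terms and a curriculum
# schedule, not learned weights. All names here are illustrative.

INNATE_REWARDS = {
    "taste_of_sugar":  lambda obs: +1.0 if obs.get("sweet") else 0.0,
    "spider_flinch":   lambda obs: -5.0 if obs.get("fast_legged_shape") else 0.0,
    "joint_attention": lambda obs: +0.5 if obs.get("eye_contact") else 0.0,
}

CURRICULUM = [
    ("infant",     ["taste_of_sugar", "joint_attention"]),
    ("child",      ["joint_attention", "spider_flinch"]),
    ("adolescent", ["spider_flinch"]),
]

def lifetime_loss(observation, stage):
    """Sum the reward terms that the 'genome' switches on at this stage."""
    active = dict(CURRICULUM)[stage]
    return -sum(INNATE_REWARDS[name](observation) for name in active)

# The learned weights never appear here: the genome only specifies *what to
# reward and when*; a generic learning system does the rest.
print(lifetime_loss({"sweet": True, "eye_contact": True}, "infant"))  # -1.5
```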

The World Model

The Cortex

An omnidirectional prediction engine. Unlike an LLM that only looks forward, the cortex is natively designed to fill in any blank. It predicts vision from audition, muscle tension from abstract thoughts. It’s the ultimate association machine.

The Evolutionary Hard-Drive

The Steering Subsystem

The "Lizard Brain." It has its own primitive sensors (like the superior colliculus) to detect faces or movement instantly, triggering reflexes like shame or fear before the cortex even knows why.

"How does the brain encode high-level desires? Evolution never saw Jan LeCun or a podcast. How does it know to make me feel 'shame' if I misinterpret his energy-based models?"

"It’s about Predictive Steering. The cortex learns to predict when the 'Lizard Brain' is about to flinch. When you hear 'there is a spider on your back,' your cortex generalizes the concept and triggers the subcortical alarm. You've wired an abstract word to an ancient reflex."

"We aren't just predicting the next token; we are predicting our own biological reactions to the world."

The AlphaZero Efficiency Paradox

I ran a "Vibe Coding" experiment with Gemini. We tested a radical idea: Is it better to train one massive agent, or a diverse population of "smaller" agents with the same total budget?

"The best agent in a population of 16—having received only 1/16th of the compute—outperformed the single agent hogging the entire budget."

Data represents relative win-rate in multi-agent self-play environments.
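The real experiment was vibe-coded with Gemini, so the snippet below is only a toy re-creation of the comparison, not the experiment itself: "skill" is simulated as diminishing returns on compute plus run-to-run noise, and whether the population wins depends on exactly that trade-off.

```python
import random

random.seed(0)
TOTAL_COMPUTE = 16_000          # arbitrary units of training steps

def train(steps, seed):
    """Toy model of training: skill grows with compute, but each run lands
    in a different basin, so there is a large random component."""
    rng = random.Random(seed)
    return 0.3 * (steps ** 0.5) + rng.gauss(0, 20)   # skill score

# Condition A: one agent gets the entire budget.
solo = train(TOTAL_COMPUTE, seed=0)

# Condition B: 16 agents each get 1/16th; keep the best of the population.
population = [train(TOTAL_COMPUTE // 16, seed=i) for i in range(16)]
best = max(population)

print(f"solo agent skill:      {solo:.1f}")
print(f"best of 16 population: {best:.1f}")
# Whether the population wins depends on how large run-to-run variance is
# relative to the returns on extra compute; that is the empirical question
# the experiment was probing.
```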

Next: Amortized Inference & The Genome

The Amortization Shift

If the brain's secret sauce is its reward function, how does it actually execute intelligence in real-time? We're moving from slow, Bayesian "sampling" to the lightning-fast "amortized inference" that defines both modern AI and biological perception.

DWARKESH

Right now, models map input to output. But real intelligence is a prior over how the world could be. To calculate every possible cause is computationally intractable. You’d have to sample forever. Is "amortized inference" just skipping the sampling?

ADAM

Exactly. Bayesian inference is perception’s biggest headache. Neural networks don't start from scratch every time; they bake that "cause-to-observation" logic directly into the feed-forward pass. You don't sample; you just know.
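A small numpy sketch of the difference, using a one-dimensional Gaussian toy problem where the exact posterior is known. Option 1 samples causes afresh for every observation; option 2 "amortizes" by fitting a feed-forward mapping from observation to posterior mean once, then answering instantly. The model and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generative model: cause z ~ N(0, 1); observation x = z + noise, noise ~ N(0, 0.5^2).
def simulate(n):
    z = rng.standard_normal(n)
    x = z + 0.5 * rng.standard_normal(n)
    return z, x

# --- Option 1: sample per observation (importance sampling from the prior) ---
def posterior_mean_by_sampling(x_obs, n_samples=10_000):
    z = rng.standard_normal(n_samples)              # draw candidate causes
    w = np.exp(-0.5 * ((x_obs - z) / 0.5) ** 2)     # likelihood weights
    return np.sum(w * z) / np.sum(w)                # weighted average = posterior mean

# --- Option 2: amortize: fit a feed-forward mapping x -> E[z | x] once ---
z_train, x_train = simulate(100_000)
slope = np.dot(x_train, z_train) / np.dot(x_train, x_train)   # least squares

x_obs = 1.3
print("sampled  :", round(posterior_mean_by_sampling(x_obs), 3))
print("amortized:", round(slope * x_obs, 3))
print("exact    :", round(x_obs * 1.0 / (1.0 + 0.25), 3))      # analytic answer
```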

The Genome Bottleneck

There is a massive mystery in biology: if you want to analogize evolution to pre-training, how do you explain the fact that so little information is conveyed through the genome? We’re talking about **three gigabytes**. That’s the total size of the human genome. A tiny fraction of that codes for the brain.

If you were trying to "hard-code" the weights of a trillion-parameter model into 3GB of disk space, you’d fail instantly. So, what is evolution actually storing?

The answer: **The Loss Function.**

In Python, a reward function is literally a line of code. You can have a thousand lines specifying "spider-flinch" or "social-bonding," and it takes up almost no space. Evolution didn't find the weights; it found the hyperparameters and the reward signals that force the brain to learn those weights during a lifetime.
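The arithmetic behind the bottleneck, with illustrative numbers (a trillion parameters at fp16, and a generous one byte per base pair for the genome):

```python
# Back-of-the-envelope check on the storage gap (numbers are illustrative).
genome_bytes    = 3.2e9     # ~3.2 billion base pairs, ~1 byte each (generous)
params          = 1e12      # a "trillion-parameter" model
bytes_per_param = 2         # fp16

weights_bytes = params * bytes_per_param   # 2e12 bytes, roughly 2 TB
print(f"weights need ~{weights_bytes / genome_bytes:.0f}x the whole genome")

# A reward rule, by contrast, is tiny:
reward_spec = "if fast_legged_shape_on_skin: reward -= 5.0"
print(f"one innate reward rule: {len(reward_spec)} bytes")
```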

Data Point

3.2 GB

Total Human Genome Size

The Insight

"The reward function is compact. The learning subsystem is a generic eight-layer transformer replicated a million times."

The Diversity of the "Steering" Brain

New single-cell atlases reveal a stark contrast: the cortex (learning) is repetitive and uniform, while the steering regions (reward) are a diverse zoo of bespoke cell types.

In the Learning Subsystem, you're just replicating layers. The "Python code" to make an 8-layer transformer isn't much longer than a 3-layer one. It's scalable and repetitive.

But in the Steering Subsystem, there's a "gazillion" weird cell types. One for the spider-flinch, one for the taste of salt, one for maternal bonding. These are innately wired circuits. They don't learn; they direct learning.

"The cortex doesn't know about spiders. It just knows about layers and gradients. The steering subsystem is where all the bespoke species-specific 'crap' lives."

The Evolutionary Pivot

"We didn't invent a better brain, we just found a better incentive to grow it."

Why did the hominid brain explode in size? It wasn't a breakthrough in cortical architecture. A mouse's cortex and a human's cortex are remarkably similar.

The unlock was **Social Learning**. Evolution tweaked the reward function to value joint eye contact, linguistic cues, and elder-imitation. This increased the "returns" on having a bigger cortex. Once the reward function prioritized social data, the scaling laws took over.

Next: We dive into how this feedback loop differentiates Model-based vs Model-free RL in the human mind.

Previously: Amortized inference and genomic storage

The "Dumb" State of RL

Why current Large Language Models are using the most primitive forms of learning—and still somehow winning.

"It’s kind of crazy that this is working."

Right now, when we train LLMs, if they solve a math problem or pass a unit test, we just up-weight the entire trajectory of tokens. It’s brute force. Even Ilya Sutskever pointed this out on the podcast—it’s weird that we don’t use Value Functions.

Think back ten years to the Atari-playing AI. That was using Q-learning. It had a sense of the long-run consequence of an action. Modern LLMs? They’re optimized for GPUs, not for the conceptual elegance of Reinforcement Learning. We're using the "dumbest" form of RL, yet we’re seeing incredible results.
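Schematically, the two credit-assignment schemes look like this (textbook forms, with all token-level detail stripped away):

```python
# (a) Outcome-only upweighting: every step in a successful rollout gets the
#     same scalar credit, regardless of which steps actually mattered.
def trajectory_upweight(grad_logprobs, passed_check):
    advantage = 1.0 if passed_check else 0.0
    return [advantage * g for g in grad_logprobs]

# (b) TD(0) value learning, the Atari-era style of long-run credit assignment:
#     each step is judged against a learned expectation of future reward.
def td_update(value, state, next_state, reward, gamma=0.99, lr=0.1):
    delta = reward + gamma * value[next_state] - value[state]   # prediction error
    value[state] += lr * delta
    return delta

V = {"s0": 0.0, "s1": 0.0}
print(td_update(V, "s0", "s1", reward=1.0))   # 1.0: better than expected
```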

"NEUROSCIENCE SHOULD BE THE GROUND TRUTH."

The Dual-System Brain

1. The Basal Ganglia (Model-Free)

This is the "dumb" RL in our heads. It has a finite, small action space. It tells the spinal cord: Do this motor action. Yes or no? It's simple, naive, and incredibly fast.

2. The Cortex (Model-Based)

This is the high-order stuff. It builds a world model. It doesn't just react; it predicts. It asks: What types of plans lead to reward in these specific circumstances? It's RL as inference—clamping the "High Reward" variable and sampling the plan that gets us there.

Dopamine & Prediction Error

Neuroscience shows dopamine isn't just "reward"—it's a Reward Prediction Error (RPE) signal. It's the gap between expectation and reality.
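In its standard temporal-difference form, the error that phasic dopamine responses are usually compared against is

```latex
\delta_t = r_t + \gamma\, V(s_{t+1}) - V(s_t)
```

positive when the world turns out better than the current value estimate predicted, negative when it turns out worse.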

Tangential Thought

Culture as a Model-Free Algorithm

Think about Joe Henrich's work on cultural evolution. How does a society figure out that a specific bean is poisonous unless you perform a 10-step cleaning process? No single person sat down and "modeled" the chemical toxicity.

"Culture is like model-free RL happening at a civilizational level. Evolution is the simplest algorithm, and if we believe all of this—us—came from evolution, then simple algorithms can get you anything if you run them long enough."

We have this hierarchy of "Model-Free" vs "Model-Based" systems stacked on top of each other:

Evolution: Model-Free (The Outer Loop)
Basal Ganglia: Model-Free (Motor/Habit)
Cortex: Model-Based (World Modeling)
Culture: Model-Free (Generational Knowledge)

Partner Note

Training Real-World Agents

Speaking of model-free culture and tacit knowledge—some things you just can't learn from a manual. Labelbox provides the expertise and scaffolding to capture that "underwriter’s intuition" for your AI agents.

Learn more at labelbox.com/dwarkesh
Next: Is biological hardware a limitation or an advantage? →

The Hardware Paradox

We've spent a lot of time talking about the software of the mind—model-based vs. model-free RL. But what happens when that algorithm is inscribed in meat instead of silicon? Are we smarter because of our biological limitations, or in spite of them?

The Energy Budget

20 Watts.

The brain runs on the power of a dim lightbulb at 200 Hz. To survive, it evolved extreme "unstructured sparsity" and co-located memory and compute.

The Copy Problem

Immutable.

You can't "read-write" a neuron. I can't copy my weight matrix into your head. This lack of random access is a massive biological "fuck you" to scalability.

Cognitive Dexterity

Sampling.

Neurons are naturally stochastic. While Python needs a random number generator, the brain just "is" probabilistic. It's built for inference.
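A minimal sketch of the "noise as a feature" idea: a binary unit that fires with probability sigmoid(drive) is already a sampler, and averaging its flickers recovers a graded belief. The single-unit setup and the drive value are purely illustrative.

```python
import math
import random

random.seed(0)

def stochastic_neuron(drive):
    """Fires with probability sigmoid(drive): the noise itself is the sampler."""
    p_fire = 1.0 / (1.0 + math.exp(-drive))
    return 1 if random.random() < p_fire else 0

# Repeatedly 'flickering' the same unit yields samples from a belief rather
# than a single deterministic output; averaging recovers the graded value.
drive = 0.8   # net evidence for some hypothesis, in arbitrary units
samples = [stochastic_neuron(drive) for _ in range(10_000)]
print(sum(samples) / len(samples))   # close to sigmoid(0.8), about 0.69
```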

The "Kludge" vs. The Ghost in the Machine

Is the cellular machinery of the brain—all those genetic changes and molecular machines—actually doing algorithmic work, or is it just a messy way to implement weights?

Think about it: In a digital mind, you nudge a parameter. Easy. In a cell, to modulate a synapse according to a gradient signal, you have to talk to the nucleus, send signals back out... it's a massive logistics operation. I tend to think most of that cellular "crazy machinery" is just the infrastructure needed to make the synaptic learning work without a central controller.

But there are exceptions. Look at the cerebellum. It’s incredible at timing—like predicting exactly when a puff of air will hit your eye after a flash of light. It turns out the cell body itself might be storing those time constants. It’s not just a ring of synapses; the hardware is the clock.

"The best computational neuroscience theories we have were invented as AI models first."

— On the irony of reverse-engineering the brain

The AI Perspective

"Does a paperclip maximizer need a social brain? Can you have AGI without the human 'steering subsystem'?"

The Reality Check

"We already know from LLMs that you can learn language without eye contact. But to build spaceships? You need curiosity and exploration. Those are the 'drives' we need to align."

Some neuroscientists, like György Buzsáki, think we’re full of it. They argue that our AI vocabulary—"backprop," "weights," "layers"—is just a made-up language we're forcing onto the brain. They want a bottom-up vocabulary based on physical dynamical systems and oscillations.

I say: why not both? We should simulate the zebrafish from the bottom up, but we shouldn't ignore the fact that TD learning—an equation Sutton wrote on a whiteboard—actually shows up in dopamine signals. That's not just a coincidence; it's a map. And speaking of maps, if we really want to settle this debate, we're going to need to see exactly how everything is wired together...

Previously, we questioned if our biological hardware is a bottleneck. But to truly know, we need more than a hunch—we need the blueprint.

The Quest for the Connectome

If we had a perfect representation of the brain, why would it actually matter? It’s about moving from "black box" intuition to a language of architectures and learning rules.

"I feel like we don't really have an explanation of why LLMs are intelligent... we built them, we didn't interpret them. I want to describe the brain in that same language of architectures and hyperparameters."

Forget the "Golden Gate Bridge" Neuron

There is this obsession in interpretability research with finding the specific "circuit"—the exact cluster of neurons that encodes a single concept. But I think that’s a trap. If you train a neural network to predict stock prices or compute the digits of pi, it’s going to be doing incredibly complex computations internally that we might never fully "map" in the traditional sense.

Instead, what the Connectome gives us is a set of constraints. We don't need to know how the brain computes a specific bridge; we need to know if it's an energy-based model, a VAE, or something doing backprop. Is the wiring between the prefrontal cortex and the auditory cortex the same as the visual cortex?

"The problem is learning the basics by bespoke experiments takes an eternity. Getting a connectome is just... way more efficient."

Surprising Fact

There are more cell types in the hypothalamus than in the entire cortex.

The Cost of a Mouse Brain (Projected)

Scaling the tech: from billions of dollars to tens of millions through optical parallelization.

The Genome Analogy

The Human Genome Project cost $3 billion. Then, George Church and others changed the paradigm—moving from macro chemistry to parallelized microscopy—dropping the cost a million-fold. We're doing the same for the brain.

Optical vs. Electron

Electron microscopes slice tissue thin but lose molecular detail. Optical connectomics (E11's bet) uses photons to look at "fragile, gentle molecules"—giving us a molecularly annotated map, not just a physical one.

The "Practicality" Horizon

2027

The "Short Timeline" scenario. Connectomics might not be relevant yet; we're still riding the LLM wave.

5-10 Years

The transformative window. Transitioning from LLMs to brain-like, model-based RL architectures.

10+ Years

Complete "Brain Distillation"—using neural patterns as auxiliary loss functions to sculpt AI behavior.

Distill the Brain.

What if AI training wasn't just "cat vs. dog"? What if we added an auxiliary loss function that forced the AI to represent that cat the same way your visual cortex does? We're talking about brain-data augmented intelligence.
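A hedged sketch of what such an auxiliary loss could look like, using representational similarity (one common way to compare model features against neural recordings). The function names, the RDM-correlation choice, and the weighting term are assumptions for illustration, not a description of any existing pipeline.

```python
import numpy as np

def rdm(features):
    """Representational dissimilarity matrix: pairwise distances between the
    representations a system assigns to the same set of stimuli."""
    diffs = features[:, None, :] - features[None, :, :]
    return np.linalg.norm(diffs, axis=-1)

def brain_auxiliary_loss(model_feats, neural_feats):
    """Penalize the model when its similarity structure over stimuli diverges
    from the similarity structure measured in (say) visual cortex."""
    a = rdm(model_feats).ravel()
    b = rdm(neural_feats).ravel()
    return 1.0 - np.corrcoef(a, b)[0, 1]   # 0 when the two structures match

# Hypothetical usage: total_loss = task_loss + lam * brain_auxiliary_loss(...)
rng = np.random.default_rng(0)
model_feats = rng.random((6, 16))    # model activations for 6 images
neural_feats = rng.random((6, 32))   # recorded responses to the same images
print(round(brain_auxiliary_loss(model_feats, neural_feats), 3))
```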

The Ultimate Regularization

If we can map the brain to master AI, what happens when we turn that power toward the abstract?

Next: What value will automating math have?

We’ve been mapping the physical brain to understand its constraints. But there is another landscape we are starting to automate: the abstract, rigid world of mathematics. If the brain is the hardware of intelligence, math is its most verifiable software.

The Lean Revolution

Lean is a programming language that forces you to express math proofs in a way a computer can understand. It’s no longer about "trusting" a mathematician's pen-and-paper scribbles; it’s about a machine clicking Verify and knowing, with 100% certainty, that your conclusion follows from your assumptions.
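For a sense of what that looks like, here is a deliberately trivial machine-checked statement in Lean 4 (real formalizations run to thousands of lines, but the verification guarantee is the same):

```lean
-- Lean either accepts the proof term or rejects the file;
-- there is no "probably correct".
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```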

The Perfect Feedback Loop

Why does this matter for AI? Because it creates a perfect Reinforcement Learning (RL) signal. In the same way AlphaGo could play itself to become the best Go player in the world, an AI can now "play" math.

If a proof is mechanically verifiable, the AI knows exactly when it has succeeded. We are going to "RL the crap" out of math proving. It’s the transition from messy, probabilistic guesses to rigid, undeniable logic.

The "Moral Complexity" Loss Function

Can we automate creativity? Maybe. A "good" math conjecture is one that compresses information—it’s a powerful explanation that makes dozens of other theorems easier to prove. We can actually start to measure this "explanatory power."

CONJECTURE vs. PROOF

Proof is mechanical. Conjecturing is conceptual organization. We are shifting the human burden away from validating lemmas toward high-level strategy.

CYBERSECURITY UPSIDE

If you can prove the Riemann Hypothesis, you can prove a piece of software is unhackable. Provable, stable, secure software is the ultimate defense against AI-driven hacking.

"Quantity has a quality all of its own. We are moving toward automated cleverness."
Vibe Coding

Are we losing the "grounded intuition" of mechanics? If you never learn assembly, do you really understand the machine? Or does the faster feedback loop make you a more powerful architect?

The Outsider Physicist

Just as Steve Byrnes synthesizes neuroscience without being in a lab, we might see "outsider string theorists." If the machine handles the math, the barrier to entry for brilliant ideas drops to zero.

The Future AI Civilization

Imagine a world where AGI is still 10 years away, but we have billions of "automated cleverness" instances running. How do they collaborate? They can't just share "neuron activations"—that's a black box.

The only way a future AI civilization can scale is through a universal, provable language. If every step of an argument is mechanically verifiable, the "Jupiter Brains" can build on each other's work without fear of exploitation or social influence.

We might be moving back to symbolic methods, not because neural nets failed, but because we finally have enough "cleverness" to make symbolism work at scale. We are building the safeguarded world models of the future, specified in equations, not just weights.

As Terry Tao suggests, we aren't just proving one theorem at a time anymore. We are studying the landscape of *all* possible proved theorems—the aggregate set of what is knowable.

Up Next

If math is the software of the universe, what is the specific architecture of the biological machine that first discovered it?

Next: Architecture of the brain →

The Meatware Architecture

We just talked about automating mathematics, but what’s the substrate doing? If we’re building world models in silicon, we have to ask: how does the biological original actually represent reality?

Is the brain a "Symbolic Language" or just a hidden state?

When we talk about symbolic representation, I’m not just asking about function. I want to know if the brain holds something analogous to a neural network's hidden state, or if it’s closer to a formal language. The truth is, we don’t really know. We see "face patch" neurons that handle geometry in vision, and we see "place cells" in a rodent’s hippocampus creating spatial maps.

"My hunch? It’s going to be a huge mess. I don’t expect it to be pretty in there. It’s likely not a symbolic language, but a chaotic intersection of architectures, loss functions, and learning rules."

"It might even involve new physics."

— On the mystery of conscious experience

The Continual Learning Problem

In backprop, we freeze the weights. In the brain, the hippocampus is constantly "replaying" memories to the cortex—a living system consolidation. It’s a multi-timescale plasticity that we haven't quite cracked in AI yet.
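The closest standard ML analogue is experience replay, sketched minimally below: new examples are interleaved with replayed old ones before reaching the slow learner, which is the usual trick for reducing catastrophic forgetting. The class and its parameters are illustrative, not a model of the hippocampus.

```python
import random

random.seed(0)

class ReplayConsolidation:
    """Toy analogue of hippocampal replay: new experiences enter a fast buffer,
    and the slow 'cortical' learner is trained on a shuffled mix of fresh and
    replayed old examples instead of the new data alone."""

    def __init__(self, capacity=1000):
        self.buffer = []
        self.capacity = capacity

    def store(self, example):
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(random.randrange(len(self.buffer)))
        self.buffer.append(example)

    def consolidation_batch(self, new_examples, replay_ratio=0.5):
        k = int(len(new_examples) * replay_ratio)
        replayed = random.sample(self.buffer, min(k, len(self.buffer)))
        batch = list(new_examples) + replayed
        random.shuffle(batch)          # interleave old and new, as replay does
        return batch

memory = ReplayConsolidation()
for old in range(100):
    memory.store(("old", old))
print(memory.consolidation_batch([("new", i) for i in range(4)]))
```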

Fast Weights

Is there a biological KV Cache? We have weights and activations, but the way the thalamus gates information suggests a level of "attention" that might make Transformers look simple.
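One candidate formalization is the "fast weights" idea (in the spirit of Ba et al., 2016): a rapidly decaying outer-product matrix that acts as a short-lived associative store alongside the slow learned weights. The sketch below is schematic, not a claim about thalamic circuitry.

```python
import numpy as np

class FastWeights:
    """Outer-product fast-weight memory: matrix A is written and decayed on a
    much faster timescale than the slow weights, loosely like a KV cache."""

    def __init__(self, dim, decay=0.95, lr=0.5):
        self.A = np.zeros((dim, dim))
        self.decay, self.lr = decay, lr

    def write(self, key, value):
        self.A = self.decay * self.A + self.lr * np.outer(value, key)

    def read(self, query):
        return self.A @ query

mem = FastWeights(dim=4)
k = np.array([1.0, 0.0, 0.0, 0.0])
v = np.array([0.0, 2.0, 0.0, 0.0])
mem.write(k, v)
print(mem.read(k))   # recovers the stored value, scaled by lr
```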

Mapping the Biological Gap

The "Gap Map" & Mini Hubbles

We've been incubating Focused Research Organizations (FROs)—non-profit moonshots for science. When you talk to scientists, they don't just need "more research." They need infrastructure.

I call these "Mini Hubble Space Telescopes." They aren't discoveries themselves, but engineering feats that lift all boats. We've mapped out a few hundred of these fundamental capabilities—from connectomics to math-proving infrastructure.

Visualizing Scientific Infrastructure Gaps

DWARKESH

I thought mathematicians just needed whiteboards?

ADAM

I did too! But it turns out even math needs scale. They need Lean, they need verifiable programming languages. We need scale in every domain of science now.

Scale is the missing ingredient, from the neurons in our heads to the proofs on our screens.
