[State of AI Startups] Memory/Learning, RL Envs & DBT-Fivetran — Sarah Catanzaro, Amplify
The Symbiosis
"I’ve always oscillated between data and AI... it's almost hard to divorce them."
From Symbolic Systems to SQL Queries
Sarah Catanzaro isn't a newcomer to the AI hype cycle. Her journey began in symbolic systems—what we used to call "AI" before the neural network renaissance. For Sarah, the move to data wasn't a career pivot; it was a quest for visibility.
She entered the world of data infrastructure simply because she wanted to understand exactly what happened under the hood of a SQL query. That curiosity led her to back some of the most influential companies in the modern data stack, including DBT.
The $600M Hurdle
The "Modern Data Stack" isn't dead—it's just growing up.
When news broke of the DBT and Fivetran merger, the industry vultures began circling, declaring the "End of the Modern Data Stack." Sarah’s take? Fundamentally wrong.
This wasn't a fire sale; it was a strategic consolidation forced by a shifting IPO landscape. In the previous era, $100M in revenue was the gold standard for going public. Today? The bar has moved to a staggering $600M.
Revenue Targets: Evolution of an Exit
The AI Lab Stack
"One of the things that has actually pleasantly surprised me... many of the big frontier labs are actually using both DBT and Fivetran. I talked to folks at Thinking Machines... DBT was already an important part of their stack."
The Nature of the Workload
The shift from traditional BI to AI training data changes the very physics of data engineering. In the old world, workloads were predictable. They were driven by deterministic systems—dashboards that hit a database at the same time every morning.
"With analyzing, curating, preparing datasets [for AI], it’s a bit more ad hoc. It’s less predictable. It changes how we think about things like learned optimizers and database infrastructure."
Sarah argues that while the demand for analytics engineers might not have exploded in headcount, the utility of the tools has never been higher. We are democratizing data by reducing the need for massive armies of people to manage it.
THE GUT PUNCH
A Failed Category?
The Admission
"That was something I got wrong. I really believed that data catalogs were going to become an important part of the modern data stack."
Why It Failed for Humans
Snowflake and DBT built "good enough" features. For a human analyst, the catalog built into the warehouse was sufficient.
The Missed Opportunity
We built catalogs for people when we should have built them for machines (metadata services) and governance.
"We built data catalogs for the wrong people, and potentially for the wrong use cases."
The shift from discoverability to governance marks the next frontier for metadata.
The AI Lab Stack
Moving beyond the Modern Data Stack, we peek inside the labs where the GPU is king and "idle time" is the ultimate sin.
The GPU Bottleneck
"If you're unable to load data to a GPU efficiently, then the GPU is going to sit idle... and that’s a massive cost."
Scaling Elegance
Surprise: Existing infra scales. OpenAI didn't build a new transactional engine; they just used Rockset.
Infrastructure Efficiency Comparison
"A $100M Seed Round."
No roadmap. Just a long-term vision and a 7-day decision window.
It definitely makes me anxious. When founders ask how much to raise, I ask for milestones. But now? I talk to companies building "Frontier Labs for X" and they might build a consumer app... or maybe not. They don't know.
They're viewing it as transactional. They don't care about the partner; they care about the most money at the highest valuation. It's a signal for hiring, nothing more.
Exactly. You have seven days to get to know someone raising a billion dollars. How do you gain conviction? You don't. You're betting on people you barely know because the market is moving too fast to blink.
The Valuation is a Made-Up Number
Until a company exits, valuation is a ghost. It’s an agreement between two parties to pretend a thing is worth billions. I could say this podcast is worth $5 billion, and if we both agree, on paper, it is. But it’s not transacted at volume. It’s not real.
The danger lies in the liquidation preference. If these teams raise $100M and get acquired for $90M, the employees—who joined for the prestige and the "guaranteed" equity value—get zero.
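For concreteness, here is a toy sketch of that waterfall math, assuming a simple 1x non-participating preference and ignoring conversion, participation, and multiple investor classes; the function name and the second scenario are illustrative, not from the episode.

```python
def common_payout(raised: float, exit_value: float, pref_multiple: float = 1.0) -> float:
    """Toy waterfall: investors recover pref_multiple * raised off the top
    before common shareholders (employees) see anything. This models only
    the preference leg, which is what bites in a down exit."""
    preference = min(raised * pref_multiple, exit_value)
    return exit_value - preference

# The $100M-raised, $90M-exit case from above: common stock is worth nothing.
print(common_payout(raised=100e6, exit_value=90e6))   # 0.0
print(common_payout(raised=100e6, exit_value=250e6))  # 150,000,000.0 left for common
```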
"Joining a company because they have a billion-dollar valuation is just not the right way to choose a job. Do you deeply believe in the vision? That's the only question that matters."
Coming Up Next
World Models: Hype, Confusion, and Market Potential →
The World Model Mirage
While the 2024 funding frenzy suggests total confidence, the research community is nursing a hangover of skepticism. Are we building actual understanding, or just better video editors?
"Every NeurIPS, I go to this group of researchers and we take a vote. Everyone is extremely skeptical about world models. It’s a trailing indicator—LLMs have been so successful that people think, 'We don’t need anything else.'"
"My take? We haven’t even defined what a 'world model' is. There are three competing definitions right now. We see market potential in video editing or autonomous driving, but a model for a video game doesn't just generalize to a factory floor. Not yet, anyway."
The Generalization Gap
The skepticism isn't about utility—it's about transferability. A world model trained on pixel-perfect environments (like gaming) currently lacks the "physics intuition" required for high-stakes robotics or industrial settings. We are in the era of niche world models, waiting for a "GPT-3 moment" for physical reality.
"World models for video games might not generalize to factory settings. That is the gnarly research problem of 2025."
The "Cursor Rules" Problem: Why AI Needs a Long-Term Memory
The "magic" of AI is wearing off. We’ve reached a point where basic completion isn't enough to keep users around. Speaker 1 points to a brutal reality for AI startups: Retention is low and churn is high.
The Retention Crisis
Users flock to new tools like Cursor, but switch the moment Windsurf or Claude Code releases a flashier feature. Without personalization, your product is just a commodity wrapper.
The Static Limit
Current AI is static. Human intelligence updates constantly. If your model doesn't "update its weights" or learn your specific coding quirks, it’s just a very fast parrot.
"Cursor rules aren't enough," Speaker 0 argues. "It’s the shittiest form of memory."
We are entering the Consumerization of AI—a mirror of the enterprise SaaS trend from a decade ago. It’s no longer about whether the model works; it's about whether the model *knows you*. This introduces a "stateful" problem. Today, inference is stateless. To make AI personal, we have to solve the "gnarly" infrastructure problem of loading, unloading, and caching stateful weights for millions of individual users.
Human Intelligence is Dynamic.
AI is Static.
The K-Factor
Founders are finally waking up to traditional SaaS metrics. Growth no longer just shows up because your model is smart. You need a k-factor built on habit, and habit requires memory.
The Statefulness Problem
If models update weights for you, weights become stateful. Loading/unloading these for real-time inference is the next great infrastructure gold rush.
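To make the statefulness problem concrete, one common pattern is to keep the hottest users' adapter-style weights resident in an in-memory LRU cache and page the rest in from storage on demand. A hypothetical, self-contained sketch; the names and the storage stub are invented for illustration, not a description of any particular stack.

```python
from collections import OrderedDict
from typing import Dict

def load_adapter_from_store(user_id: str) -> Dict[str, float]:
    # Placeholder: in practice this would fetch per-user LoRA-style deltas
    # from a blob store or database on a cache miss.
    return {"user_delta": float(hash(user_id) % 1000)}

class AdapterCache:
    """LRU cache keeping the most recently active users' adapters in memory."""

    def __init__(self, capacity: int = 1024):
        self.capacity = capacity
        self._cache: "OrderedDict[str, Dict[str, float]]" = OrderedDict()

    def get(self, user_id: str) -> Dict[str, float]:
        if user_id in self._cache:
            self._cache.move_to_end(user_id)        # recently used: keep resident
            return self._cache[user_id]
        weights = load_adapter_from_store(user_id)   # miss: page in from storage
        self._cache[user_id] = weights
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)          # evict least recently used user
        return weights

if __name__ == "__main__":
    cache = AdapterCache(capacity=2)
    for uid in ["alice", "bob", "carol", "alice"]:
        print(uid, cache.get(uid))
```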
Up Next
"One more thing. I think we have time for one more take on RL environments..."
— Transitioning from Memory to Context
If memory management is the "brain" of the next-gen AI agent, then the environment is the world it inhabits. But is that world real, or just a multi-million dollar mirage?
The Environment Mirage
Labs are pouring 8-figure sums into synthetic RL environments. Are they building the future, or just overpriced DoorDash clones?
Is it just a Docker container with some custom software loaded? What makes a good one? (A minimal sketch of what such an interface looks like follows this exchange.)
"I’m actually okay to be wrong, but I think RL environments are just a fad."
Wait, they're all fake? Labs are paying 7 to 8 figures for these! They could build them in-house, but they don't. Why?
"They were paying 8 figures for piss-poor data annotation, too. Labs have a lot of money. Why buy a DoorDash clone when you can use real logs from DoorDash itself?"
"The best RL environment
is the real world."
— The Cursor Approach
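For readers wondering what these environment products concretely contain, here is a minimal, hypothetical sketch of a gym-style interface, roughly the "Docker container with some custom software" the question above alludes to. The task, reward scheme, and all names are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

@dataclass
class CheckoutEnv:
    """Toy synthetic environment: an agent navigates a fake checkout flow."""
    max_steps: int = 20
    _step: int = field(default=0, init=False)
    _cart: Dict[str, int] = field(default_factory=dict, init=False)

    def reset(self) -> Dict[str, Any]:
        """Start a fresh episode and return the initial observation."""
        self._step = 0
        self._cart = {}
        return {"page": "home", "cart": dict(self._cart)}

    def step(self, action: Dict[str, Any]) -> Tuple[Dict[str, Any], float, bool, Dict[str, Any]]:
        """Apply an agent action; return (observation, reward, done, info)."""
        self._step += 1
        if action.get("type") == "add_to_cart":
            item = action.get("item", "unknown")
            self._cart[item] = self._cart.get(item, 0) + 1
            reward, done = 0.1, False
        elif action.get("type") == "checkout" and self._cart:
            reward, done = 1.0, True          # task completed successfully
        else:
            reward, done = -0.01, False       # small penalty for wasted steps
        if self._step >= self.max_steps:
            done = True
        obs = {"page": "cart" if self._cart else "home", "cart": dict(self._cart)}
        return obs, reward, done, {"steps": self._step}
```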
The Perfect AI Startup: Research Meets Application
The most exciting startups aren't just building "wrappers"; they are the ones where a hard research problem serves as the primary unlock for a massive application. Sarah points to a specific archetype: the company that hires researchers not for vanity, but to solve the technical bottlenecks that prevent a product from being "magic."
Take Harvey in the legal space. Their success isn't just marketing; it’s rooted in high-end RAG implementations that legal work demands. Or Sierra, where solving the "rule-following" research problem is the only way to build customer support agents that don't go rogue.
The Research-Value Flywheel
Visualizing how solving "Hard Research" (Memory, Rule Following) unlocks "Market Application" (Legal, Support).
Finding Sarah
Investor, researcher-whisperer, and tech realist. Look for the one-eyed dog in South Park or find her on the platform formerly known as Twitter.
— End of Chapter —
Next episode: [State of Post-Training] From GPT-4.1 to 5.1: RLVR, Agent & Token Efficiency — Josh McGrath, OpenAI