E216 | Investing in Humanoids: Research Papers, Reality Checks, and the Long Road to Commercialization
Embodied AI: Is it the eve of the GPT-3 explosion, or the eve of the bubble bursting?
“Everyone is waiting for the 'ChatGPT moment' in the robotics sector, but there is still a massive chasm between reality and valuations.”
Recently, the robotics field has indeed become 'ridiculously' hot. A few days ago, 1X released NEO, a humanoid robot that can enter the home. That polished promotional video was certainly stunning, but looking back, it's the exact same playbook as AI startups in 2023: appearing omnipotent, while in reality, it's just that one successful take filmed after countless failures in the lab.
Various companies' production plans currently start at 100,000 units per year, and Goldman Sachs even predicts that global humanoid robot shipments will reach staggering numbers by 2035. But the awkward part is that while real orders have yet to materialize, stock prices and valuations have already skyrocketed.
Industry Ambition: Global Humanoid Robot Shipment Forecast
Source: Goldman Sachs Equity Research (Simulated Estimate)
“I feel the current valuations truly don't match the state of the technology. Scaling and generalization are the two key metrics, and neither has seen a substantial leap. But for VCs, this is about 'taking a position early,' laying the groundwork for a future that cannot be missed.”
“The explosion of the robotics track will happen in two stages. The first stage is the technical 'GPT moment,' where robots can directly understand language and visual instructions and decompose actions. But that’s only the first step; the true 'iPhone moment' will need to emerge from hardware scaling and data feedback loops.”
The BERT Era (2017-2018)
Algorithmic architectures were emerging and data was starting to pile up, but the awe-inspiring sense of 'emergent intelligence' hadn't happened yet. This is exactly where robotics stands today.
The GPT Moment
The hallmark is 'end-to-end' generalization. You tell it to 'go to the kitchen and get a glass of water,' and it handles the L0 to L3 logic automatically, rather than relying on hard-coded scripts.
Editor's Note
“So-called 'Embodied AI' is about giving AI a body. It needs more than just a brain; it must have the ability to perceive and act in the physical world.”
“The US is strong in software and large models,
but Chinese hardware products
can iterate as often as three times a day.”
This tension between software and hardware, Silicon Valley and Shenzhen, forms the most interesting landscape of the robotics track. While Americans are in labs tweaking code and churning out papers, factories in Shenzhen have already begun a 'street fight' over supply chains and real-world scenario data. The similarities and differences in these strategic approaches are the core of what we will dig into next: Why, in the field of robotics, might the final winner be the one who understands 'integration' best?
The US-China Robotics War:
Similarities and Differences in Strategy and Playbook
Just a moment ago, we were agonizing over whether GPT is a bubble, but turn your head to the robotics field and you'll find the smell of gunpowder is much more real here. If large models are ethereal inspiration, then robots are the bones and muscles that bring that inspiration to the ground.
Many people ask me why we are still competing in this track when the US has Tesla Optimus and Figure AI. Look around, though, and you'll see the two sides aren't even competing on the same dimension. The American logic is “First Principles” plus “Brute-Force Aesthetics”: first set a grand vision, then rely on the world's top AI geniuses to pile on computation, attempting to have algorithms directly take over the physical world.
But the domestic playbook is completely different. We are more like “growing out of the soil.” If you look at these robotics companies in Shenzhen, they don't even care if the first step is perfect; they care about whether it works. This difference is essentially a head-on clash between 'scientist culture' and the 'engineer dividend.'
In Shenzhen, robots can iterate as many as three times a day
This is no exaggeration. In Nanshan or Longhua District, you can modify a part drawing in the morning, get the prototype from the factory next door at noon, install and test it in the afternoon, and have the code running by evening. This kind of hand-to-hand combat speed in the supply chain is unimaginable in an environment like Silicon Valley, where 'waiting two weeks for a custom part' is the norm.
Daily Hardware Iteration Limit
In contrast, overseas laboratory cycles are typically measured in 'weeks'
Competitiveness Radar Chart: Dimensions of the US-China Rivalry
Editor's Note: The Power of Data
As the chart shows, China's advantages are concentrated in scenario data and iteration speed. While the US is working on general-purpose models, China has already fixed tens of thousands of bugs in factories, logistics, and even home scenarios. This is called 'encircling the cities from the rural areas.'
The Great Investment Logic Debate: Invest in 'Embodied AI' or 'Advanced Manufacturing'?
Venture Capital A:
“We want to invest in the next 'OpenAI of the physical world'! If it's just about grinding down hardware costs, isn't that just a traditional OEM? A robot without a brain has no soul.”
Practical Investor B:
“Cut the nonsense. Whether large models can be implemented ultimately depends on whether a dexterous hand can pick up a grape. In China, embodied AI without discussing manufacturing capability is nothing but a castle in the air.”
“The so-called 'overtaking on a curve'
is actually about taking on the dirtiest and most exhausting hardware scenarios
and racing to create the fastest data feedback loop.”
This is the core secret weapon for domestic robot commercialization. Compared to the strict privacy and access restrictions abroad, domestic factories, industrial parks, and even streets have a higher tolerance for real-world robot testing. This "tolerance" translates into extremely precious long-tail corner-case data.
Invest in the "upper body" first
or the "lower body"?
Now that we've finished talking about the "breadth" of domestic scenarios, we need to dive into the lab and talk about the "depth" of investment strategy.
"Lower Body": The Moat of Motion Control
The current consensus is that the "legs" of robots have hit a competitive plateau. With the support of the open-source community, balancing algorithms for quadrupeds and bipeds have moved from laboratories into factories. What investors look for in motion control now is "robustness": whether the robot can walk on ice, and whether it can stay upright after being kicked.
The "Upper Body" is the real sea of stars
The upper body means dexterous hands (dexterity) and brains (foundation models). If the lower body is about mobility, the upper body is about productivity. Right now, the money is all pouring into this area.
Editor's Note: Dexterous Hands
It’s more than just five fingers. It involves multi-sensor fusion, haptic feedback, and object manipulation at millimeter-level precision.
"Without haptic data,
robots will never learn true dexterity."
— Rodney Brooks' Prediction
1X: A "Scam" or a "Trojan Horse"?
When it comes to 1X (the company OpenAI invested in), there's a lot of debate in the industry. Everyone sees videos of robots tidying up rooms with impossibly smooth movements, and the first reaction is always: This is definitely teleoperation!
Exactly, it is teleoperation. But that’s precisely what makes it clever. This is a "Trojan Horse" strategy.
So, the focus of the controversy isn't whether it's "real AI" right now, but whether this data collection model can work. If 1X can turn every delicate touch performed by a human operator into a training set, the "brain's" evolution will be exponential.
Hardcore Term: Teleoperation
It’s not just "remote-controlled cars." In embodied AI, this usually refers to humans taking over the robot's joints in real time using VR headsets or motion capture suits. The goal is to solve the "cold start" problem for AI in complex tasks and to generate high-quality trajectory data for training.
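To make "turning operator motions into a training set" concrete, here is a minimal sketch of how teleoperation sessions might be logged. Everything in it is an assumption for illustration: the `Step` and `Trajectory` names, the idea that an observation is a list of joint readings, and the choice to flatten demonstrations into (observation, action) pairs as in imitation learning. Nothing here describes how 1X or any other company actually stores its data.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Step:
    observation: List[float]  # hypothetical: joint angles read from the robot
    action: List[float]       # hypothetical: joint targets sent by the human operator


@dataclass
class Trajectory:
    task: str                 # free-text label for the demonstrated task
    steps: List[Step] = field(default_factory=list)

    def record(self, observation: List[float], action: List[float]) -> None:
        """Append one timestep of the operator's demonstration."""
        self.steps.append(Step(list(observation), list(action)))


def to_training_pairs(
    trajectories: List[Trajectory],
) -> List[Tuple[List[float], List[float]]]:
    """Flatten demonstrations into (observation, action) pairs for supervised training."""
    return [(s.observation, s.action) for t in trajectories for s in t.steps]


# One short, made-up demonstration with two timesteps.
demo = Trajectory(task="pick up cup")
demo.record([0.0, 0.1], [0.05, 0.12])
demo.record([0.05, 0.12], [0.10, 0.15])
pairs = to_training_pairs([demo])
```

Each teleoperated session adds more pairs to the same pool, which is why the volume of operator hours matters as much as the quality of the robot.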
Debates over technical paths are fascinating, but capital is, after all, profit-driven.
Next, we need to talk about the most realistic question: Once this thing is built, who is actually paying for it?
Real-world commercialization:
Who's paying?
Now that we've finished talking about 1X's love-it-or-hate-it "Trojan Horse," let's talk about reality: if humanoid robots can't wash my socks at home yet, then who is actually waving their checkbooks right now?
The "Extreme Involution" of Warehousing and Logistics
Forget about those fancy backflips. Current buyers are mostly looking at ROI (return on investment). In massive logistics sorting centers, robots don't need emotions; they just need to be 15% faster than humans at repetitive tasks without requiring social security payments. Current deployments are essentially "replacing expensive labor with depreciable assets."
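The ROI logic above can be sketched as a simple payback calculation. This is a rough illustration under stated assumptions, not a real procurement model: the function name, the upkeep parameter, and every monetary figure are hypothetical, and only the 15% speed advantage comes from the text.

```python
def payback_years(robot_cost, annual_labor_cost, speedup=0.15, annual_upkeep=0.0):
    """Years until cumulative labor savings cover the robot's purchase price.

    Assumes one robot that is `speedup` faster than a worker replaces
    (1 + speedup) workers' worth of output. All inputs are hypothetical.
    """
    annual_saving = annual_labor_cost * (1 + speedup) - annual_upkeep
    if annual_saving <= 0:
        return float("inf")  # the robot never pays for itself
    return robot_cost / annual_saving


# Made-up numbers: a 230,000 robot against a 100,000-per-year fully loaded worker.
years = payback_years(robot_cost=230_000, annual_labor_cost=100_000, speedup=0.15)
print(f"Payback period: {years:.1f} years")
```

If upkeep eats the entire saving, the function returns infinity, which is the case a buyer focused on ROI would reject outright.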
Core Driver: Labor Shortage > Technical Passion
Hierarchy of Deployment Barriers
Hardware Industry Chain:
Will it become modular like smartphones?
This is a highly controversial topic. Some think robots will follow in the footsteps of smartphones—the so-called "MediaTek model." You buy an off-the-shelf joint, someone else buys an off-the-shelf sensor, and everyone just pieces them together to ship a product.
But I have my reservations. The biggest difference between robots and phones is that a phone is an information interaction terminal, while a robot is a physical interaction terminal. On a phone, whether your screen is a bit larger or your battery a bit thicker doesn't affect the algorithm; however, on a robot, tiny tolerances in the reducer or the weight distribution of a joint directly determine whether your model can even run.
"The current situation is more like computers in the 70s. Every component requires extremely precise coupling rather than simple toy-like assembly. Hardware-software integration will remain the absolute moat for top-tier players in the next 3-5 years."
Humanoid Robot Cost Structure Forecast
Editor's Note:
The current 'high cost' mainly stems from joint actuators. Once harmonic reducers and planetary roller screws can achieve large-scale domestic substitution, BOM costs are expected to drop by over 60% within 5 years.
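As a sanity check on the "over 60% within 5 years" figure, the sketch below works out what yearly cost decline it implies. It assumes, purely for illustration, that costs fall by the same percentage each year; the function name is hypothetical and the source gives no year-by-year breakdown.

```python
def implied_annual_decline(total_drop, years):
    """Constant yearly decline rate that compounds to `total_drop` over `years`."""
    remaining = 1.0 - total_drop          # fraction of the original cost left at the end
    return 1.0 - remaining ** (1.0 / years)


rate = implied_annual_decline(0.60, 5)
print(f"Implied annual BOM cost decline: {rate:.1%}")
```

Under that assumption, a 60% drop over five years works out to a decline of roughly 17% per year, sustained every year.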
Bold Prediction: The Robotics World in the Next 5 Years
Year One of 'Heading to the Factory to Tighten Screws'
Leading automakers begin small-scale robot deployment to handle automation at specific stations (such as door installation and quality inspection). This isn't about saving money; it's about accumulating real-world scenario data.
The Explosion of Generalist Models
Robots no longer need code written for every single movement. With just a few demonstrations, they can learn to handle unseen objects. The dexterity of humanoid robots begins to surpass that of specialized robotic arms.
An 'iPhone Moment' or just a 'Big Toy'?
The first batch of consumer-grade robots truly capable of handling housework enters wealthy households. They might not be as perfect as a butler yet, but being able to help you tidy up toys and clean hard-to-reach corners on the floor is enough to drive the market wild.
“Don’t overestimate the changes of the coming year,
but absolutely do not underestimate the great species explosion of the next five years.”
— Final thoughts before the closing remarks
