By Mingrong & LLM
January 13, 2025 | Original Content
In recent discussions about artificial intelligence, a particular comparison has gained popularity. It is often presented as a piece of common sense:
“A teenager can learn to drive in ten hours, while an AI system requires billions of miles of training data to attempt the same task.”
This comparison appeals strongly to intuition and flatters human intelligence. However, the statement is neither accurate nor objective. The first and most fundamental problem lies in a false assumption: that the teenager is “starting from zero.”
By the time a young person sits behind the wheel, they have already accumulated more than a decade of lived experience. They possess an internalized understanding of physical space, motion, cause and effect, social norms, and risk. Long before any formal driving instruction begins, they have observed traffic behavior, anticipated danger, and formed a mental model of how the world works. On top of this, they benefit from evolutionary priors—biological structures shaped by millions of years of adaptation.
From the human side, learning to drive is therefore not the construction of a new world model. It is the alignment of an already mature world model with a new control interface.
By contrast, when an AI system begins to learn to drive, it truly starts from scratch. It “opens its eyes” for the first time to observe the road. It does not possess the innate capabilities of the human visual system—such as built-in edge detection, motion tracking, or depth perception. It has not spent years living in a city or passively absorbing traffic patterns. Under these conditions, comparing a teenager to an AI learner is not insightful—it is misleading.
There is a second asymmetry that further undermines this comparison: the asymmetry of expectations. What does it actually mean to say that “a teenager can learn to drive in ten hours”? It simply means they can pass a road test. It says nothing about how well they will drive afterward. Human learners are evaluated under finite and forgiving conditions, where mistakes are expected and socially absorbed.
Artificial intelligence systems, however, are judged against a near-infinite standard. They are expected to operate continuously across all edge cases with near-zero tolerance for failure. To equate these two standards—finite-time adequacy for humans and near-infinite reliability for machines—is conceptually incoherent.
This pattern of distorted comparison extends to Large Language Models (LLMs). An LLM is not an embodied intelligence; it operates in the symbolic domain. Language is the most condensed and transmissible form of human knowledge, and LLMs represent history’s most advanced reconstruction of this linguistic heritage. To dismiss LLMs for lacking “embodied intelligence” or a “world model,” sometimes even ranking them below animals, is more than a technical mistake. It reflects a selective, narrowly framed bias and a lack of objectivity: a blind dismissal of the monumental achievement that LLMs represent. LLMs have profoundly transformed how people access and interact with knowledge, an impact arguably greater than the invention of paper or the establishment of the university system.
By the way, an LLM wrote almost every sentence of this article.
Finally, consider the reality of safety. AI drivers do not need to achieve the mythical standard of “zero accidents” to be viable; they only need to perform significantly better than human drivers on average. When the accidents AI drivers cause differ from those humans cause, we should compare them fairly, objectively, and constructively, avoiding misleading contrasts and the pursuit of impossible perfection. Progress should be measured by what truly matters: the substantial reduction of lives lost on the road.