By Mingrong & AI Pal
June 4, 2025
Edited on November 30, 2025
Summary: This article addresses AI Data Starvation, proposing a passenger/driver dual-mode structure for Robotaxis to efficiently solve the "last-mile" data gap. It also highlights widespread market opportunities in data collection and automation across sectors, translating into a positive, long-term investment forecast for related computational hardware.
1. Data Starvation
The Reality of Data Starvation
Klarna’s decision earlier this year to rehire human agents, after attempting to fully replace them with AI, made headlines. It wasn’t a triumph for automation, but a reality check. Many were puzzled: If AI can write code, pass bar exams, and assist scientists, why is "simple" customer service its stumbling block?
The culprit is likely "Data Starvation." There is a fundamental mismatch between the data AI consumes during training and the messy, emotional reality of human service. AI learns what it sees, and unfortunately, it hasn't seen enough real-world customer service.
LLMs like ChatGPT are trained on books, academic papers, and the open web—material that is structured, logical, and public. This makes AI an excellent assistant for scientists or programmers. However, customer service is different. It is fragmented, emotional, and highly private. Regulations like GDPR lock away these sensitive interactions, leaving AI with a blind spot. We cannot expect AI to perform magic in customer service when it is starved of the nuanced examples it needs to learn empathy and conflict resolution.
2. The "Taxi Scenario" Gap
This phenomenon of data starvation extends to the streets. While Waymo relies on detailed, pre-built 3D maps of the cities it serves, Tesla’s FSD has devoured billions of miles of driving data from private owners. However, a gap remains between "driving" and "taxi service."
Private owners typically drive point-to-point (home to work). They don't generate the specific data needed for a taxi business—such as navigating hotel pick-up zones, managing curb-side luggage loading, or identifying the exact safe spot at a chaotic airport terminal. While companies currently use remote human supervision to handle these edge cases, scaling real-time human intervention is costly.
To close this gap efficiently, we may need a hybrid approach: granting approved Robotaxi passengers the option to take the wheel. This would require adjusting insurance policies to cover the rider's dual role as passenger and driver, along with incentives such as ride credits. Such a structure encourages passengers to navigate the complex "last mile" themselves. Many enthusiasts still enjoy the tactile experience of driving, and their intervention provides critical, high-quality data on commercial driving behaviors that private car data simply cannot offer.
3. More Data, More Chips
Klarna’s struggle did not stem from a lack of intelligence in the AI; it stemmed from an AI starved of context. The same applies to autonomous driving. Whether the data comes from remote human supervisors or passengers taking the wheel, the volume of new, complex data needed to solve these "edge cases" is massive.
This data challenge extends to virtually every sector not yet fully automated. For instance, the manual grading and sorting of high-value crops like ginseng is costly and highly skilled work, which is pushing farm owners to seek automation. Automating it requires AI to be trained on millions of high-resolution images to accurately classify complex root structures, a task currently reliant on human labor.
The logic for the industry is clear: Automation success depends on assembling the right training data first. For investors, this highlights a critical downstream opportunity. Closing these data gaps doesn't mean less technology; it means more. Collecting, storing, and training on this new wave of real-world interaction data—from customer service chats to driving footage and agricultural imagery—will require massive computational power, signaling a continued, long-term surge in demand for high-performance chips and data storage.