The gig workers who are training humanoid robots at home

Posted by qimuai on 2026-4-2 07:01 (first-hand compilation)

Source: https://www.technologyreview.com/2026/04/01/1134863/humanoid-data-training-gig-economy-2026-breakthrough-technology/

Summary:

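Global "data laborers" train humanoid robots from home: the privacy and ethics questions behind the good pay

In small apartments in Nigeria and India, young "data recorders" strap iPhones to their foreheads and film themselves, moving slowly and deliberately, as they fold laundry, make beds, and wash dishes. These odd-looking videos have become some of the most sought-after training material in the global race to build humanoid robots.

As companies such as Tesla and Figure AI push to develop humanoid robots for factories and homes, teaching machines to move like people has become a central problem. Micro1, a US data company, has hired thousands of contract workers in more than 50 countries to record everyday household footage that serves as training material for robots. The work pays well by local standards: in Nigeria, for example, one worker earns $15 an hour, good income in a strained economy, and the job is boosting local economies.

The appeal comes with concerns. The company asks recorders not to show their faces or reveal personal information, but the videos still capture the interiors of their homes, their possessions, and their routines. The workers interviewed said they do not know exactly how their data will be used, stored, or shared with the robotics companies that buy it. "It is important that if workers are engaging in this, that they are informed by the companies themselves of the intention … where this kind of technology might go and how that might affect them longer term," says Yasmine Kotturi, a professor at the University of Maryland.

Data quality and safety are also in question. Roboticists warn that some household habits are not safe, and that robots trained on them could cause incidents. Micro1 says it rejects videos showing unsafe ways of performing a task, but with tens of thousands of hours of footage, reviewing everything for quality is challenging.

Demand for data keeps growing fast. Investment in humanoid robots exceeded $6 billion in 2025. In China, workers in state-owned robot training centers wear virtual-reality headsets and exoskeletons to teach humanoid robots tasks such as opening a microwave and wiping down a table.

For many participants the job is both an opportunity and a burden. An engineering student in India skips dinner to record, folding the same set of clothes over and over on a balcony crowded with potted plants. A medical student in Nigeria finds ironing his clothes for hours every day boring. Yet the pay is good, and the work offers a sense of contributing to future robots and of doing something different from the rest of the world.

In this worldwide rush to collect data, the contest between technical progress, economic need, and personal privacy has only just begun.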

English source:

The gig workers who are training humanoid robots at home
People in Nigeria and India are strapping iPhones onto their heads and recording themselves doing chores.
When Zeus, a medical student living in a hilltop city in central Nigeria, returns to his studio apartment from a long day at the hospital, he turns on his ring light, straps his iPhone to his forehead, and starts recording himself. He raises his hands in front of him like a sleepwalker and puts a sheet on his bed. He moves slowly and carefully to make sure his hands stay within the camera frame.
Zeus is a data recorder for Micro1, a US company based in Palo Alto, California, that collects real-world data to sell to robotics companies. As companies like Tesla, Figure AI, and Agility Robotics race to build humanoids—robots designed to resemble and move like humans in factories and homes—videos recorded by gig workers like Zeus are becoming the hottest new way to train them.
Micro1 has hired thousands of contract workers in more than 50 countries, including India, Nigeria, and Argentina, where swathes of tech-savvy young people are looking for jobs. They’re mounting iPhones on their heads and recording themselves folding laundry, washing dishes, and cooking. The job pays well by local standards and is boosting local economies, but it raises thorny questions around privacy and informed consent. And the work can be challenging at times—and weird.
Zeus found the job in November, when people started talking about it everywhere on LinkedIn and YouTube. “This would be a real nice opportunity to set a mark and give data that will be used to train robots in the future,” he thought.
Zeus is paid $15 an hour, which is good income in Nigeria’s strained economy with high unemployment rates. But as a bright-eyed student dreaming of becoming a doctor, he finds ironing his clothes for hours every day boring.
“I really [do] not like it so much,” he says. “I’m the kind of person that requires … a technical job that requires me to think.”
Zeus, and all the workers interviewed by MIT Technology Review, asked to be referred to only by pseudonyms because they were not authorized to talk about their work.
Humanoid robots are notoriously hard to build because manipulating physical objects is a difficult skill to master. But the rise of large language models underlying chatbots like ChatGPT has inspired a paradigm shift in robotics. Just as large language models learned to generate words by being trained on vast troves of text scraped from the internet, many researchers believe that humanoid robots can learn to interact with the world by being trained on massive amounts of movement data.
Editor’s note: In a recent poll, MIT Technology Review readers selected humanoid robots as the 11th breakthrough for our 2026 list of 10 Breakthrough Technologies.
Robotics requires far more complex data about the physical world, though, and that is much harder to find. Virtual simulations can train robots to perform acrobatics, but not how to grasp and move objects, because simulations struggle to model physics with perfect accuracy. For robots to work in factories and serve as housekeepers, real-world data, however time-consuming and expensive to collect, may be what we need.
Investors are pouring money feverishly into solving this challenge, spending over $6 billion on humanoid robots in 2025. And at-home data recording is becoming a booming gig economy around the world. Data companies like Scale AI and Encord are recruiting their own armies of data recorders, while DoorDash pays delivery drivers to film themselves doing chores. And in China, workers in dozens of state-owned robot training centers wear virtual-reality headsets and exoskeletons to teach humanoid robots how to open a microwave and wipe down the table.
“There is a lot of demand, and it’s increasing really fast,” says Ali Ansari, CEO of Micro1. He estimates that robotics companies are now spending more than $100 million each year to buy real-world data from his company and others like it.
A day in the life
Workers at Micro1 are vetted by an AI agent named Zara that conducts interviews and reviews samples of chore videos. Every week, they submit videos of themselves doing chores around their homes, following a list of instructions about things like keeping their hands visible and moving at natural speed. The videos are reviewed by both AI and a human and are either accepted or rejected. They’re then annotated by AI and a team of hundreds of humans who label the actions in the footage.
Because this approach to training robots is in its infancy, it’s not clear yet what makes good training data. Still, “you need to give lots and lots of variations for the robot to generalize well for basic navigation and manipulation of the world,” says Ansari.
But many workers say that creating a variety of “chore content” in their tiny homes is a challenge. Zeus, a scrappy student living in a humble studio, struggles to record anything beyond ironing his clothes every day. Arjun, a tutor in Delhi, India, takes an hour to make a 15-minute video because he spends so much time brainstorming new chores.
“How much content [can be made] in the home? How much content?” he says.
There’s also the sticky question of privacy. Micro1 asks workers not to show their faces to the camera or reveal personal information such as names, phone numbers, and birth dates. Then it uses AI and human reviewers to remove anything that slips through.
But even without faces, the videos capture an intimate slice of workers’ lives: the interiors of their homes, their possessions, their routines. And understanding what kind of personal information they might be recording while they’re busy doing chores on camera can be tricky. Reviews of such footage might not filter out sensitive information beyond the most obvious identifiers.
For workers with families, keeping private life off camera is a constant negotiation. Arjun, a father of two daughters, has to wrangle his chaotic two-year-old out of frame. “Sometimes it’s very difficult to work because my daughter is small,” he says.
Sasha, a banker turned data recorder in Nigeria, tiptoes around when she hangs her laundry outside in a shared residential compound so she won’t record her neighbors, who watch her in bewilderment.
While the workers interviewed by MIT Technology Review understand that their data is being used to train robots, none of them know how exactly their data will be used, stored, and shared with third parties, including the robotics companies that Micro1 is selling the data to. For confidentiality reasons, says Ansari, Micro1 doesn’t name its clients or disclose to workers the specific nature of the projects they are contributing to.
“It is important that if workers are engaging in this, that they are informed by the companies themselves of the intention … where this kind of technology might go and how that might affect them longer term,” says Yasmine Kotturi, a professor of human-centered computing at the University of Maryland.
Occasionally, some workers say, they’ve seen other workers asking on the company Slack channel if the company could delete their data. Micro1 declined to comment on whether such data is deleted.
“People are opting into doing this,” says Ansari. “They could stop the work at any time.”
Hungry for data
With thousands of workers doing their chores differently in different homes, some roboticists wonder if the data collected from them is reliable enough to train robots safely.
“How we conduct our lives in our homes is not always right from a safety point of view,” says Aaron Prather, a roboticist at ASTM International. “If those folks are teaching those bad habits that could lead to an incident, then that’s not good data.” And the sheer volume of data being collected makes reviewing it for quality control challenging. But Ansari says the company rejects videos showing unsafe ways of performing a task, while clumsy movements can be useful to teach robots what not to do.
Then there’s the question of how much of this data we need. Micro1 says it has tens of thousands of hours of footage, while Scale AI announced it had gathered more than 100,000 hours.
“It’s going to take a long time to get there,” says Ken Goldberg, a roboticist at the University of California, Berkeley. Large language models were trained on text and images that would take a human 100,000 years to read, and humanoid robots may need even more data, because controlling robotic joints is even more complicated than generating text. “It’s going to take longer than people think,” he says.
When Dattu, an engineering student living in a bustling tech hub in India, comes home after a full day of classes at his university, he skips dinner and dashes to his tiny balcony, cramped with potted plants and dumbbells. He straps his iPhone to his forehead and records himself folding the same set of clothes over and over again.
His family stares at him quizzically. “It’s like some space technology for them,” he says. When he tells his friends about his job, “they just get astounded by the idea that they can get paid by recording chores.”
Juggling his university studies with data recording, as well as other data annotation gigs, takes a toll on him. Still, “it feels like you’re doing something different than the whole world,” he says.