The missing step between hype and profit

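Source: https://www.technologyreview.com/2026/04/27/1136456/the-missing-step-between-hype-and-profit/
Summary:
From hype to profit: AI's "Step 2" problem
From the satirical slogan on a flyer at an anti-AI protest in London to the grand narratives of Silicon Valley's tech giants, the AI industry faces one key question: the technology is in place (Step 1) and the promised payoff has been sketched out (Step 3), but how to get from one to the other (Step 2) remains a big question mark.
Two recent studies expose this gap. Anthropic predicted which jobs will be hit hardest by large language models, but its conclusions rest on what the models seem to be good at rather than on how they actually perform at work. Mercor's test of top-tier AI agents, meanwhile, found that every agent failed to complete most of its duties across 480 workplace tasks commonly carried out in banking, consulting, and law. This suggests that even the best AI remains of doubtful economic viability in the workplace.
The article attributes the disagreement to several factors. Stakeholders' positions color their predictions. Most optimistic forecasts rest on the rapid improvement of AI coding tools, yet not every task can be solved with code. Most importantly, AI tools do not operate in a vacuum: they have to fit in with existing people and workflows, and sometimes they lower efficiency instead of raising it.
For now, the lack of consensus on Step 2 has created an information vacuum that gets filled by one sensational claim after another, some of them capable of moving markets. Closing the gap will require more transparency from model developers, closer coordination between researchers and businesses, and new evaluation methods that reflect how AI actually performs once deployed.
As the protest flyer put it: "Pause AI until we know what the hell Step 2 is." Most businesses are still wrestling with the same question: having got hold of the "underpants" (the technology), what exactly do they do next?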
English source:
The missing step between hype and profit
Coding aside, even the best AI systems struggle to be economically viable in the workplace. What happens then?
This story originally appeared in The Algorithm, our weekly newsletter on AI.
In February, I picked up a flyer at an anti-AI march in London. I can’t say for sure whether or not its writers meant to riff on South Park’s underpants gnomes. But if they did, they nailed it: “Step 1: Grow a digital super mind,” it read. “Step 2: ? Step 3: ?”
Produced by Pause AI, an international activist group that co-organized the protest, it ended with this plea to the reader: “Pause AI until we know what the hell Step 2 is.”
In the South Park episode “Gnomes,” which first aired in 1998, Kenny, Kyle, Cartman, and Stan discover a community of gnomes that sneak out at night to steal underpants from dressers. Why? The gnomes present their pitch deck. “Phase 1: Collect underpants. Phase 2: ? Phase 3: Profit.”
The gnomes’ business plan has since become one of the greats among internet memes, used to satirize everything from startup strategies to policy proposals. Memelord in chief Elon Musk once invoked it in a talk about how he planned to fund a mission to Mars. Right now, it captures the state of AI. Companies have built the tech (Step 1) and promised transformation (Step 3). How they get there is still a big question mark.
As far as Pause AI is concerned, Step 2 must involve some kind of regulation. But exactly what it will call for and who will enforce it are up for debate.
AI boosters, on the other hand, are convinced that Step 3 is salvation and tend to gloss over the middle bit. They see us racing toward sunny uplands on the back of an “economically transformative technology,” as OpenAI’s chief scientist, Jakub Pachocki, put it to me a few weeks ago. They know where they want to go, more or less: It’s hazy up there and still some way off. But everyone’s taking a different route. Will they all make it? Will anyone?
For every big claim about the future, there is a more sober assessment of how the rubber meets the road—one that quells the hype. Consider two recent studies. One, from Anthropic, predicted what types of jobs are going to be most affected by LLMs. (A takeaway: Managers, architects, and people in the media should prepare for change; groundskeepers, construction workers, and those in hospitality, not so much.) But their predictions are really just guesses, based on what kinds of tasks LLMs seem to be good at rather than how they really perform in the workplace.
Another study, put out in February by researchers at Mercor, an AI hiring startup, tested several AI agents powered by top-tier models from OpenAI, Anthropic, and Google DeepMind on 480 workplace tasks frequently carried out by human bankers, consultants, and lawyers. Every agent they tested failed to complete most of its duties.
Why is there such wide disagreement? There are a number of factors. For a start, it’s crucial to consider who is making the claims (and why). Anthropic has skin in the game. What’s more, most of the people telling us that something big is about to happen have reached that conclusion largely on the basis of how fast AI coding tools are getting. But not all tasks can be hacked with coding. Other studies have found that LLMs are bad at making strategic judgment calls, for example.
What’s more, when they’re deployed, the tools aren’t just dropped into a cleanroom. They need to work in places contaminated with people and existing workflows. And sometimes adding AI will make things worse. Sure, maybe those workflows need to be torn up and refashioned around the new technology for it to achieve transformative status, but that will take time (and guts).
That big hole? It’s right where Step 2 should be. The lack of agreement on exactly what’s about to happen—and how—creates an information vacuum that gets filled by the latest wild claim of the week, evidence be damned. We’re so unmoored from any real understanding of what’s coming and how it will be deployed that a single social media post can (and does) shake markets.
We need fewer guesses and more evidence. But that’s going to require transparency from the model makers, coordination between researchers and businesses, and new ways to evaluate this technology that tell us what really happens when it’s rolled out in the real world.
The tech industry (and with it the world’s economy) rests on the held-out promise that AI really will be transformative. But that is not yet a sure bet. Next time you hear bold claims about the future, remember that most businesses are still figuring out what to do with their underpants.