Why opinion on AI is so divided

Published by qimuai · First-hand compilation


Source: https://www.technologyreview.com/2026/04/13/1135720/why-opinion-on-ai-is-so-divided/

Summary:

Stanford's AI Index reveals the industry's perception gap: experts and the public see AI very differently, and power users are pulling ahead

According to Stanford University's newly released 2026 AI Index report, the field of artificial intelligence is marked by striking contradictions and divided perceptions. The report points to a wide gap between AI's actual capabilities and how the public perceives them, a gap closely tied to how intensively different groups use the technology.

A sharp divide between experts and the public
The report finds that when assessing AI's impact on jobs, 73% of US AI experts are positive, compared with only 23% of the general public, a gap of 50 percentage points. Similar divides appear on the economy and medical care. The report defines "experts" as researchers who took part in AI academic conferences in 2023 and 2024.

The "jagged frontier" produces divergent experiences
The analysis attributes the perception gap to users' very different firsthand experiences. Today's top AI models have made dramatic strides on structured tasks such as coding and math, and power users (developers and researchers who pay substantial monthly fees for professional coding and research tools) feel that leap directly. On more open-ended or everyday tasks (such as reading an analog clock), however, AI still makes basic mistakes. This uneven capability has been described as a "jagged frontier": models are extremely strong in some areas yet mediocre in others.

A highly concentrated hardware supply chain poses risks
The report also warns of a structural vulnerability in the AI industry: almost all of the world's leading AI chips are fabricated by a single company, TSMC, leaving the entire AI hardware supply chain dependent on one production base.

Conclusion: two realities coexist
Industry observers note that AI development presents a dual reality: on one hand, its progress in specialized domains far exceeds what many people realize; on the other, it still falls short in many everyday use cases. This split is likely to keep shaping public debate and industry decisions, and any judgment about AI's future needs to account for both sides.


Original English text:

Why opinion on AI is so divided
AI power users are pulling away from everyone else.
This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.
In an industry that doesn’t stand still, Stanford’s AI Index, an annual roundup of key results and trends, is a chance to take a breath. (It’s a marathon, not a sprint, after all.)
This year’s report, which dropped today, is full of striking stats. A lot of the value comes from having numbers to back up gut feelings you might already have, such as the sense that the US is gunning harder for AI than everyone else: It hosts 5,427 data centers (and counting). That’s more than 10 times as many as any other country.
There’s also a reminder that the hardware supply chain the AI industry relies on has some major choke points. Here’s perhaps the most remarkable fact: “A single company, TSMC, fabricates almost every leading AI chip, making the global AI hardware supply chain dependent on one foundry in Taiwan.” One foundry! That’s just wild.
But the main takeaway I have from the 2026 AI Index is that the state of AI right now is shot through with inconsistencies. As my colleague Michelle Kim put it today in her piece about the report: “If you’re following AI news, you’re probably getting whiplash. AI is a gold rush. AI is a bubble. AI is taking your job. AI can’t even read a clock.” (The Stanford report notes that Google DeepMind’s top reasoning model, Gemini Deep Think, scored a gold medal in the International Math Olympiad but is unable to read analog clocks half the time.)
Michelle does a great job covering the report’s highlights. But I wanted to dwell on a question that I can’t shake. Why is it so hard to know exactly what’s going on in AI right now?
The widest gap seems to be between experts and non-experts. “AI experts and the general public view the technology’s trajectory very differently,” the authors of the AI Index write. “Assessing AI’s impact on jobs, 73% of U.S. experts are positive, compared with only 23% of the public, a 50 percentage point gap. Similar divides emerge with respect to the economy and medical care.”
That’s a huge gap. What’s going on? What do experts know that the public doesn’t? (“Experts” here means US-based researchers who took part in AI conferences in 2023 and 2024.)
I suspect part of what’s going on is that experts and non-experts base their views on very different experiences. “The degree to which you are awed by AI is perfectly correlated with how much you use AI to code,” a software developer posted on X the other day. Maybe that’s tongue-in-cheek, but there’s definitely something to it.
The latest models from the top labs are now better than ever at producing code. Because technical tasks like coding have right or wrong results, it is easier to train models to do them, compared with tasks that are more open-ended. What’s more, models that can code are proving to be profitable, so model makers are throwing resources at improving them.
This means that people who use those tools for coding or other technical work are experiencing this technology at its best. Outside of those use cases, you get more of a mixed bag. LLMs still make dumb mistakes. This phenomenon has become known as the “jagged frontier”: Models are very good at doing some things and less good at others.
The influential AI researcher Andrej Karpathy also had some thoughts. “Judging by my [timeline] there is a growing gap in understanding of AI capability,” he wrote in reply to that X post. He noted that power users (read: people who use LLMs for coding, math, or research) not only keep up to date with the latest models but will often pay $200 a month for the best versions. “The recent improvements in these domains as of this year have been nothing short of staggering,” he continued.
Because LLMs are still improving fast, someone who pays to use Claude Code will in effect be using a different technology from someone who tried using the free version of Claude to plan a wedding six months ago. Those two groups are speaking past each other.
Where does that leave us? I think there are two realities. Yes, AI is far better than a lot of people realize. And yes, it is still pretty bad at a lot of stuff that a lot of people care about (and it may stay that way). Anyone making bets about the future on either side should bear that in mind.
Deep Dive
Artificial intelligence
OpenAI is throwing everything into building a fully automated researcher
An exclusive conversation with OpenAI’s chief scientist, Jakub Pachocki, about his firm's new grand challenge and the future of AI.
How Pokémon Go is giving delivery robots an inch-perfect view of the world
Exclusive: Niantic's AI spinout is training a new world model using 30 billion images of urban landmarks crowdsourced from players.
This startup wants to change how mathematicians do math
Axiom Math is giving away a powerful new AI tool. But it remains to be seen if it speeds up research as much as the company hopes.
AI benchmarks are broken. Here’s what we need instead.
One-off tests don’t measure AI’s true impact. We’re better off shifting to more human-centered, context-specific methods.
