The Meta hack shows there's more to AI security than Mythos

Posted by qimuai on 2026-6-6 07:00 (first-hand compilation)

Source: https://www.technologyreview.com/2026/06/05/1138437/the-meta-hack-shows-theres-more-to-ai-security-than-mythos/

Summary:

Meta AI support agent flaw lets attackers steal Instagram accounts, exposing an AI security weak spot

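On June 5, 404 Media reported that attackers had used Meta's AI customer support agent to steal Instagram accounts, including the dormant Obama White House account. The method was extremely simple: attackers asked the agent to link an account to an email address they controlled, and the agent complied. The Obama account was used to make pro-Iran posts; other accounts with valuable single-word handles may have been taken in order to resell them.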

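The incident points to a different kind of AI security risk than the one that has dominated discussion since April, when Anthropic announced that its Mythos model was too good at hacking to be released to the general public: superpowered AI laying waste to computer infrastructure. Here, AI was the target rather than the attacker, and the method was far simpler than anything Mythos would devise. But as companies hand more work to AI, such unsophisticated attacks could cause serious damage of their own.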

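Neil Gong, a professor of electrical and computer engineering at Duke University, said that as AI is used more widely, especially to automate workflows such as account recovery, "attackers are going to be more and more motivated to attack AI itself." Gong and other scholars have long warned about security vulnerabilities in AI agents, including techniques such as indirect prompt injection. The Meta exploit required nothing that elaborate: attackers only needed a VPN matching the account owner's location, then asked the agent directly to change the email address.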

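Gong called the lapse "really surprising" and said he did not understand why Meta had not found such a simple problem before deployment. Jessica Ji, a senior research analyst at Georgetown's Center for Security and Emerging Technology, asked whether guardrails were in place at all and whether anyone had thought to test for this kind of scenario; she noted that the oversight is especially striking from a company with extensive expertise in both AI and cybersecurity. Meta has not publicly explained how the vulnerability slipped through, but a spokesperson said on X on Monday that it had been resolved.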

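Experts note that, unlike traditional software, AI agents can respond flexibly to new situations, but they can also be tricked in ways humans would not be. Somesh Jha, a professor of computer science at the University of Wisconsin–Madison, said a human would ask, "Okay, why do you want to change the email address?" and perhaps pose a security question, whereas agents are "very eager to finish the task. It's almost like some elementary school student who just wants to please the teacher."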

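To reduce the risk, companies can use traditional software to build guardrails that force agents to follow strict rules (for example, always requiring answers to security questions before sending sensitive account information to a new email address) and subject agents to rigorous red-teaming, in which developers attack their own system to find vulnerabilities before deployment. But security and capability pull against each other: the more power an agent has and the fewer guardrails it is subject to, the more work it can take on. "Security and utility always have a trade-off," said Bo Li, a professor of computer science at the University of Illinois Urbana-Champaign. Adequate red-teaming is also expensive, because attackers need to find only one exploit while defenders try to find and patch as many as they can.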

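More sophisticated models might flag suspicious requests, such as an attempt to change the email on the Obama White House account, and AI systems can themselves be used for red-teaming. Even so, experts expect the problem of securing AI agents to become more pressing. In a fast-moving industry, companies may see the time needed for careful security work as an unacceptable delay. "Everybody wants to be the first to do something and just push things out without careful scrutiny and red-teaming," Jha said. "I think it's a very dangerous thing."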

English source:

The Meta hack shows there’s more to AI security than Mythos
Some AI cybersecurity threats are incredibly simple. They’re still dangerous.
On June 5, 404 Media reported that attackers had been using Meta’s AI customer support agent to steal Instagram accounts. Their approach was simple: They asked the agent to link the accounts to email addresses that they controlled, and the agent complied. One attacker broke into the dormant Obama White House account and made pro-Iran posts; others took over accounts with valuable, single-word handles, possibly in order to sell them.
AI cybersecurity concerns are nothing new. Since Anthropic announced in April that its Mythos model was too good at hacking to be released to the general public, commentators, researchers, and federal officials alike have fixated on the idea that superpowered AI systems could lay waste to our computer infrastructure. That’s not quite what this Instagram hack was: There, AI was the target rather than the attacker, and the method was far simpler than anything Mythos would cook up. But as companies offload more work to AI, these comparatively unsophisticated attacks could wreak their own havoc.
“As AI becomes more and more widely used—especially when AI is more and more widely used to automate our work flows, like account recovery—I think attackers are going to be more and more motivated to attack AI itself,” says Neil Gong, a professor of electrical and computer engineering at Duke University.
Gong and other scholars have been issuing warnings about the security vulnerabilities of AI agents for a while. They publish papers and blog posts detailing exploits such as indirect prompt injection, which involves hijacking agents using commands hidden in websites, emails, or other seemingly anodyne data sources. Compared with these techniques, the Meta hack was practically mindless. The only complication that hackers had to overcome was using a VPN that matched the true account owner’s location; then they directly asked the support agent to change the account’s email address, and it complied.
Meta has not commented publicly on how this vulnerability slipped through the cracks. But given the simplicity of the exploit, Gong says, it should have been uncovered easily, before the agent was deployed. “It’s really surprising,” he says. “I don’t understand why they didn’t find this simple problem.”
Jessica Ji, a senior research analyst at Georgetown’s Center for Security and Emerging Technology, agrees. “It raises questions like: Were there even guardrails in place?” she says. “Did anyone think to test for this kind of scenario?” She notes that the oversight is particularly striking coming from a company like Meta, which has extensive expertise in both AI and cybersecurity. Meta did not respond to a request for comment for this article, but on Monday a Meta spokesperson said on X that the vulnerability had been resolved.
As embarrassing a moment as this might be for Meta in particular, it also highlights some core vulnerabilities shared by all AI agents. Unlike traditional software, agents can respond in flexible—and unexpected—ways to new circumstances, which is why they might be able to substitute for human customer support agents. But AI agents can also be tricked in ways that humans wouldn’t be, and because they can take real-world actions, those mistakes have consequences. “A human would say, ‘Okay, why do you want to change the email address?’ and maybe respond with a security question,” says Somesh Jha, a professor of computer science at the University of Wisconsin–Madison. “What is going on with these agents is they’re very eager to finish the task. It’s almost like some elementary school student who just wants to please the teacher.”
There are ways to mitigate the risks. Companies can use traditional software to build guardrails that make sure agents follow strict rules, such as always asking for answers to security questions before sending sensitive account information to a new email address. And the experts consulted for this article all agree that agents should undergo rigorous red-teaming, a process in which developers try their best to attack a system in order to discover its vulnerabilities before it is deployed.
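As a rough illustration of the guardrail idea described above, the following minimal sketch puts the identity check in ordinary code around the action an agent can trigger, so the rule holds regardless of what the model is persuaded to do. It is not Meta's implementation or any vendor's API; `Account`, `SupportSession`, `change_email`, and `VerificationRequired` are hypothetical names introduced only for this example, and the plaintext security answer is a simplification.

```python
import hmac
from dataclasses import dataclass


@dataclass
class Account:
    handle: str
    email: str
    # Assumption for the sketch: a real system would store a salted hash.
    security_answer: str


class VerificationRequired(Exception):
    """Raised when a sensitive action is attempted before identity is verified."""


@dataclass
class SupportSession:
    account: Account
    verified: bool = False

    def verify(self, answer: str) -> bool:
        # Deterministic check done by ordinary code, not by the language model.
        given = answer.strip().lower().encode()
        expected = self.account.security_answer.strip().lower().encode()
        self.verified = hmac.compare_digest(given, expected)
        return self.verified


def change_email(session: SupportSession, new_email: str) -> str:
    """Hypothetical tool exposed to the agent; the guardrail lives here."""
    if not session.verified:
        # The agent cannot talk its way past this branch.
        raise VerificationRequired("security question must be answered first")
    session.account.email = new_email
    return new_email
```

In this arrangement the agent may still decide to call `change_email`, but the call fails unless `verify` has already succeeded in the same session, which is the kind of strict rule the paragraph above has in mind.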
But there are also countervailing forces. Companies want to deploy capable agents, and the more power an agent has—and the fewer guardrails it is subject to—the more work it can potentially take on. “Security and utility always have a trade-off,” says Bo Li, a professor of computer science at the University of Illinois Urbana-Champaign. And adequate red-teaming can be expensive. Defenders have to expend more resources than attackers do, because attackers only need to discover a single exploit, while defenders try to discover and patch as many as they can. When attackers are working toward something as valuable as a single-word Instagram handle, they’ll pour resources into finding exploits, so defenders have to spend even more money to protect that prize.
As AI models continue to improve, hardening their defenses might actually get easier. Though the probabilistic nature of large language models means that LLM agents will always be vulnerable to some forms of attack, a more sophisticated model might have identified an attempt to change the email associated with the Obama White House account as suspicious. And AI systems can be used for agent red-teaming, much as participants in Anthropic’s Project Glasswing use Mythos to identify vulnerabilities in their software.
Still, experts expect that the problem of securing AI agents will only become more pressing in the future. As agents grow more capable, companies that adopt them may want to give them more power, both to provide more services with fewer humans and to avoid being left behind by their competitors. In the fast-moving world of AI, the time needed to carefully secure risky agentic systems might seem like an unconscionable delay.
“Everybody wants to be the first to do something and just push things out without careful scrutiny and red-teaming,” Jha says. “I think it’s a very dangerous thing.”

MIT Technology Review
