“Dangerous” AI Models Are Coming No Matter What

Posted by qimuai · Views: 13 · First-hand compilation

Source: https://www.wired.com/story/dangerous-ai-models-are-coming-no-matter-what/

Summary:

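Late last week, the US AI company Anthropic took its newly launched Claude Fable 5 and Mythos 5 models offline in response to a US government export-control directive. The directive bars “any foreign national” from using the services. Anthropic has been negotiating with the White House since last Friday but has not yet reached an agreement that would let it restore service.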

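Since Mythos Preview was released in April, Anthropic has both claimed and warned that the model can not only find software vulnerabilities so defenders can patch them, but also work out attack techniques that malicious actors could exploit, which makes it a double-edged sword. In a blog post last week, the company wrote: “A great deal of advanced usage of AI models is dual use: the same queries that are beneficial in the hands of cybersecurity professionals and biology researchers could be dangerous if available to malicious actors.”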

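For that reason, Anthropic initially released Mythos Preview only to a working group known as Project Glasswing. Last week, Mythos 5 was likewise released privately to that group alone, while Claude Fable 5, a Mythos-grade model, was opened to the public with restrictions on answering questions about biology and cybersecurity.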

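At the end of last week, however, the Trump administration decided to restrict both models on the grounds that Fable 5's guardrails can be bypassed to unlock the full capabilities of Mythos 5, which it considers a national security risk.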

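Experts note, though, that this institutional clash merely delays or masks a fact that is hard to avoid: Anthropic may be the tip of the spear right now, but AI models from multiple companies and open-weight developers will almost certainly have capabilities similar to Mythos 5 in the near future, and may have them already.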

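Tarah Wheeler, chief security officer of the specialized cybersecurity consulting firm TPO Group, said: “It's myopic in the extreme to think that no other competitors to Anthropic will develop similar capabilities to Mythos or even that they have not already done so. There are other companies hot on Anthropic's heels who probably have the capabilities, too, and are holding them in reserve as they see how Anthropic is being treated in the current regulatory environment.”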

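Anthropic has stressed this point repeatedly since the launch of Mythos Preview. Logan Graham, head of its frontier red team, said in April: “The real message is that this is not about the model or Anthropic. We need to prepare now for a world where these capabilities are broadly available in 6, 12, 24 months.”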

OpenAI, for example, also privately released a cybersecurity-focused model in mid-April and announced an expanded cybersecurity strategy.

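Researchers point out that even before the current generation of models, existing AI could be used for advanced vulnerability hunting and exploit development when paired with a refined harness. On Sunday, a large group of cybersecurity leaders made this point to the administration in an open letter, arguing that the White House's export-control directive is misguided.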

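Bruce Schneier, a researcher at Harvard University and the University of Toronto, offered this analysis: “It's not one model; it's the general trend of technology. Smaller, cheaper, open-source models, sometimes by themselves and sometimes in concert with each other, can match Mythos/Fable's performance with more sophisticated prompting. And we should expect other models to match Mythos/Fable's creativity and tenaciousness within months, slightly longer for open-source models.”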

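Experts say that what the White House and governments around the world really need to focus on is democratically developing broader, more transparent plans for dealing with the inevitable advances in AI capabilities in cybersecurity and other sensitive areas.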

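Chris Wysopal, cofounder of the cloud security firm Veracode, said: “The policy question is not whether a technology has risk. The question is whether a specific restriction meaningfully reduces that risk or whether it mainly slows down the people trying to make systems safer.”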

Full translation:

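Late last week, Anthropic took its new Claude Fable 5 and Mythos 5 AI models offline after the US government issued an export-control directive barring “any foreign national” from using the services. The company has been in talks with the White House since last Friday but has not yet reached an agreement that would allow it to restore the services.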

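Since Mythos debuted in April, Anthropic has claimed, and also warned, that the model has advanced capabilities not only for finding software vulnerabilities so that defenders can patch them, but also for working out ways to exploit them that malicious actors could use. Anthropic itself pointed to this double-edged quality in its launch of Mythos 5 and Claude Fable 5. In a blog post last week, the company wrote: “A great deal of advanced usage of AI models is dual use: the same queries that are beneficial in the hands of cybersecurity professionals and biology researchers could be dangerous if available to malicious actors.”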

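With this in mind, the company initially released a version called Mythos Preview to a select consortium, as part of a working group known as Project Glasswing. Mythos 5 was also released privately to this group last week, while Claude Fable 5, a Mythos-grade model, was released to the general public with specific limits on its ability to answer questions about biology and cybersecurity.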

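Then, at the end of last week, the Trump administration moved to restrict both models because it believes Fable 5's guardrails can be disabled to allow full access to Mythos 5's capabilities, which allegedly makes it a national security risk.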

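Experts say, however, that this institutional clash is only postponing or masking a hard truth: Anthropic may be the tip of the spear at this moment, but AI capabilities in general, and models from multiple companies and open-weight developers, will almost certainly have capabilities similar to Mythos 5 in the near future, if they do not already.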

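“It's myopic in the extreme to think that no other competitors to Anthropic will develop similar capabilities to Mythos or even that they have not already done so,” says Tarah Wheeler, chief security officer of the specialized cybersecurity consulting firm TPO Group. “There are other companies hot on Anthropic's heels who probably have the capabilities, too, and are holding them in reserve as they see how Anthropic is being treated in the current regulatory environment.”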

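Anthropic itself has also emphasized this point since the launch of Mythos Preview. “The real message is that this is not about the model or Anthropic,” Logan Graham, the company's frontier red team lead, told WIRED when Mythos Preview launched in April. “We need to prepare now for a world where these capabilities are broadly available in 6, 12, 24 months.”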

OpenAI, for example, also privately released a cybersecurity-focused model in mid-April and announced an expanded cybersecurity strategy.

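Researchers note that even before this new generation of models, existing AI products could be used for advanced vulnerability hunting and exploit development with a refined harness. A large group of cybersecurity leaders stressed this to the administration in an open letter last Sunday, arguing that the White House's export-control directive is misguided.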

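“It's not one model; it's the general trend of technology,” says Bruce Schneier, a researcher at Harvard University and the University of Toronto who has been analyzing the situation. “Smaller, cheaper, open-source models, sometimes by themselves and sometimes in concert with each other, can match Mythos/Fable's performance with more sophisticated prompting. And we should expect other models to match Mythos/Fable's creativity and tenaciousness within months, slightly longer for open-source models.”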

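Experts say the White House and governments around the world need to focus on democratically developing much broader and more transparent plans for how they will handle the inevitable advances in AI capabilities in cybersecurity and other sensitive areas.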

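“The policy question is not whether a technology has risk,” says Chris Wysopal, cofounder of the cloud security firm Veracode. “The question is whether a specific restriction meaningfully reduces that risk or whether it mainly slows down the people trying to make systems safer.”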

English source:

Late last week, Anthropic took its new Claude Fable 5 and Mythos 5 AI models offline following a United States government export-control directive barring “any foreign national” from using the services. The company has been in talks with the White House since Friday but has yet to secure an agreement that would allow it to reinstate the offerings.
Since Mythos debuted in April, Anthropic has claimed, and warned, that the model has advanced capabilities for not only finding software vulnerabilities to help defenders patch them, but also figuring out ways to exploit them that could be used by bad actors. Anthropic itself noted this double-edged sword in its launch of Mythos 5 and Claude Fable 5. “A great deal of advanced usage of AI models is dual use: the same queries that are beneficial in the hands of cybersecurity professionals and biology researchers could be dangerous if available to malicious actors,” the company wrote in a blog post last week.
With this in mind, the company initially released a version called Mythos Preview to a select consortium as part of a working group known as Project Glasswing. Mythos 5 was also privately released to this group last week, while Claude Fable 5, which is a Mythos-grade model, was released to the general public with specific blocks on its ability to give responses to questions about biology and cybersecurity.
Then, at the end of last week, the Trump administration moved to restrict both models because it believes that Fable 5’s guardrails can be disabled to allow full access to the Mythos 5 capabilities, allegedly making it a national security risk.
Experts say, though, that this institutional clash is simply delaying or masking a hard truth: Anthropic may be the tip of the spear in this moment, but AI capabilities in general and models from multiple companies and open-weight developers will almost certainly have similar capabilities to Mythos 5 in the near future—if they don't already.
“It's myopic in the extreme to think that no other competitors to Anthropic will develop similar capabilities to Mythos or even that they have not already done so,” says Tarah Wheeler, chief security officer of the specialized cybersecurity consulting firm TPO Group. “There are other companies hot on Anthropic's heels who probably have the capabilities, too, and are holding them in reserve as they see how Anthropic is being treated in the current regulatory environment.”
Anthropic itself has emphasized this point since the launch of Mythos Preview. “The real message is that this is not about the model or Anthropic,” Logan Graham, the company's frontier red team lead, told WIRED when Mythos Preview launched in April. “We need to prepare now for a world where these capabilities are broadly available in 6, 12, 24 months.”
OpenAI, for example, also did a private release of a cybersecurity-focused model in mid-April and announced an expanded cybersecurity strategy.
Researchers note that even before this next generation of models, existing AI offerings could be used for advanced vulnerability-hunting and exploit development with a refined harness. A large group of cybersecurity leaders emphasized this to the administration in an open letter on Sunday, arguing that the White House's export-control directive was misguided.
“It's not one model; it's the general trend of technology,” says Bruce Schneier, a researcher at Harvard University and the University of Toronto who has been analyzing the situation. “Smaller, cheaper, open-source models, sometimes by themselves and sometimes in concert with each other, can match Mythos/Fable's performance with more sophisticated prompting. And we should expect other models to match Mythos/Fable's creativity and tenaciousness within months—slightly longer for open-source models.”
What the White House and governments around the world need to focus on, experts say, is democratically developing much broader and more transparent plans for how they will contend with advances in AI capabilities on cybersecurity and in other sensitive areas as they inevitably occur.
“The policy question is not whether a technology has risk,” says Chris Wysopal, cofounder of the cloud security firm Veracode. “The question is whether a specific restriction meaningfully reduces that risk or whether it mainly slows down the people trying to make systems safer.”
