How Project Maven Taught the Military to Love AI

Published by qimuai · First-hand translation


Source: https://www.theverge.com/ai-artificial-intelligence/917996/project-maven-military-ai-katrina-manson

Summary:

Headline: US military AI accelerates the pace of war: more than 1,000 targets struck within 24 hours, while a school tragedy in Iran sounds the alarm

According to new disclosures, the US military struck more than 1,000 targets in the first 24 hours of its assault on Iran, nearly double the scale of the "shock and awe" campaign against Iraq two decades ago. The key to this startling acceleration is the Maven Smart System, procured by the US military and NATO.

From "seeing 4 percent" to "striking 1,000"
The system grew out of 2017's Project Maven, which set out to use AI to analyze drone video at a time when the US military could analyze only about 4 percent of its reconnaissance data. Today, Maven fuses satellite imagery, radar, social media, and dozens of other data sources, compressing a "kill chain" process that once took hours into seconds. US officials say the system has raised the daily strike count from under 100 targets to 1,000, and that with the addition of large language models it could in theory reach 5,000.

The Iranian school tragedy: a stale database made lethal by AI speed
The cost of this technological sprint is stark. On the first day of the war with Iran, a building once flagged as an Iranian naval base, but since converted into a girls' school, was struck, killing more than 150 people, most of them children. Although post-strike coverage focused on possible AI "hallucinations," technology historian Kevin Baker argued in The Guardian: "A chatbot did not kill those children. People failed to update a database, and other people built a system fast enough to make that failure lethal." Satellite imagery clearly showed a playground at the site, and the building was listed online as a school.

The "white dots" vision and the Silicon Valley controversy
Colonel Drew Cukor, the Marine intelligence officer who drove the project, originally envisioned a "white dots" system: intelligence (coordinates, what a building is, elevation, and so on) overlaid seamlessly on a map so frontline troops could grasp the battlefield at a glance. But the project has been controversial since 2017: Google withdrew after mass employee protests, Palantir ultimately took over, and the system now draws on technology from Microsoft, Amazon, Anthropic, and others. Military ethicists warn that over-reliance on AI risks "gamifying" war, with operators inclined to blindly trust on-screen strike recommendations while overlooking the uncertainty in the underlying data.

Ukraine as testing ground, with "targets" deliberately relabeled
Ukraine marked a key turning point for Maven. In 2022, the US 18th Airborne Corps, operating from a base in Germany, used the system to analyze Russian positions. At first the algorithms could not recognize tanks in snow, so the US rushed to collect new imagery and retrain the models. The US then passed the Ukrainians as many as 267 "points of interest" in a single day, deliberately avoiding the word "target" so as not to be seen as a direct participant in the war. Ukrainian forces used the data to locate Russian equipment and personnel, accelerating their strikes.

Autonomous weapons draw near: humans reduced to "pressing the button"
Although the US military insists that a human still makes the final decision, Maven has cut the six steps that traditionally required human involvement down to just two. More alarming, Manson reveals in her new book that the military is developing fully autonomous weapons, including an explosive-laden drone Jet Ski that can find and destroy targets on its own. As former Defense Secretary Mattis warns, targeting is no substitute for strategy; hitting a lot of targets does not necessarily lead to victory.

The fatal risk: system speed outpacing error correction
Recall the 1999 US bombing of the Chinese embassy in Belgrade. Today's technology could amplify that kind of risk: if the targeting database contains an error (such as an outdated label), an AI system will generate strike recommendations from the bad data even faster, and humans will have even less chance to catch the mistake within the accelerated process. As Baker argues, the time once available for deliberation and for noticing contradictory intelligence has been consumed by the algorithm.


Original English article:

In the first 24 hours of the assault on Iran, the US military struck more than 1,000 targets, nearly double the scale of the “shock and awe” attack on Iraq over two decades ago. This acceleration was made possible by AI systems that speed up the targeting process. Chief among them is the Maven Smart System.
How Project Maven taught the military to love AI
A new book shows how the controversial Silicon Valley partnership has accelerated the pace of war
In her new book, Project Maven: A Marine Colonel, His Team, and the Dawn of AI Warfare, journalist Katrina Manson investigates the development of Maven from its inception in 2017 as an experiment in applying computer vision to drone footage. The project spurred employee protests at Google, the military’s initial contractor, prompting the company to back out. Pushed forward by a Marine intelligence officer named Drew Cukor, whose story forms the backbone of Project Maven, the system ended up being built by Palantir and draws on technologies developed by Microsoft, Amazon, Anthropic, and others. Now used across the US armed forces and recently purchased by NATO, Maven synthesizes satellite imagery, radar, social media, and dozens of other data sources to identify and target entities on the battlefield. It also speeds up what’s called the “kill chain.”
Maven combines computer vision with a sort of workflow management system that finds targets, pairs them with weapons, and allows users to quickly click through the other steps of a targeting cycle. A process that once took hours can now be completed in seconds. An official tells Manson that the technology has allowed the US to go from hitting under a hundred targets a day to a thousand, and with the addition of LLMs, up to five thousand targets a day.
One of the thousand targets struck on the first day of the Iran war was a girls’ school, killing more than 150 people, mostly children. The school had previously been part of an Iranian naval base, yet it was listed online as a school and playgrounds were visible on satellite imagery. While much of the coverage after the strike focused on possible hallucinations by Claude, the technology historian Kevin Baker wrote in The Guardian that Maven and the acceleration it enabled is the more relevant place to look. “A chatbot did not kill those children,” he wrote. “People failed to update a database, and other people built a system fast enough to make that failure lethal.”
The pace of war is set to accelerate further. Manson uncovers military programs to develop fully autonomous weapons — including an explosive-laden drone Jet Ski — capable of finding and destroying targets on their own.
I spoke to Manson about Maven and how AI is changing warfare.
This interview has been condensed and edited for clarity.
Colonel Cukor was an early and determined proponent of AI. Can you say a bit about him and what his initial motivations were?
He is chief of Project Maven, so he was the day-to-day doer and leader, but he also had this very long-term vision, which comes from his frustration that US military operators in Afghanistan were equipped with very poor intelligence tools. There was this idea that the US essentially fought that war 40 times over, every six months, because information wasn’t being handed over [when troops rotated in]. He was frustrated that data was in Excel and PowerPoint and he wanted an analytic tool that would bring intelligence to the frontline military operators. But he also had this vision for what he called “white dots” — that there would be white dots shown on a map infused with intelligence information, like a coordinate, what is there, the elevation, what is known about it. And this becomes one of the driving forces of what he tries to create through Project Maven.
How was Maven initially conceived in the military, was it as this interface and information management system?
It comes out of this project called Project Maven that starts in 2017. The actual project already existed and had already got a funding stream. It was to use AI against satellite imagery, but then it got repurposed for drone video imagery. This is because the US is thinking about how to develop AI technologies for any potential conflict with China. They had this idea that eventually war would run faster than humans could think, so they wanted to bring AI into this. The initial idea proposed by Colonel Cukor is to apply AI to drone video footage. They were sometimes managing to analyze as little as 4 percent of the collection, so they wanted AI essentially to take the place of human eyes in analyzing what was there, but it was always bigger.
The public first heard about Maven with the Google protests in 2018, and I remember Google at the time saying that this technology would not be used to kill people. But it sounds like targeting was always the intention?
A spokesperson from Google at the time said that flagging images for review on the drone feed with the help of AI was intended to save lives and was for non-offensive uses only. That is not what my reporting shows. My reporting shows that many of the US military operators were motivated by the aim to save US lives and reduce civilian harm, so in that sense, it is “not offensive” because you’re analyzing intelligence information. But in the wider sense and very quickly, in the very real sense, AI target selection was intended for targeting.
I asked someone in the book if targeting offensive weapon strikes were intended to be part of Project Maven, and he replied, “yeah, of course, it’s not like we’re doing it for kicks. The goal of the intel is to take out high-value targets.”
When the Google deal falls apart, that’s when Palantir steps in. Can you tell me about Palantir’s role in the project?
Two things happen. Microsoft and AWS [Amazon Web Services] take a much bigger role in producing the algorithms and also in the compute, and alongside that, Cukor goes to Palantir and says, “Can you help?” He’s pitching this idea of the white dots on a screen. He has this 10-year vision for how the US military will remake themselves, and they’ve been trying out algorithms, which at that stage are not very good at identifying anything, and are also having to sit in systems that aren’t fit for purpose. They had a lot of problems with users not believing in AI and finding the displays very distracting. So he wants a user interface that will please the user.
So he pitches to Palantir that they create a user interface, which actually Palantir doesn’t want to do. I’m told they didn’t believe that AI was going to take off, and they also didn’t want to just make a fancy user interface. They wanted to crunch the data. But that wasn’t initially what Cukor was pitching them and he was very persuasive. He also wanted them to be less arrogant, and he ends up counseling them on how to attempt to remake their reputation inside the Department of Defense and to get these contracts, which initially, I don’t think are worth much money. But today, nearly 10 years later, I’ve reported that Maven Smart System is going to become by the end of September a “program of record” and Palantir is the prime contractor, so in the end, it’s going to be lucrative for them.
Ukraine sounded like a pretty big inflection point in the development of these systems. What happened there?
This becomes a really important moment where the artillery fire team realizes that AI can help them speed up their operations and targeting. It becomes much more explicit that intelligence is going to feed into operations. When the US is supporting Ukraine, even before Russia’s invasion, the 18th Airborne Corps is over in Wiesbaden in Germany and very quickly they start to use computer vision on the Maven Smart System to figure out where the Russian positions are, where the tanks are, what is happening. The algorithms fail very quickly. The algorithms were used to the desert in the Middle East and in Afghanistan. The algorithms couldn’t recognize tanks and other features in the snow. They collect new satellite footage over the Russian tanks and other equipment and send them back to the US to retrain the algorithms really quickly, so they become much better at spotting tanks.
The US starts sending what they end up calling “points of interest” to the Ukrainians, who then use that to target Russian equipment and personnel. The language of “points of interest” is interesting because the US is trying to thread this needle to provide support to the Ukrainians without becoming seen in Russia’s eyes as a direct participant in the war. So they evolved this idea that a “target” is something that has gone through a process, and they are giving the Ukrainians everything just shy of that. I’m able to report that at the high point on one day in 2022, the US passes 267 points of interest to Ukraine.
What are the parts of the targeting process that are getting automated that cause that kind of acceleration?
The US military would say nothing is yet automated, because there is this extra stage of targeting, which is really key, which is the legal decision to strike something. In the case of why the kill chain is speeding up, what I’ve been told is that a lot of the processes involved in getting permission to strike a target have traditionally been extremely analog and slow, involving telephones and swivel chairs. So this is part of shifting this process onto digital platforms and then eventually getting to automate it.
The 18th Airborne Corps had humans at six key steps. So the human decides when and how to shoot at a target. They assess what’s called an operational approach. They assess the data collected, they decide to act, communicate the decision, execute the fire, and then communicate what happened. And then with the arrival of Maven’s AI, they reduced the human role in the loop to only two places: the decision to act and the action itself. They can supervise the machine making the decision during the automated collection process, but the assessments throughout would all be AI enabled. Even at the NGA [National Geospatial-Intelligence Agency], they are producing intelligence reports that no human eyes or hands have touched that are entirely AI generated. So there’s been this huge shift into really making data and the system king.
The other reason that they’re able to get to so many targets in a day is because the Maven Smart System is using large language models. I’ve reported [they’re using] Claude from Anthropic, and I was told it was helping speed up the processes. And Centcom [US Central Command] themselves said that with the help of AI, they were able to speed up processes that used to take days and hours down to as little as seconds. The commander, the US would say, is still making the decision. But I’ve also spoken to US military ethicists who say that there is a risk of the gamification of war, and that people may end up trusting the targets that they’re being offered on screen without understanding fully the data that’s supporting it.
Now, the pushback is that this is data that’s better tagged than ever been before, that this AI-based system, essentially being a database system, means that you can audit the data and go deep into it and also give headquarters a way of following what military operators at the edge are doing with much greater transparency and accountability than ever before. This enormous operation that the US has undertaken in Iran will ultimately be a case in point. And we’ll be looking for data and accountability about how the US has, in the end, used this platform.
There’s a technology scholar, Kevin Baker, who wrote a piece about how Claude got a lot of blame initially for the school strike in Iran. But he pointed to this longer-term acceleration and said that these steps may have cut into the time for deliberation or noticing errors or contradictory intelligence. I’m curious if there were concerns in the military that things were getting too fast?
There’s a really significant debate inside the US military about how far they should lean into this. Some are saying it’s inevitable, and others are really warning that that human assessment at the last minute is the thing that can save lives. And I don’t think that debate has been resolved, but the direction of travel is clear in that the Maven Smart System is becoming a program of record. That Central Command commander is taking time out of these operations to go on to X and say that they are using AI and that they’re finding it helpful. Then you have people like retired Defense Secretary Jim Mattis saying that targeting is no substitute for strategy, that hitting a lot of things, essentially, doesn’t get you to victory.
There’s one example that I keep going back to in my mind, which is in 1999, when the US strikes the Chinese Embassy in Belgrade. In the analysis that the US offers publicly afterwards, they say that the embassy was incorrectly labeled on a map. The embassy had moved recently; one map had been updated, others hadn’t. Someone even tried to make a call because they got worried and wanted to check, but they weren’t able to reach someone in time.
In an example like that, if your systems flag a problem and they’re digitally connected, on the one hand, it could be much easier to raise anomalies, problems, risks of mistake. On the other, the target selection from what could be an erroneous targeting database could be made even quicker without those checks. So the decision that the US military makes about leaning into AI on the targeting cycle will only be as good as the data that is feeding it.

