DeepSeek新模型V4重要的三个原因

qimuai 发布 · 一手编译

内容来源:https://www.technologyreview.com/2026/04/24/1136422/why-deepseeks-v4-matters/

内容总结:

中国AI公司DeepSeek发布新一代V4模型,三大看点引关注

中国人工智能初创公司DeepSeek于4月18日正式发布了备受期待的新一代旗舰模型V4预览版。这是继2025年1月凭借高性能与低成本震惊全球AI界的R1模型之后,DeepSeek最重磅的一次发布。尽管V4可能无法重现R1的轰动效应,但其在开源性能、内存效率以及芯片自主化三大领域的突破,意义深远。

看点一:开源模型性能再攀高峰,成本仅为美国同行的零头

V4延续了DeepSeek的开源传统,分为两个版本:专注于编程与复杂任务的V4-Pro,以及更快速、更经济的V4-Flash。DeepSeek声称,V4-Pro在关键基准测试中,性能可与Anthropic的Claude-Opus-4.6、OpenAI的GPT-5.4以及谷歌的Gemini-3.1等顶级闭源模型相媲美。在数学、编程和STEM问题上,V4更是全面超越了阿里通义千问Qwen-3.5等开源模型。更具颠覆性的是其定价:V4-Pro每百万输入token仅收费1.74美元,输出token收费3.48美元,仅为美国同类模型的几分之一;而V4-Flash更是低至约0.14美元和0.28美元,成为性价比最高的顶级模型之一。公司对85名资深开发者的内部调查显示,超过90%的受访者将V4-Pro列为编程任务的首选模型之一。

看点二:创新记忆压缩技术,长文本处理效率惊人

V4的另一大核心技术突破是其超长上下文窗口——可一次性处理高达100万token,相当于《指环王》三部曲与《霍比特人》的总篇幅。关键在于技术路线:DeepSeek并非简单堆料,而是对模型的注意力机制进行了根本性创新。通过选择性压缩旧信息并聚焦当前相关重点,V4在处理百万token长文本时,其计算量仅为上一代V3.2的27%,内存消耗降至10%。V4-Flash更是将这两项指标分别压缩到10%和7%。这一突破将极大降低需要处理海量资料的AI工具(如全量代码库阅读助手或长篇文档分析代理)的使用成本。

看点三:首次适配国产芯片,拉开脱离英伟达的序幕

V4是DeepSeek首款针对华为昇腾等国产芯片进行优化的模型。据报道,DeepSeek并未向英伟达、AMD等美国芯片制造商提供V4的早期访问权限,而是仅向中国芯片企业开放。华为已宣布其基于昇腾950系列的超级节点产品将支持V4。此举背后是美国自2022年以来对华芯片出口管制持续收紧的大背景,中国政府正全力推动从芯片到软件的国产AI技术栈建设。尽管目前V4可能仍主要依赖英伟达芯片进行训练,但在推理环节已成功适配国产芯片。DeepSeek甚至暗示,随着华为昇腾950超级节点在今年下半年大规模出货,V4-Pro的价格有望进一步显著下降。这标志着中国AI产业在摆脱对美国芯片依赖的道路上迈出了从“0到1”的关键一步。

中文翻译:

深度求索新模型V4备受瞩目的三大原因
这款期待已久的V4模型效率更高,也是中国芯片制造商的胜利。
上周五,中国人工智能公司深度求索发布了其备受期待的新旗舰模型V4的预览版。值得注意的是,得益于一项能更高效处理大量文本的新设计,该模型可处理的提示词长度远超上一代。与深度求索之前的模型一样,V4是开源的,这意味着任何人都可以下载、使用和修改。
V4是自2025年1月深度求索推出推理模型R1以来最重要的一次发布。R1在有限的计算资源上训练而成,却凭借其强大的性能和效率震惊了全球人工智能行业,几乎一夜之间将深度求索从一个名不见经传的研究团队转变为中国最知名的人工智能公司。它还协助引发了中国其他人工智能公司发布开源权重模型的浪潮。
自那以后,深度求索一直保持相对低调。但本月早些时候,它为在线版模型增加了“专家”和“快速”模式,外界随即猜测这些更新与即将到来的重大发布有关,这实际上等于提前预告了V4。
尽管该公司已成为中国人工智能雄心的有力象征,但其重返前沿模型领域却是在数月的审视之后——包括重要人员离职、此前模型发布的延迟,以及来自中美两国政府日益严格的审查。
那么,V4会像R1那样撼动人工智能领域吗?几乎可以肯定地说不会,但这次的发布之所以重要,有以下三大原因。
1. 它为开源模型开辟了新天地。
就像之前的R1一样,深度求索声称V4的性能与目前最好的模型不相上下,但价格却只是其零头。这对开发者和使用该技术的公司来说是个好消息,因为这意味着他们可以按照自己的方式获得前沿的人工智能能力,而无需担心成本飙升。
新模型有两个版本,均可在深度求索的网站和应用程序中使用,API接口也对开发者开放。V4-Pro是一个更大的模型,专为编程和复杂智能体任务而设计;而V4-Flash则是一个较小的版本,旨在运行速度更快、成本更低。两个版本都提供推理模式,该模式下模型可以仔细解析用户的提示,并在处理问题时逐步显示每个步骤。
对于V4-Pro,深度求索的收费为每百万输入词元1.74美元、每百万输出词元3.48美元,仅为OpenAI和Anthropic同类模型成本的一小部分。V4-Flash更便宜,每百万输入词元约0.14美元、每百万输出词元约0.28美元,是目前可用的最便宜的顶级模型之一。这使其成为构建应用程序时极具吸引力的选择。
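根据上文给出的定价,可以用几行Python粗略估算一次调用的成本(单价来自文中数字;字典里的模型名仅为示意,并非官方标识):

```python
# 按文中公布的单价估算一次API调用的美元成本。
# 注意:键名 "v4-pro" / "v4-flash" 只是本示例自拟的标签。
PRICES_USD_PER_M_TOKENS = {
    "v4-pro":   {"input": 1.74, "output": 3.48},
    "v4-flash": {"input": 0.14, "output": 0.28},
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """按每百万词元单价,估算一次请求的美元成本。"""
    p = PRICES_USD_PER_M_TOKENS[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# 示例:读入10万词元的材料、生成5000词元的回答
print(f"V4-Pro:   ${estimate_cost('v4-pro', 100_000, 5_000):.4f}")
print(f"V4-Flash: ${estimate_cost('v4-flash', 100_000, 5_000):.4f}")
```

按这组单价,读入10万词元、输出5000词元的一次调用,V4-Pro约为0.19美元,V4-Flash约为0.015美元,直观体现了文中“零头”价格的含义。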
在性能方面,V4相比R1有了巨大的飞跃,这或许并不令人意外——而且它似乎是几乎所有最新大型人工智能模型的强劲替代品。根据该公司分享的结果,在主要基准测试中,深度求索V4-Pro可与领先的闭源模型竞争,其性能与Anthropic的Claude-Opus-4.6、OpenAI的GPT-5.4和谷歌的Gemini-3.1相当。与其他开源模型(如阿里巴巴的Qwen-3.5或Z.ai的GLM-5.1)相比,深度求索V4在编程、数学和STEM问题方面均超越了它们,使其成为有史以来发布的最强大的开源模型之一。
深度求索还表示,V4-Pro目前在针对智能体编程任务的基准测试中跻身最强开源模型之列,并且在其他衡量解决多步骤问题能力的测试中表现良好。根据该公司分享的基准测试结果,其写作能力和世界知识也处于领先地位。
在随模型发布的技术报告中,深度求索分享了针对85名经验丰富的开发者的内部调查结果:超过90%的开发者将V4-Pro列为编程任务的首选模型之一。
深度求索表示,它特别针对流行的智能体框架(如Claude Code、OpenClaw和CodeBuddy)优化了V4。
2. 它实现了一种新的内存效率方法。
V4的关键创新之一是其长上下文窗口——模型可以一次性处理的文本量。两个版本都可以处理100万个词元,这足以容纳《指环王》全部三卷与《霍比特人》的总和。该公司表示,这个上下文窗口大小现在是所有深度求索服务的默认设置,与Gemini和Claude等模型的尖端版本提供的功能相匹配。
但重要的不仅仅是知道深度求索实现了这一飞跃,而是了解它是如何做到的。V4对其以前的模型进行了重大的架构更改——尤其是在注意力机制方面,这是人工智能模型的一个特性,有助于模型理解提示中每个部分与其余部分的关系。随着提示文本变长,这些比较的成本变得非常高,使得注意力成为长上下文模型的主要瓶颈之一。
深度求索的创新在于让模型对需要关注的内容更具选择性。V4没有将之前的所有文本都视为同等重要,而是压缩旧信息并专注于当前时刻最可能相关的部分,同时仍然完整保留附近文本,以免错过重要细节。
深度求索表示,这大大降低了使用长上下文的成本。在100万词元的上下文中,V4-Pro仅使用其先前模型V3.2所需计算能力的27%,同时将内存使用量削减至10%。V4-Flash的减少幅度更大,仅使用10%的计算能力和7%的内存。在实践中,这可能会降低构建需要处理大量材料的工具的成本,例如能够阅读整个代码库的人工智能编程助手,或能够分析长文档档案而不会不断忘记之前内容的研究智能体。
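“压缩旧信息、完整保留近处文本”的思路可以用一段极简的Python示意(仅为概念演示,并非深度求索的实际实现;窗口大小和压缩块大小均为随意取值):

```python
# 概念示意:保留最近 window 个词元原样不动,
# 更早的词元每 block 个合并为一个(此处用取均值代表“压缩”)。
from typing import List

def compress_context(tokens: List[float], window: int, block: int) -> List[float]:
    """返回压缩后的序列:压缩过的旧词元 + 完整保留的近处词元。"""
    recent = tokens[-window:]          # 近处文本完整保留
    older = tokens[:-window]           # 更早的内容参与压缩
    compressed = [
        sum(older[i:i + block]) / len(older[i:i + block])
        for i in range(0, len(older), block)
    ]
    return compressed + recent

seq = [float(i) for i in range(1000)]          # 模拟1000个词元的上下文
kept = compress_context(seq, window=128, block=16)
# 注意力开销与序列长度的平方大致成正比,可据此比较压缩前后的相对成本
print(len(kept), f"{(len(kept) / len(seq)) ** 2:.3f}")
```

在这个玩具例子里,1000个词元被压到183个,平方级的注意力开销相应降到约3%。真实模型中的压缩对象是注意力键值缓存而非原始词元,但“只对旧内容做有损压缩”的取舍方式是一致的。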
深度求索对长上下文窗口的兴趣并非始于V4。在过去一年半的时间里,该公司悄然发表了关于人工智能模型如何“记住”信息的系列论文,尝试使用压缩和数学技术来扩展人工智能模型能够实际处理的内容。
3. 它标志着摆脱英伟达的艰难道路迈出了第一步。
V4是深度求索第一款针对华为昇腾等中国国产芯片进行优化的模型——此举使得这次发布某种程度上成为对中国本土人工智能产业能否开始放松对美国芯片巨头英伟达依赖的一次检验。
这在很大程度上是意料之中的,因为科技媒体The Information本月早些时候报道,深度求索并未给予英伟达和AMD等美国芯片制造商对V4的早期访问权限,尽管这种预发布访问十分常见,以便芯片制造商在发布前优化对新模型的支持。据报道,该公司只向国产芯片制造商开放了早期访问权限。
上周五,华为表示其基于昇腾950系列的昇腾超节点产品将支持深度求索V4。这意味着希望运行自己修改版深度求索V4的公司和个人将能够轻松使用华为芯片。
路透社此前报道称,中国政府官员建议深度求索在其训练过程中集成华为芯片。这种压力符合中国产业政策的一个更广泛模式:战略领域通常会被推动,有时甚至被有效要求,以实现国家自给自足的目标。但在人工智能方面情况尤为紧迫。自2022年以来,美国的出口管制切断了中国公司获取英伟达最强大芯片的途径,随后还限制了对中国市场降级版本的访问。北京的回应是加速推动构建本土的人工智能技术栈,从芯片到软件框架再到数据中心。
据报道,中国当局一直在推动数据中心和公共计算项目使用更多国产芯片,手段包括禁止使用外国制造的芯片、设定采购配额,以及要求将英伟达芯片与华为、寒武纪等公司的国产替代品搭配使用。
尽管如此,取代英伟达并非像用一种芯片替换另一种芯片那么简单。英伟达的优势不仅在于其芯片,还在于开发者多年来围绕这些芯片构建的软件生态系统。转向华为的昇腾芯片意味着调整模型代码、重建工具,并证明围绕这些芯片构建的系统足够稳定,可以用于严肃用途。
需要明确的是,深度求索似乎并未完全脱离英伟达。该公司的技术报告显示,它使用中国芯片来运行模型进行推理,即响应用户请求、让模型完成任务的环节。但清华大学计算机科学教授刘知远告诉《麻省理工科技评论》,深度求索似乎只将V4的部分训练过程适配到了中国芯片上。该报告并未说明一些关键的长上下文功能是否已适配国产芯片,因此刘知远表示,V4可能仍主要是在英伟达芯片上训练的。由于这些问题的政治敏感性,多位不愿具名的消息人士告诉《麻省理工科技评论》,中国芯片的性能仍不及英伟达芯片,但相比训练更适合用于推理。
深度求索还将V4未来的成本与这种硬件转变挂钩。该公司表示,在华为昇腾950超节点于今年下半年开始大规模出货后,V4-Pro的价格可能会大幅下降。
如果成功,V4可能是中国正在成功构建平行人工智能基础设施的一个早期信号。

英文来源:

Three reasons why DeepSeek’s new model V4 matters
The long-awaited V4 model is more efficient and a win for Chinese chipmakers.
On Friday, Chinese AI firm DeepSeek released a preview of V4, its long-awaited new flagship model. Notably, the model can process much longer prompts than its last generation, thanks to a new design that helps it handle large amounts of text more efficiently. Like DeepSeek’s previous models, V4 is open source, meaning it is available for anyone to download, use, and modify.
V4 marks DeepSeek’s most significant release since R1, the reasoning model it launched in January 2025. R1, which was trained on limited computing resources, stunned the global AI industry with its strong performance and efficiency, turning DeepSeek from a little-known research team into China’s best-known AI company almost overnight. It also helped set off a wave of open-weight model releases from other Chinese AI firms.
DeepSeek has kept a relatively low profile since then—but earlier this month, it effectively teased V4’s release when it added “expert” and “flash” modes to the online version of its model, prompting speculation that the updates were tied to a bigger upcoming release.
While the company has become a powerful symbol of China’s AI ambitions, its big return to cutting-edge frontier models comes after months of scrutiny—including major personnel departures, delays to previous model launches, and growing scrutiny from both the US and Chinese governments.
So, will V4 shake the AI field the way R1 did? Almost certainly not, but here are three big reasons why this release matters.

  1. It breaks new ground for an open-source model.
    As with R1 before it, DeepSeek claims that V4’s performance rivals the best models available at a fraction of the price. This is great news for developers and for companies using the tech, because it means they can access frontier AI capabilities on their own terms, and without worrying about skyrocketing costs.
    The new model comes in two versions, both of which are available on DeepSeek’s website and in its app, with API access also open to developers. V4-Pro is a larger model built for coding and complex agent tasks, and V4-Flash is a smaller version designed to be faster and cheaper to run. Both versions offer reasoning modes, in which the model can carefully parse a user’s prompt and show each step as it works through the problem.
    For V4-Pro, DeepSeek charges $1.74 per million input tokens and $3.48 per million output tokens, a fraction of the cost of comparable models from OpenAI and Anthropic. V4-Flash is even cheaper, at about $0.14 per million input tokens and about $0.28 per million output tokens, making it one of the cheapest top-tier models available. This would make it a very appealing model to build applications on.
    In terms of performance, V4 is, perhaps unsurprisingly, a huge jump from R1—and it seems to be a strong alternative to just about all the latest big AI models. On the major benchmarks, according to results shared by the company, DeepSeek V4-Pro competes with leading closed-source models, matching the performance of Anthropic’s Claude-Opus-4.6, OpenAI’s GPT-5.4, and Google’s Gemini-3.1. And compared to other open-source models, such as Alibaba’s Qwen-3.5 or Z.ai’s GLM-5.1, DeepSeek V4 exceeds them all on coding, math, and STEM problems, making it one of the strongest open-source models ever released.
    DeepSeek also says that V4-Pro now ranks among the strongest open-source models on benchmarks for agentic coding tasks and performs well on other tests that measure ability to carry out multistep problems. Its writing ability and world knowledge also lead the field, according to benchmarking results shared by the company.
    In a technical report released alongside the model, DeepSeek shared results from an internal survey of 85 experienced developers: More than 90% included V4-Pro among their top model choices for coding tasks.
    DeepSeek says it has specifically optimized V4 for popular agent frameworks such as Claude Code, OpenClaw, and CodeBuddy.
  2. It delivers on a new approach to memory efficiency.
    One of the key innovations of V4 is its long context window—the amount of text the model can process at once. Both versions can handle 1 million tokens, which is large enough to fit all three volumes of The Lord of the Rings and The Hobbit combined. The company says this context window size is now the default across all DeepSeek services and it matches what is offered by cutting-edge versions of models like Gemini and Claude.
    But it’s important to know not just that DeepSeek has made this leap, but how it did so. V4 makes significant architectural changes to the company’s former models—especially in the attention mechanism, which is the feature of AI models that helps them understand each part of a prompt in relation to the rest. As the prompt text gets longer, these comparisons become much more costly, making attention one of the main bottlenecks for long-context models.
    DeepSeek’s innovation was to make the model more selective about what it pays attention to. Instead of treating all earlier text as equally important, V4 compresses older information and focuses on the parts most likely to matter in the present moment, while still keeping nearby text in full so it does not miss important details.
    DeepSeek says this sharply reduces the cost of using long context. In a 1-million-token context, V4-Pro uses only 27% of the computing power required by its previous model, V3.2, while cutting memory use to 10%. The reduction in V4-Flash is even larger, using just 10% of the computing power and 7% of the memory. In practice, this could make it cheaper to build tools that need to work across huge amounts of material, such as an AI coding assistant that can read an entire codebase or a research agent that can analyze a long archive of documents without constantly forgetting what came before.
    DeepSeek’s interest in long context windows didn’t start with V4. Over the past year and a half, the company has quietly published a series of papers on how AI models “remember” information, experimenting with compression and mathematical techniques to extend what AI models could realistically handle.
  3. It marks the first steps on the hard road away from Nvidia.
    V4 is DeepSeek’s first model optimized for domestic Chinese chips, such as Huawei’s Ascend—a move that has turned the launch into something of a test of whether China’s homegrown AI industry can begin to loosen its dependence on US chip giant Nvidia.
    This was largely expected, since The Information reported earlier this month that DeepSeek did not give American chipmakers like Nvidia and AMD early access to V4, though prerelease access is common to allow chipmakers to optimize support of the new model ahead of a launch. Instead, the company reportedly gave early access only to Chinese chipmakers.
    On Friday, Huawei said its Ascend supernode products, based on the Ascend 950 series, would support DeepSeek V4. This means that companies and individuals who want to run their own modified version of DeepSeek V4 will be able to use Huawei chips easily.
    Reuters previously reported that Chinese government officials recommended that DeepSeek integrate Huawei chips in its training process. And this pressure fits a broader pattern in China’s industrial policy: Strategic sectors are often pushed, and sometimes effectively required, to align with national self-reliance goals. But there’s a particular urgency when it comes to AI. Since 2022, US export controls have cut Chinese firms off from Nvidia’s most powerful chips, and they later also restricted access to downgraded China-market versions. Beijing’s response has been to accelerate the push for a domestic AI stack, from chips to software frameworks to data centers.
    Chinese authorities have reportedly been pushing data centers and public computing projects to use more domestic chips, including through reported bans on foreign-made chips, sourcing quotas, and requirements to pair Nvidia chips with Chinese alternatives from companies such as Huawei and Cambricon.
    Still, replacing Nvidia is not as simple as swapping one chip for another. Nvidia’s advantage lies not only in its chips, but in the software ecosystem developers have spent years building around them. Moving to Huawei’s Ascend chips means adapting model code, rebuilding tools, and proving that systems built around those chips are stable enough for serious use.
    To be clear, DeepSeek does not appear to have fully moved beyond Nvidia. The company’s technical report reveals that it is using Chinese chips to run the model for inference, or when someone asks the model to complete a task. But Liu Zhiyuan, a computer science professor at Tsinghua University, told MIT Technology Review that DeepSeek appears to have adapted only part of V4’s training process for Chinese chips. The report does not say whether some key long-context features were adapted to domestic chips, so Liu says V4 may still have been trained mainly on Nvidia chips. Multiple sources who spoke on the condition of anonymity, due to political sensitivity around these issues, told MIT Technology Review that Chinese chips still don’t perform as well as Nvidia chips but are better suited for inference than training.
    DeepSeek is also tying the future costs of V4 to this hardware shift. The company says V4-Pro prices could fall significantly after Huawei’s Ascend 950 supernodes begin shipping at scale in the second half of this year.
    If that works, V4 could be an early sign that China is successfully building a parallel AI infrastructure.

MIT科技评论
