OpenAI and Broadcom Jointly Launch AI Inference Chip

Source: https://aibusiness.com/generative-ai/openai-broadcom-introduce-ai-inference-chip
Summary:
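OpenAI and Broadcom unveil “Jalapeño,” an in-house AI inference chip aimed at enterprise token costs
(Washington) On Wednesday local time, generative AI leader OpenAI and semiconductor giant Broadcom jointly unveiled “Jalapeño,” a new inference chip built for large language models (LLMs). It is OpenAI’s first in-house processor and a key step from model development toward a full-stack infrastructure. By lowering inference costs, it aims to give enterprise customers worried about rising token prices a new option.
As enterprise AI deployments accelerate, surging token consumption is pushing cost structures sharply higher. Gartner analyst Chirag Dekate said enterprises are seeing “explosive cost switches” tied to tokenomics. OpenAI designed Jalapeño from the start to work with all LLMs; its architecture reduces data movement and balances compute, memory and networking resources to achieve peak performance. OpenAI plans to release a detailed technical report in the coming months.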
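The move helps OpenAI regain the initiative in its race with main rival Anthropic, which had pulled ahead through a rapid cadence of model releases. OpenAI is no longer dependent on Nvidia alone (its partners also include Cerebras), and it now has the kind of low-level autonomy that Google has with TPUs, Amazon with Trainium and Alibaba with its Zhenwu M890. According to the analyst, if the model market turns into a price war, an in-house chip would give OpenAI leeway to adjust pricing without sacrificing profitability.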
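For Broadcom, the partnership is a way into the supply chain of a top model maker as a co-innovator. It reshapes the economics of converting watts to tokens and positions Broadcom to benefit from better efficiency at a different price point.
Jalapeño’s commercial prospects remain unclear, however. Because its technical details have not been released, how well it will integrate with other systems and whether it can be sustained across multiple chip generations are both unknown. Dekate cautioned that infrastructure is extremely hard and takes years of execution and scaling. He added that this is nonetheless the logical direction for frontier model providers and that he expects Anthropic to have something similar soon.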
English source:
The chip would give AI model makers the option to offer lower token prices, possibly easing the concerns of businesses worried about higher token costs.
OpenAI and Broadcom unveiled a new LLM-optimized inference chip on Wednesday, as the generative AI vendor seeks to reduce token costs for businesses and gain a competitive advantage over rival Anthropic.
The chip, named Jalapeño, is OpenAI’s first intelligence processor. It is the first step for the ChatGPT maker toward building a full stack behind its models and products. OpenAI designed the chip from the start to work with all LLMs. The vendor said it will release a detailed technical report on the chip in the next few months. For now, it has said only that the chip’s architecture reduces data movement and balances compute, memory and networking resources to achieve peak performance.
The chip is a strategic move for the AI vendor, which has been trying to diversify its infrastructure by partnering with Nvidia, Cerebras and now Broadcom. However, having its own chip now gives it options, as more businesses are conscious of inference costs and rising token prices. OpenAI, for the first time in a long while, also appears to have grabbed the spotlight from Anthropic, which has been releasing models and services rapidly and outpacing OpenAI.
“[Enterprises] are seeing explosive cost switches associated with tokenomics and token consumption,” said Chirag Dekate, an analyst at Gartner. “As the token consumption rises, their cost structures are increasing.”
With this deal with Broadcom, OpenAI is appealing to businesses looking for a way to ease the rising costs that come with the use of more tokens.
“This enables OpenAI to change the economics of converting watts to tokens, and that also translates to higher margin efficiencies and margin gains,” Dekate said. “If the model ecosystem turns into a price war, this gives OpenAI much more leeway and bandwidth to tweak and tune the pricing without affecting profitability.”
While OpenAI will use other underlying infrastructure from either Nvidia or Cerebras where it sees fit, having its own AI chip provides it with much of the flexibility that Google has with TPUs, AWS with its Trainium chips, and Alibaba with its Zhenwu M890 chip.
For Broadcom, the semiconductor designer gets to collaborate with a major model company and co-innovate with it.
“By Broadcom engaging in this, they’re able to change the economics, and they can position themselves to deliver better efficiency at a different price point,” Dekate said.
While Jalapeño appears promising, little is known about the chip, as its technical details have not yet been released. It is unclear how it will integrate with other systems. And it is unclear whether Broadcom and OpenAI will be able to make it a sustainable product capable of maturing through generations of chip design.
“The longevity of [custom-designed microchips] is hard to predict,” Dekate said. “Getting into infrastructure is really hard. You need to have a multiyear pipeline, multiyear, multi-generation view of executing and scaling.”
He added that with competitors Nvidia and AMD established as longtime chipmakers, OpenAI and Broadcom will need to find a way to sustain chip supply.
“It's too premature, but overall, this is a really good direction,” Dekate continued. “It is a logical direction for the model providers to take, especially frontier model providers like OpenAI and Anthropic. I would anticipate Anthropic will have something similar cooking up soon, too.”