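Vibe Coding Hardware

Summary:
"Vibe coding" upends hardware R&D: software engineers build the framework, hardware engineers write the code
At supersonic aircraft maker Boom Supersonic, an AI-driven shift in hardware development is under way. Founder Blake Scholl says traditional hardware engineering has long relied on siloed Excel spreadsheets and VBScript code with no source control or automated testing, and that handoffs between engineers still happen 1990s-style, by manually emailing spreadsheets.
That workflow is now being rewritten. Software engineers build the system architecture and algorithmic frameworks, and hardware engineers vibe-code the pieces in their own specialty. Scholl's example is turbine blade design: traditionally, one piece of the analysis for one blade, including conversion between cold and hot shapes and between structures and aerodynamics, took one engineer one day, and a jet engine has about a thousand blades. Now, with tooling built jointly by software and hardware people, an engineer can change the blade geometry and see structural and aerodynamic results in real time, and two engineers can design an entire jet engine.
Hardware collaboration tools move in-house
Vercel founder Guillermo Rauch calls this the biggest "cataclysm" in enterprise software: startups that build hardware collaboration tools no longer have anything to sell, because companies simply code what they need internally. Spreadsheets succeeded only because nobody could build custom software, and that premise no longer holds.
AI-generated hardware design is on the way
Brain-computer interface entrepreneur Max Hodak, a co-founder of Neuralink, says he has moved almost entirely from Excel to Python models that give believable simulations. He expects that within the next year, probably within 2026, AI will go from generating software to generating STEP files and PCB layouts, bringing the same shift to mechanical and electrical engineering.
China's open-source approach: hardware strength plus AI software
Naval argues that one reason China is investing heavily in open-source AI models is its hardware advantage: very complex supply chains and component chains. If software can be generated on demand, China no longer has a disadvantage against Silicon Valley. With OpenAI and Anthropic closed, most of the open-source weight comes from China, and the software that ships with hardware, from parts makers to the gadgets sold on Amazon, is improving quickly.
Choosing a model: always the smartest
On model choice, Naval holds that intelligence is "an unalloyed good." Because users often cannot tell when a model has made a mistake, he argues for always using the most intelligent model available. Guillermo's AI Gateway data shows that open models do get used, but frontier models dominate the top. Gemini stands out on cost-performance, but for coding the best options are two or three models, and Chinese models are not among them.
Software still needs hands, but AI is already in the factory
Despite rapid progress in code generation, Max stresses that making hardware ultimately requires physical work. His company owns a captive MEMS foundry as part of its vertical integration. Ironically, one of the biggest effects of AI inside the company so far is in regulatory work: tracing which standards apply and generating documentation used to take a whole regulatory and quality team several months, and now the AI largely handles it.
Humans shift from doers to verifiers
Naval concludes that humans are becoming verifiers. Guillermo compares this to trusting a lawyer: engineers need not read every line of a PR, but they must be able to sign off on its consequences, backed by test harnesses, simulations, proofs, and type-checkers. Creating software has become very easy, but long-term maintenance, security, and production-grade performance remain hard. The core human contribution becomes saying, "I understand the consequences of this change, and I stand behind it."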
English source:
Vibe Coding Hardware
Vibe Coding a Turbine Blade
Nivi: Hey Blake, how are you applying all this at Boom Supersonic?
Blake Scholl: It completely changes the role of software and hardware developers. From day one we tried to take a lot of traditional engineering workflows—hardware engineering workflows—and turn them into software. If you haven’t been around hardware engineering, let me try to make this clear. A lot of hardware engineering happens in Excel spreadsheets on engineers’ laptops in a silo. Very complex spreadsheets, sometimes with VBScript code. All of this is actually software, but it’s treated as if it’s not. There’s no source control, no automated testing. If you want to hand something off from an aerodynamicist to a structures engineer, that’s done manually with a spreadsheet over email. It’s the nineteen-nineties. It’s terrible.
So we started building software frameworks to automate hardware engineering flows and make them repeatable, with the idea that we could reduce the cost of iteration. But it was slow going: we could never afford enough software engineers. What we’ve gotten into now is a mind-blowingly different model: the software engineers create the architectures, because they understand systems, algorithms, and separation of concerns. Then the hardware engineers can vibe-code their pieces because they know hardware engineering. The result is mind-blowingly different productivity for small teams.
Example. If you’re designing a turbine blade—classically, a turbine blade starts cold, but when it runs it gets hot, so it gets bigger. You have to design both the aerodynamics and the structural design to work in its cold shape and its hot shape. You have to convert between cold and hot, between structures and aerodynamics. This takes one engineer one day for one blade for one piece of the analysis. There are about a thousand blades in a jet engine. You can’t do much. Now, with a combination of software and hardware people creating the solution, you can change blade geometry and see in real time the structures and aerodynamics results. Two engineers can design an entire jet engine. Wildly different.
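Editor's illustration: the cold-to-hot conversion Blake describes can be sketched with the textbook linear thermal expansion relation, L_hot = L_cold * (1 + alpha * dT). This is a minimal sketch, not Boom's tooling. The `BladeGeometry` class, the function names, and every number below are hypothetical, and it assumes uniform temperature and isotropic growth, which a real blade analysis would not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BladeGeometry:
    """Hypothetical two-number blade model: span and chord, in millimetres."""
    span_mm: float
    chord_mm: float


def cold_to_hot(cold: BladeGeometry, alpha_per_k: float, delta_t_k: float) -> BladeGeometry:
    """Grow every length by the linear expansion factor 1 + alpha * dT."""
    scale = 1.0 + alpha_per_k * delta_t_k
    return BladeGeometry(cold.span_mm * scale, cold.chord_mm * scale)


def hot_to_cold(hot: BladeGeometry, alpha_per_k: float, delta_t_k: float) -> BladeGeometry:
    """Inverse conversion: recover the cold (as-manufactured) shape."""
    scale = 1.0 + alpha_per_k * delta_t_k
    return BladeGeometry(hot.span_mm / scale, hot.chord_mm / scale)


def tip_clearance_mm(casing_radius_mm: float, hub_radius_mm: float, blade: BladeGeometry) -> float:
    """Gap between blade tip and casing; casing and hub growth are ignored here."""
    return casing_radius_mm - (hub_radius_mm + blade.span_mm)


if __name__ == "__main__":
    # Illustrative values only: alpha and the temperature rise are assumptions.
    cold = BladeGeometry(span_mm=50.0, chord_mm=30.0)
    hot = cold_to_hot(cold, alpha_per_k=13e-6, delta_t_k=900.0)
    print("cold clearance:", tip_clearance_mm(301.0, 250.0, cold))
    print("hot clearance: ", tip_clearance_mm(301.0, 250.0, hot))
```

Once the conversion is an ordinary function instead of a spreadsheet cell, it can live in source control and be tested automatically, which is the gap Blake points to earlier in the conversation.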
Guillermo Rauch: One of the things you mentioned is that software engineers are creating the tools and architectures for the rest of the engineers. To me, that’s the biggest cataclysm of enterprise software—there’s no startup that builds hardware collaboration tools that can sell you anything anymore. Internally, you’re just coding the right thing you need at any given time. Even spreadsheets are kind of cooked. The reason spreadsheets were successful is that no one could build custom software. The thing that approximates custom software the most is a spreadsheet with a bunch of VBScript functions.
Naval: Right—they’re lightweight programming.
Max Hodak: I’ve personally moved almost entirely from Excel to Python models, where I can get believable simulations of things. The thing AI hasn’t come to yet, but I think it will within the next year—probably within 2026—and that will be very exciting: right now it can generate software, but soon it will generate STEP files and PCB layouts. When it comes for mechanical and electrical engineering, that’s a whole other thing we haven’t seen yet. Very cool.
Open Source Compounds China’s Advantage
Naval: On the hardware side, this is a boon for all these little gadget companies and part companies that write really bad software because they can’t make great software. Now they’re going to be able to make good-enough software. Or it may not even be software with a human front end—it might just be completely agentic, an agent accessing it, and you talk to it through voice to control hardware.
This is one of the reasons China is big into open-source models. They’re going all in on it because they have hardware superiority. They have these very complex supply chains and component chains. They’re basically saying—“hey, if I can just generate software on demand, then I don’t have this disadvantage anymore against Silicon Valley.” That’s not the only reason they’re doing open source. They’re also behind, they’re distilling models, they’re catching up, they’re collaborating on resources. But the Chinese government has a history of funding efforts that help their entire ecosystem along, especially in network-effect businesses. They want to pool all their resources, catch up on AI, and use it to give their hardware stuff an advantage.
Ironically, they’re doing all the open-source stuff because OpenAI is not open. Grok publishes models, but they’re a model or two behind. Google has some local models, nothing really competitive. Anthropic, to my knowledge—I don’t even know of any open-source models from them. So all the open-source heft is coming from China. It helps our hardware founders, but it helps their hardware founders and factories that much more. All the crappy little software that goes with all the random knickknacks and thingamajigs you buy off Amazon to tinker with on a lazy Saturday afternoon—that software’s getting a lot better very quickly.
Guillermo: Everyone’s had the wake-up call that without great frontier coding models, you don’t have self-improvement. Imagine China as a whole not having the ability to produce frontier everything. It’s not just about producing software—in any piece of this hardware pipeline, like Blake was saying, you need to generate software. If you fall behind in your ability to generate software, you fall behind in your ability to generate everything.
You Always Want the Smartest Model
Guillermo: One thing I’m curious about: everyone loves to talk about Chinese models. Do you guys use Chinese models? Do you know anybody who uses Chinese models?
Naval: No. This is an argument I had yesterday at dinner. One person at the table was claiming you’ll just use DeepSeek for 97% of things because it’s so cheap, and if you need more intelligence you’ll just run it over and over again—the same problem. You’ll only use OpenAI, Anthropic, etc. for the most advanced tasks. I was kind of like, “I don’t know.” I think intelligence is an unalloyed good. You always want more intelligence. When these models make a mistake, you don’t know it. And it’s always cheaper than a real person, and real-time.
So you’ll just use the most intelligent model available. Which isn’t great news, because it means you’ll end up creating a monopoly or oligopoly situation in AI. But I always want the most intelligent programmer. I always want the most correct answer. I always want the best judgment. Given the amount of leverage I’m going to pour into it—through capital and code and people and marketing—I want to make the right decision every time. When I have two models, one I know is a little smarter than the next, and they both give me answers, often I don’t actually know which is the correct answer. So if I know one model is a little smarter, I’m going to go with that answer, and eventually I’m going to stop asking the model I think is less intelligent. Have you guys found a use for these so-called less intelligent models?
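Editor's illustration: the dinner-table claim that a cheaper model can be run "over and over again" can be made concrete with a simple majority-vote calculation. This is a sketch under strong assumptions that are not in the conversation: each answer is either right or wrong, each run is right with the same probability `p`, and runs are independent. Real model errors tend to be correlated, so treat the numbers as an optimistic bound. The function name is hypothetical.

```python
from math import comb


def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that more than half of n independent runs are correct,
    when each run is correct with probability p. n must be odd (no ties)."""
    if n < 1 or n % 2 == 0:
        raise ValueError("n must be a positive odd number")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    needed = n // 2 + 1  # smallest count that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(needed, n + 1))


if __name__ == "__main__":
    for p in (0.4, 0.7, 0.9):
        row = ", ".join(f"n={n}: {majority_vote_accuracy(p, n):.3f}" for n in (1, 3, 5, 9))
        print(f"p={p}: {row}")
```

Under these assumptions, repetition helps only when a single run is right more often than not. When `p` is below 0.5, voting makes the result worse, which fits Naval's concern that you often cannot tell which answer is the correct one.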
Guillermo: We see uses. We have AI Gateway data, and basically every application and agent goes through it. There’s definitely usage of open models, but the top is heavily dominated by frontier intelligence. There’s a caveat: frontier intelligence at reasonable cost and performance slaps at scale. Take Gemini: people don’t get really excited about it, but they put out models that are super smart at the right performance-cost combination. For a lot of tasks other than coding, interestingly enough, they’re the best models. The best industrial production models. You can throw them at support tasks or browser automation. I’d always put a Gemini model there, and I’d look to Chinese models for those kinds of things. But any time I’m working to push the frontier, I need the best possible coding model. That’s basically two or three models. The Chinese are certainly not in it.
Software Still Needs Hands
Nivi: Max, you’re pushing pretty hard into vertical integration and extreme urgency. Want to talk about that?
Max: For many things, you can’t buy it, so you have to make it somehow. We obviously don’t do this on things like frontier models—I have an Anthropic subscription. We actually do use some of the Chinese models, to Naval’s point. We use some Qwen models and DeepSeek models. We have a big internal fine-tune of 3.2 that I use for a bunch of things—we’re going to look into porting to 4 soon. But that’s on the personal side, not on the company side.
Our preference would always be to buy something. If there’s a vendor that offers a service at a great price—for example, PCBs. We don’t make PCBs. Those are basically free. You can buy them in unlimited quantity from Asia. But the closer our products get to being a single block of covalently bonded matter, the better they’ll be. Lower power, smaller, higher performance, longer lasting. The components aren’t available. In order to do that type of integration—to actually innovate beyond just piecing together things you can buy off the shelf, which is really very limiting—you have to learn to do it yourself. That shows up as vertical integration. So we own a captive MEMS foundry on the East Coast. There was no other way to do the type of packaging and assembly we wanted to do.
All of this is going to be affected heavily by AI over the next few years. It’s not quite there yet. Ironically, one of the biggest impacts we’ve seen of AI inside the company is in regulatory interactions. If we can generate documentation, or if we can ask—“we want to evolve this product, there are thousands of ISO standards that might apply, which ones do we have to comply with, trace this through”—that used to require a whole regulatory and quality team for several months. Now the AI just kind of knows.
When I think about stuff like the surgical program or the MEMS fab—ultimately the software still needs hands. It’s going to be smarter than us, but if it can’t make things, those are real boundaries. We’ve instrumented our foundry as well as many other parts of the company in ways where, as these models get better, that should show up pretty immediately in things like the cell engineering we’re doing and the material science we’re developing. Our protein engineering group really uses deep learning a lot—I think we’re probably state of the art there. But it’s very application-specific. It means different things in different parts of the company. There’s not one answer.
Humans Are Becoming Verifiers
Naval: What Max was talking about with regulatory stuff makes me realize—it’s been a while since I generated a basic legal document using a lawyer. I stopped asking lawyers for NDAs, agreements for this, sign that, research this. All the basic legal tasks are gone too. There’s the old joke that law is like spaghetti code—very complicated code they try to put in English. It contradicts this code over here, has to fit into that code over here. There are no real APIs for it.
For junior engineers and junior engineering—junior engineers basically got a promotion to senior engineer, and junior engineering got taken over by agents. The same way, in law, you can say “paralegals just got fired,” or you can say “paralegals just got promoted to senior lawyers, and now they can spend their time thinking about the law.”
Guillermo: It’s actually interesting to think about the parallels between how software engineering is evolving and lawyers. You never know exactly what lawyers put into these documents—you just trust them. “Hey, lawyer, can you look at this document? Can you tell me if it’s legit? Can you do red lines?” What you’re valuing in the relationship with a lawyer is that they’re a trusted authority. They went to law school. They’re putting their reputation on the line.
There’s a parallel with software engineering. The biggest problem today is this mountain of slop that ends up as a PR. There are all these memes on Twitter—“way back in the day we used to read every line of code of a PR.” Well, in my world—infrastructure—I want engineers to be able to say “I understand” every line of that PR. That doesn’t necessarily mean you’ve read every line. It means you can say “I understand the consequences of this PR. I’m signing off on understanding the consequences.” Or, “I wrote the test harness, the simulations, the proofs, the type-checkers—even without reading this, I have confidence I can sign off that it’ll be safe in production.”
There’s a world in which we embrace that everything is going to be spaghetti code we don’t fully understand, but we write the evaluators that give us confidence, and we rely on people—the infrastructure production engineers—to say, “Okay, I’m fine sending this into prod.” Someone is going to get paged if your systems go down. Another thing people are underestimating: creating software is really easy, zero to one. But think about a thousand days from now. What does your software look like? Is it secure? Is it tested? Is it production-grade? Is it performant? And are you still motivated to invest all those tokens in maintaining it in prod?
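Editor's illustration: one way to read "we write the evaluators that give us confidence" is a property-based harness that checks what code does without inspecting how it does it. The sketch below is a toy, not Vercel's process. The `verify_sort` name is hypothetical, and a sorting function stands in for whatever an agent produced. It checks two properties on random inputs: the output is in order, and it contains exactly the same items as the input.

```python
import random
from collections import Counter
from typing import Callable, List


def verify_sort(candidate: Callable[[List[int]], List[int]],
                trials: int = 200, seed: int = 0) -> bool:
    """Return True if candidate behaves like a sort on every random trial."""
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    for _ in range(trials):
        data = [rng.randint(-50, 50) for _ in range(rng.randint(0, 20))]
        result = candidate(list(data))  # pass a copy so the input stays intact
        in_order = all(a <= b for a, b in zip(result, result[1:]))
        same_items = Counter(result) == Counter(data)
        if not (in_order and same_items):
            return False
    return True


if __name__ == "__main__":
    print("built-in sorted:", verify_sort(sorted))
    print("drops duplicates:", verify_sort(lambda xs: sorted(set(xs))))
```

Passing such a harness is evidence, not proof. It covers only the properties someone thought to write down, which is why the conversation still puts a person on the hook to sign off.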
Naval: Humans are becoming verifiers. That’s how we train these models—with good verification data—and now we need human verifiers. A lot of the old function of people, lawyers, engineers, operations people, moves to verifying the stack and saying, “Yeah, this is roughly correct, I’ll roughly stand behind it, I’ll support you if it goes wrong.”