Coders are refusing to work without AI, and that could come back to bite them

Posted by qimuai on 2026-5-30 10:00 | First-hand compilation

Source: https://techcrunch.com/2026/05/29/coders-are-refusing-to-work-without-ai-and-that-could-come-back-to-bite-them/

Summary:

2026 survey: developers can no longer do without AI coding tools, but code quality concerns are emerging

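In 2026, AI coding tools have become an indispensable productivity tool for developers. Recent research, however, points to a worrying pattern: AI clearly speeds up the writing of code, but it does not necessarily improve code quality and may even store up problems for later.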

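In February 2026, the well-known AI research lab METR published a surprising finding: most developers are no longer willing to work without AI assistance, even on a limited number of tasks for an experiment. METR had planned to repeat its groundbreaking 2025 study, which compared how long open source developers took to complete tasks by hand versus with AI. That study unexpectedly found that, although developers believed AI made them more efficient, it actually slowed them down overall: the code was generated faster, but finding and fixing errors, steering the AI, and waiting for it to finish cost extra time.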

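When METR tried to repeat the experiment in 2026 to measure advances in AI and in developer proficiency, it could not recruit participants. The researchers admitted that developers were unwilling to take part “because they do not wish to work without AI,” even just for the study. Instead, METR published a survey in May in which technical employees self-reported that AI had doubled their value to their organizations. Recent headlines about the high cost of so-called tokenmaxxing, together with a handful of recent studies, now cast doubt on such self-assessments.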

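Tokenmaxxing, the practice of treating the number of AI tokens a person consumes as a proxy for productivity, has been the trend of 2026 so far, but it may already be fading. According to the Financial Times, Amazon shut down its internal token-tracking leaderboard, Kirorank, after some employees gamed the rankings by using AI agents excessively and running up costs. AI usage, in other words, does not automatically translate into productivity. Separately, The Information reported that Uber used up its entire 2026 AI budget within the first four months of the year; COO Andrew Macdonald said the spending had not produced a measurable increase in projects or productivity.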

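In a blog post that went viral on Hacker News, programmer and author James Shore argued that AI-generated code does not necessarily reduce ongoing maintenance costs and may even increase them. “You write code twice as quick now? Better hope you’ve halved your maintenance costs,” he wrote. “Otherwise, you’re screwed. You’re trading a temporary speed boost for permanent indenture.”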

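There is further evidence that AI can worsen code maintenance problems. Aiswarya Sankar, founder and CEO of reliability engineering agent startup Entelligence AI, said in a viral tweet that companies are spending 44% of their tokens on fixing bugs that their own AI generated. Code review tool company CodeRabbit analyzed open source pull requests and found that AI produced 1.7x more problems than human-written code. These figures come from companies with AI code review tools to sell, but independent research points the same way. A report published in April by Singapore Management University warns that “AI-generated code can introduce long-term maintenance costs into real software projects.”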

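Faced with this tension, vendors of AI coding agents suggest using those agents to take on the tedious repair work, in effect fixing AI with AI. Scott Wu, founder and CEO of Cognition, which makes the AI coding agent Devin, acknowledges that although Devin can work independently, he currently rates its skill between a junior and a mid-level programmer, depending on the task. It is far from a hands-off solution.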

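The Singapore Management University researchers propose a more human approach: programmers should understand what AI does and does not do well as deeply as they know their favorite programming language, set up strong quality assurance systems designed for AI, and review AI-written code as carefully as they would a junior developer’s work. The researchers and Wu also agree that big-picture decisions such as software architecture and security design should remain in human hands.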

English source:

In 2026, you cannot pry AI coding tools out of developers’ vise grip, researchers have discovered.
But while AI is undoubtedly helping coders produce code faster, it may not be producing better code, other researchers warn. And that could cause problems down the road for them.
Specifically, in February 2026, respected AI research lab METR published a surprising revelation: Most developers won’t work, even on a limited number of tasks, without AI anymore.
METR had hoped to provide an update to some groundbreaking research published a few months earlier, in 2025, on AI coding productivity. In it, researchers measured how much time open source developers took to do tasks by hand versus with AI.
While developers in that study reported that AI was making them more productive, they were shocked to learn it actually slowed them down. Sure, it generated code faster, but then they spent extra time finding and fixing errors, steering the AI and waiting on it to complete tasks.
When METR set out to repeat the experiment to measure advances in AI and coder proficiency, they couldn’t.
Devs weren’t willing to participate “because they do not wish to work without AI” even just for the study, the researchers confessed.
Instead, METR published a survey in May that allowed technical employees to self-report their AI productivity gains. Not surprisingly, they perceived that AI made them twice as valuable to their organizations.
But recent headlines about the wild expense of so-called tokenmaxxing, coupled with a smattering of recent research, make such self-perceptions dubious.
Tokenmaxxing, or treating the number of tokens a person consumes as a proxy for productivity with AI, has been the trend of 2026 so far. And it may already be over.
Amazon shut down its internal token-tracking leaderboard called Kirorank after employees were gaming it by using AI agents excessively, and running up costs, the Financial Times reported this week. The employees proved that AI use does not automatically translate to increased productivity.
Uber blew through its 2026 AI budget within the first four months of the year, The Information reported. COO Andrew Macdonald recently said on a podcast that such spending hadn’t led to a measurable increase in projects or productivity.
AI-generated code also doesn’t necessarily reduce ongoing code maintenance needs and may even increase them, programmer and author James Shore elegantly argued in a blog post that went viral on Hacker News.
“You write code twice as quick now? Better hope you’ve halved your maintenance costs,” he wrote. “Otherwise, you’re screwed. You’re trading a temporary speed boost for permanent indenture.”
There’s other evidence that AI can increase code maintenance woes.
A viral tweet from Aiswarya Sankar, founder and CEO of reliability engineering agent startup Entelligence AI, proclaims that companies are spending 44% of their tokens on bug fixes that their AI generated. Meanwhile, code-reviewing tool company CodeRabbit says it analyzed open source pull requests and found that AI produced 1.7x more problems than human code.
Those are, admittedly, self-serving stats from those trying to sell AI code reviewing tools.
Yet independent researchers have also found such issues. Researchers from the respected Singapore Management University published a report in April warning that “AI-generated code can introduce long-term maintenance costs into real software projects.”
Given that programmers love their AI assistants, what’s the solution?
Well, those who want to sell you AI coding agents say devs can just use AI coding agents to do the bone-wearying tasks of fixing code as fast as AI spits it out. That’s what Scott Wu, founder and CEO of Cognition, the maker of AI coding agent Devin, suggests.
But even he admits that, while Devin can work independently, he’d currently rate its skill between a junior and mid-level programmer, depending on the task. This is not a hand-it-off-and-forget-it solution.
The SMU researchers suggest a more human approach. Programmers should know what tasks AI does and doesn’t do well as deeply as they know their favorite coding languages. They need strong quality assurance systems designed for AI and they are stuck with carefully reviewing the AI’s work as if it were a junior dev.
Meanwhile, the researchers say (and Wu agrees), humans should still be doing the big-picture work like software architecture and security design.
