Opinion: Why Token Counting Obscures the ROI of AI Investments

Posted by qimuai on 2026-6-30 11:01. First-hand compilation.

Source: https://aibusiness.com/agentic-ai/opinion-token-counting-obscures-roi-ai-investments

Summary:

Google Cloud's view: enterprises deploying generative AI should beware the "usage illusion" and build a value-oriented cost model

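As generative AI spreads rapidly through the enterprise, a common misconception is taking hold: many organizations treat real-time usage metrics such as token consumption, model calls and prompt volumes as signs of success, and some have even introduced "token leaderboards" to encourage employees to use AI. A recent Google Cloud analysis, however, points out that consumption data is not the same as value delivered. An engineer might consume thousands of tokens, but unless that consumption is tied to a specific feature, bug fix or business objective, the company has no way of knowing whether the spend created real value.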

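This "attribution gap" is widening. Traditional engineering measurement frameworks predate AI and cannot capture the difference made by coding assistants, autonomous agents or AI workflows. Research shows that 94% of engineering leaders believe the metrics that matter most are missing from their current measurement frameworks. The result: companies can see what AI costs, but not what it produces.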

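To close the gap, companies must move from simple usage-based measurement to a complete view of the cost and value of AI-driven delivery. A coding assistant might generate code in seconds, but the real cost of delivery also includes the infrastructure needed to build, test, secure and deploy that code, plus the time spent validating outputs and resolving downstream issues. Watching token spend alone can hide where value is created, or where it is lost.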
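As a rough illustration of why token spend can be a small slice of delivery cost, here is a minimal sketch. The cost components, the `DeliveryCost` class and every figure in it are hypothetical and chosen only to make the arithmetic visible; they do not come from the article.

```python
from dataclasses import dataclass


@dataclass
class DeliveryCost:
    """Hypothetical cost components for one shipped change, in USD."""
    token_spend: float    # model/provider charges
    infra_spend: float    # build, test, security scanning, deployment
    review_hours: float   # time spent validating AI output
    rework_hours: float   # time spent resolving downstream issues
    hourly_rate: float    # assumed loaded cost of an engineer per hour

    def total(self) -> float:
        """Token spend plus infrastructure plus human time."""
        labor = (self.review_hours + self.rework_hours) * self.hourly_rate
        return self.token_spend + self.infra_spend + labor

    def token_share(self) -> float:
        """Fraction of the total delivery cost that is token spend."""
        return self.token_spend / self.total()


# Illustrative numbers only.
change = DeliveryCost(token_spend=2.0, infra_spend=8.0,
                      review_hours=1.5, rework_hours=0.5, hourly_rate=100.0)
print(change.total())                    # 210.0
print(f"{change.token_share():.1%}")     # token spend as a share of the total
```

Under these assumed numbers the token bill is under 1% of the total, which is the sense in which a token-only view can miss where cost actually accrues.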

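Meanwhile, each AI task carries a different cost profile across different layers. Frontier models suit complex architectural decisions or highly specialized development, but they can be overkill for routine tasks such as summarizing logs, generating documentation or writing boilerplate code, which adds waste. Knowing when to use AI, which model to use and when a non-AI approach is more cost-effective is becoming a core question in software delivery economics.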
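One way to picture the "which model, or no model at all" decision is a simple routing rule. The sketch below is an assumption-laden illustration: the task labels, tier names and the `choose_tier` function are invented here, and a real policy would be driven by measured cost and quality data rather than a static lookup.

```python
# Hypothetical task categories, loosely following the examples in the text.
ROUTINE_TASKS = {"log_summary", "doc_generation", "boilerplate"}
COMPLEX_TASKS = {"architecture_decision", "specialized_development"}


def choose_tier(task_type: str, non_ai_option_available: bool = False) -> str:
    """Return an assumed tier label: 'no_ai', 'lightweight' or 'frontier'."""
    if non_ai_option_available:
        # A template, script or existing tool already does the job.
        return "no_ai"
    if task_type in COMPLEX_TASKS:
        return "frontier"
    # Routine and unknown tasks start on the cheaper tier; escalate on evidence.
    return "lightweight"


for task in ("log_summary", "architecture_decision"):
    print(task, "->", choose_tier(task))
print("boilerplate ->", choose_tier("boilerplate", non_ai_option_available=True))
```

Defaulting unknown tasks to the cheaper tier is a design choice made for this sketch, not a recommendation from the source.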

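This is not the first time engineering teams have had to rethink cost. Cloud computing earlier pushed developers from broad infrastructure budgets toward granular unit economics, and AI is now driving a similar shift.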

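Google Cloud recommends that companies build a delivery model that actively ties AI spend to engineering outcomes, so that token consumption, prompts, sessions and generated code are automatically attributed to the corresponding developers, teams, repositories and business units. Without that traceability, AI usage data stays disconnected from actual outcomes.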
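The attribution step described above can be sketched as a join between usage events and the work they belong to. Everything in this example is hypothetical: the record fields, the session-to-work mapping and the `attribute` helper are assumptions for illustration, since the article does not specify a data model.

```python
from collections import defaultdict

# Hypothetical usage events, as a provider or gateway log might report them.
usage_events = [
    {"session": "s1", "developer": "ana", "tokens": 1200, "cost_usd": 0.60},
    {"session": "s2", "developer": "ana", "tokens": 400, "cost_usd": 0.20},
    {"session": "s3", "developer": "bo", "tokens": 3000, "cost_usd": 1.50},
]

# Hypothetical mapping from AI sessions to the work they supported.
session_to_work = {
    "s1": {"team": "payments", "repo": "billing-api", "work_item": "FEAT-1"},
    "s3": {"team": "search", "repo": "indexer", "work_item": "BUG-7"},
}


def attribute(events, mapping):
    """Sum cost per (team, repo, work item); track spend with no known owner."""
    attributed = defaultdict(float)
    unattributed = 0.0
    for event in events:
        work = mapping.get(event["session"])
        if work is None:
            unattributed += event["cost_usd"]
            continue
        key = (work["team"], work["repo"], work["work_item"])
        attributed[key] += event["cost_usd"]
    return dict(attributed), unattributed


attributed, unattributed = attribute(usage_events, session_to_work)
for key, cost in attributed.items():
    print(key, cost)
print("unattributed:", unattributed)
```

Reporting the unattributed remainder separately keeps the attribution gap itself measurable instead of silently dropping those events.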

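Engineering leaders also need automated tooling to identify inefficiency in their workflows, for example: AI spend on code that never ships, expensive large models used where lightweight ones would do, and prompts that generate unnecessary cost without improving delivery. Optimization is only possible when problems are visible before costs pile up.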
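Two of the inefficiencies listed above lend themselves to simple automated checks. The sketch below assumes each work item already carries attributed AI spend, a shipped flag, a task type and a model tier; all of these field names and the flag labels are hypothetical. The third case (prompts that add cost without improving delivery) needs outcome data and is not covered here.

```python
# Hypothetical set of task types considered routine.
ROUTINE_TASKS = {"log_summary", "doc_generation", "boilerplate"}


def flag_inefficiencies(work_items):
    """Return (work item id, reason) pairs for spend worth a closer look."""
    flags = []
    for item in work_items:
        if item["ai_cost_usd"] > 0 and not item["shipped"]:
            flags.append((item["id"], "spend_on_unshipped_code"))
        if item["model_tier"] == "frontier" and item["task_type"] in ROUTINE_TASKS:
            flags.append((item["id"], "frontier_model_on_routine_task"))
    return flags


# Illustrative records only.
work_items = [
    {"id": "FEAT-1", "ai_cost_usd": 4.0, "shipped": True,
     "task_type": "specialized_development", "model_tier": "frontier"},
    {"id": "FEAT-2", "ai_cost_usd": 9.0, "shipped": False,
     "task_type": "boilerplate", "model_tier": "frontier"},
    {"id": "DOC-3", "ai_cost_usd": 0.5, "shipped": True,
     "task_type": "doc_generation", "model_tier": "lightweight"},
]

for work_id, reason in flag_inefficiencies(work_items):
    print(work_id, reason)
```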

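That visibility must also extend from code generation through to production. AI-enabled work should be tracked end-to-end, from prompt to pull request to deployment, and correlated with ship rate, PR cycle time, DORA metrics, and incident and quality data. Only then can a company tell whether AI has accelerated software delivery or introduced new rework.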
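A minimal sketch of the comparison described above: split pull requests by whether AI assistance was involved, then compare delivery and quality figures side by side. The PR records, field names and the `summarize` helper are hypothetical, and a side-by-side comparison like this shows association only, not causation.

```python
from statistics import median

# Illustrative pull-request records only.
pull_requests = [
    {"id": 1, "ai_assisted": True, "cycle_hours": 10, "deployed": True, "incidents": 1},
    {"id": 2, "ai_assisted": True, "cycle_hours": 6, "deployed": True, "incidents": 0},
    {"id": 3, "ai_assisted": True, "cycle_hours": 8, "deployed": False, "incidents": 0},
    {"id": 4, "ai_assisted": False, "cycle_hours": 20, "deployed": True, "incidents": 0},
    {"id": 5, "ai_assisted": False, "cycle_hours": 12, "deployed": True, "incidents": 0},
]


def summarize(prs, ai_assisted):
    """Ship rate, median cycle time and incidents per deploy for one group."""
    group = [pr for pr in prs if pr["ai_assisted"] == ai_assisted]
    deployed = [pr for pr in group if pr["deployed"]]
    if not group or not deployed:
        return None  # not enough data to compare
    return {
        "ship_rate": len(deployed) / len(group),
        "median_cycle_hours": median(pr["cycle_hours"] for pr in deployed),
        "incidents_per_deploy": sum(pr["incidents"] for pr in deployed) / len(deployed),
    }


print("AI-assisted:", summarize(pull_requests, True))
print("Not assisted:", summarize(pull_requests, False))
```

In this made-up data the AI-assisted group has a shorter cycle time but a lower ship rate and more incidents per deploy, which is the kind of trade-off a usage-only metric would not surface.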

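Finally, organizations need a consistent baseline for comparing adoption, efficiency and impact across teams. Otherwise, growth in AI usage is easily mistaken for progress when it does not reflect better performance.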

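Once a company can tie AI usage to delivery outcomes, the real value comes into view. A true cost model does not just show where the money went; it reveals which work is driving real business impact. Organizations that get there first will not only lower their AI spend but also fundamentally improve how they decide what to build. With governance and cost awareness built into the workflow, AI shifts from a "use it by default" tool to an engineering instrument applied deliberately, in the right context, to make a real difference.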

English source:

Sponsored by Google Cloud
A true cost model doesn’t just show where money is being spent -- it reveals which work is driving impact.
As AI adoption grows, organizations have unprecedented visibility into how AI is being used. They can see token consumption, model usage, prompt volumes and adoption rates in real time. These metrics are increasingly being mistaken as indicators of success. In some cases, organizations have even introduced token leaderboards to encourage AI adoption.
However, consumption reveals little about outcomes. An engineer might consume thousands of tokens, but without attribution to a feature, fix or business objective, it's impossible to know whether that spend created value. Finance teams see costs, engineering teams see usage, but neither can easily connect the two to what was ultimately achieved.
The problem is compounded by the fact that many engineering measurement frameworks predate AI. Traditional delivery metrics remain useful, but they were never designed to capture the difference that coding assistants, autonomous agents or AI-powered workflows make. In fact, research found that 94% of engineering leaders believe the metrics that matter most are missing from their current measurement frameworks.
The result is a growing attribution gap. Organizations can see what AI costs, but not what it creates.
To address this gap, organizations must move beyond use-based metrics toward a more complete view of AI-driven delivery cost and value creation. AI use and provider spend capture a narrow part of the picture. A coding assistant might generate code in seconds, but the real cost of delivery also includes the infrastructure required to build, test, secure and deploy it, as well as the time spent validating outputs and resolving issues downstream. Looking at tokens alone risks obscuring where value is created -- or lost.
Meanwhile, each AI-assisted task carries a different cost profile across the different layers. A frontier model might be justifiable to use for complex architectural decisions or highly specialized development work, but it might provide little additional value for routine activities such as summarizing logs, generating documentation or creating boilerplate code. In some cases, AI might not be the most efficient option. Understanding when to use AI, which model to use and when alternative approaches are more cost-effective is becoming a critical part of software delivery economics.
This isn't the first time engineering teams have had to rethink costs. Not that long ago, cloud computing pushed developers from broad infrastructure budgets toward more granular unit economics. AI is creating a similar shift.
A delivery model that actively ties AI spend to engineering and productivity outcomes will help organizations better measure their ROI. That starts with connecting AI activity directly to the work it produces. Token spend, prompts, sessions and generated code need to be automatically attributed to the developers, teams, repositories and business units responsible for shipping it. Without that level of traceability, AI use remains visible in isolation, but disconnected from outcomes.
The same level of automation is required in the processes engineering leaders use to identify inefficiency in their workflows. Organizations need to be able to see when AI spend is going toward code that never ships, when expensive models are being used where lighter ones would suffice, and when prompts and workflows are generating unnecessary cost without improving delivery. Without this insight, inefficiencies are only visible after they have already accumulated.
That visibility also needs to extend from code generation through to production. AI-generated work should be tracked end-to-end -- from prompt to pull request to deployment -- with delivery metrics such as ship rate, PR cycle time and DORA indicators correlated against incident and quality data. Only then can organizations understand whether AI is improving software delivery or introducing new forms of rework.
Across all of this, organizations need a consistent baseline for benchmarking adoption, efficiency and impact across teams. Without that, increases in AI use risk being interpreted as progress, even when they do not translate into improved performance.
When they can connect AI use to delivery outcomes, organizations start understanding value. A true cost model doesn’t just show where money is being spent -- it reveals which work is driving impact.
Organizations that get there first won’t just reduce AI spend, they’ll improve how they decide what to build in the first place. Governance and cost awareness become embedded in the workflow, making AI use context-aware -- applied when it improves delivery, avoided when it doesn’t, and balanced against simpler or cheaper alternatives that would drive better value. In that environment, AI stops being a default and becomes a tool used deliberately to ship software that makes a difference.
