Why outcome-billing makes sense for AI Agents

Original link: https://www.valmi.io/blog/an-imperative-for-ai-agents-outcome-billing-with-valmi/

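## The value of AI agents and outcome-based billing

AI agents are proving remarkably *useful*, even if how "intelligent" they are remains debatable, by unlocking significant automation and efficiency gains. Their core value proposition lies in measurable *outcomes*, such as more support tickets handled or more successful hires, which translate directly into enterprise value (potentially $20-30M for a $100M company, under sweeping assumptions).

Traditional SaaS billing models, however, cannot capture this value. Legacy systems struggle with AI's high, linearly scaling costs (for example, LLM usage) and cannot price according to the *human-equivalent value* delivered. Seat-based pricing is especially problematic because AI reduces the need for human workers.

This is why **outcome-based billing** becomes essential. Developers should focus on measuring outcomes (tickets resolved, successful hires, and so on) and on tracking the associated costs to understand profitability. Tools such as **Valmi** aim to unify these data points, enabling accurate pricing and showing customers their return on investment.

Outcome-based billing also addresses AI's inherent unreliability: the developer bears the risk of failure and charges only for successful outcomes, which makes it easier to win over buyers who need proof of value. Ultimately, the shift to outcome-based models is key to sustainable growth and to accurately reflecting the value AI agents deliver.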

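## Outcome billing for AI agents: summary

A recent Hacker News discussion focused on the viability of "outcome billing" for AI agents: charging for successful results rather than for time spent. The author argues that it is a simple, customer-friendly pricing model, using as an example a web-search product that charges per unique record found.

Commenters, however, raised concerns about defining and objectively measuring "success". The core problem is **Goodhart's law**: when a measure becomes a target, it ceases to be a good measure. Examples such as India's cobra bounty program show that rewarding outcomes can invite gaming. Many argued that verifying AI output requires either further LLMs (which can themselves be wrong) or, ultimately, human oversight, potentially *increasing* the workload. Others noted that customers have an incentive to under-report successful outcomes.

Although it is workable for simple tasks such as adding employees or resolving a certain percentage of support tickets, the discussion suggests that outcome billing works best when outcomes are clearly defined, easy to quantify, and agreed on by both parties.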

Original article

Hey AI Agent Developer - yes, you. What is the value you create? Let's find out.

The Skeptic's Take: Useful vs. Intelligent

On the spectrum of "Is AI intelligent and able to reason?", I fall on the skeptical side. But is it useful? Absolutely. We have built a few AI workflows, and they have unlocked superior automation. The world is already witnessing, to some degree, the transformative potential of AI agents in customer support and coding.

Take customer support, for instance. If AI agents help each support employee handle 30% more tickets, that's like adding 30 new hires to a 100-person team, without the cost. For a $100M company, this efficiency gain could translate to $20-30M in additional enterprise value (EV), under sweeping assumptions. The difference in EV before and after deploying AI agents is the value you create. Alternatively, one could use operating income for a flow-based perspective.
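
The headcount arithmetic in this example can be written out explicitly. The sketch below is illustrative only: `equivalent_hires` and `avoided_annual_cost` are hypothetical helper names, and the $60,000 fully loaded cost per support employee is a placeholder assumption, not a figure from this article. It does not try to reproduce the $20-30M EV estimate, which rests on assumptions the article does not spell out.

```python
def equivalent_hires(team_size: int, productivity_gain: float) -> float:
    """Extra headcount the existing team is now worth, e.g. 100 people at +30%."""
    return team_size * productivity_gain


def avoided_annual_cost(hires: float, loaded_cost_per_employee: float) -> float:
    """Annual cost of hiring that many people instead (assumed cost per head)."""
    return hires * loaded_cost_per_employee


# Figures from the paragraph above: a 100-person team, 30% more tickets each.
hires = equivalent_hires(team_size=100, productivity_gain=0.30)

# ASSUMPTION: 60_000 USD per employee per year is a placeholder, not source data.
avoided = avoided_annual_cost(hires, loaded_cost_per_employee=60_000)

print(f"Equivalent hires: {hires:.0f}")
print(f"Avoided annual cost: ${avoided:,.0f}")
```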

Two Critical Questions

'Digital worker' is, in fact, a fitting neologism for the AI agent. I take the view here that human workers, together with digital workers, can do more, creating abundance. As AI agent developers, we should be asking:

  • What is our share in the value we create, and how should we price AI agents?
  • Why do legacy billing systems prove inadequate?

Outcomes Matter. Internals Don't.

There are two parts to our first question. First, what an AI agent or workflow does internally, for instance the number of reasoning steps involved, is irrelevant. Outcomes, however, such as the number of support tickets resolved or the number of successful hires, are directly proportional to the value we want to measure. Second, what share do we take? Simply going by the 10x rule of SaaS pricing guidelines, we can claim one-tenth of what we measured. Thus, measuring outcomes becomes imperative. We have built Valmi to ingest outcomes as the first step. The added complexity that Valmi addresses is where legacy billing systems sputter.
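
As a rough illustration of the 10x rule, the sketch below derives a per-outcome price from an estimate of the value each outcome creates. The function name `price_per_outcome` and the $15 value per resolved ticket are hypothetical; only the one-tenth share comes from the text above.

```python
from decimal import Decimal


def price_per_outcome(value_per_outcome: Decimal, value_multiple: int = 10) -> Decimal:
    """Charge 1/value_multiple of the value one outcome creates (the 10x rule)."""
    if value_multiple <= 0:
        raise ValueError("value_multiple must be positive")
    # Round to whole cents so the result can go straight onto an invoice.
    return (value_per_outcome / value_multiple).quantize(Decimal("0.01"))


# ASSUMPTION: a resolved support ticket is worth 15.00 USD to the buyer.
ticket_price = price_per_outcome(Decimal("15.00"))
print(ticket_price)  # 1.50
```

Using `Decimal` rather than `float` keeps currency amounts exact, which matters once prices are multiplied by thousands of outcomes.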

Why Legacy Billing Systems Don't Cut It

Let's start with the cost argument. Building AI systems is expensive. There are ineffective paths in workflows and autonomous agents that simply fail. LLM costs hit COGS linearly, whereas the marginal cost in traditional SaaS systems is negligible. Traditional SaaS generates ~70% gross margins. Cost is therefore an important element in pricing AI agents, yet legacy systems such as Stripe and Zuora are inadequate to capture it.
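
To see why failed runs and linear LLM costs matter for margins, consider a minimal sketch. All names and numbers here are illustrative assumptions: a price of $2.00 per resolved ticket, $0.40 of LLM spend per attempt, and half of all attempts succeeding.

```python
def cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Failed attempts still cost money, so spread their cost over the successes."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


def gross_margin(revenue: float, cogs: float) -> float:
    """Gross margin as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cogs) / revenue


price = 2.00  # ASSUMPTION: price charged per resolved ticket
cogs = cost_per_success(cost_per_attempt=0.40, success_rate=0.5)  # ASSUMPTIONS
margin = gross_margin(price, cogs)
print(f"COGS per resolved ticket: ${cogs:.2f}, gross margin: {margin:.0%}")
```

Under these assumed numbers the margin comes out at 60%, and it shrinks further as the success rate drops, because every failed attempt adds cost without adding revenue.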

Second, the pricing model. The seat-based model works against selling AI agents because it is under attack from two forces. One is the decrease in the number of human workers required: by pricing against seats, you are setting your AI agents up for decline, not for growth. The other is worse: it does not capture the human-equivalent value, as discussed above. Even the usage-based models that legacy systems support do not distinguish between activity and outcomes.

Do It Yourself? No Need.

To effectively price AI agents, we need to measure outcomes, and observe and allocate costs. To evaluate margin contraction, we also need to understand where outcomes and costs diverge: in aggregate, per agent, and per customer instance. Therefore, the top AI agent developers who use outcome-billing, such as Harvey, Sierra and Usepropane, put together two sources, one to track cost information and the other to measure value, and unify them in a single system. Valmi helps you do that very easily. You can bring up the whole billing infrastructure on your premises. We have made it available under a permissive license. No need to do it yourself.
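
The idea of unifying a cost source and a value source can be sketched in a few lines. This is not Valmi's data model or API; `CostEvent`, `OutcomeEvent`, and `margin_by_customer_agent` are hypothetical names, and the sample events are made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CostEvent:
    customer: str
    agent: str
    cost_cents: int  # e.g. LLM spend for one run


@dataclass(frozen=True)
class OutcomeEvent:
    customer: str
    agent: str
    revenue_cents: int  # amount billed for one successful outcome


def margin_by_customer_agent(costs, outcomes):
    """Join the two sources on (customer, agent) and return margin per pair."""
    totals = defaultdict(lambda: {"cost": 0, "revenue": 0})
    for c in costs:
        totals[(c.customer, c.agent)]["cost"] += c.cost_cents
    for o in outcomes:
        totals[(o.customer, o.agent)]["revenue"] += o.revenue_cents
    report = {}
    for key, t in totals.items():
        # Margin is undefined without revenue; report None instead of dividing by zero.
        margin = (t["revenue"] - t["cost"]) / t["revenue"] if t["revenue"] else None
        report[key] = {**t, "margin": margin}
    return report


# Made-up sample data: two customers running the same agent.
costs = [
    CostEvent("acme", "support-bot", 40),
    CostEvent("acme", "support-bot", 40),
    CostEvent("globex", "support-bot", 90),
]
outcomes = [
    OutcomeEvent("acme", "support-bot", 200),
    OutcomeEvent("globex", "support-bot", 100),
]

for key, row in margin_by_customer_agent(costs, outcomes).items():
    print(key, row)
```

Keying both sources on the same (customer, agent) pair is what makes it possible to spot where costs and outcomes diverge for one customer even when the aggregate margin looks healthy.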

The Unreliability Problem (And How to Sell Despite It)

Outcome measurement serves a greater purpose. AI is unreliable and non-deterministic. We simply do not understand when it succeeds and when it fails. Consequently, it is incumbent upon the buyer of your AI agents to demand proof of value. It is simply easier for you to convince and onboard buyers if you switch to outcome-billing. Why? You assume the risk of failure of your AI agents, while the buyer pays only for the agent's performance and pays nothing for its failure. To prove the value and help you convince your buyers, Valmi supports customer dashboards that show outcomes such as ticket resolution percentage. Your buyers can view and embed these dashboards in their workflows.
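
A minimal sketch of "pay for performance, nothing for failure" might look like the following. The record type `TicketRun`, both functions, and the 150-cent price are hypothetical and are not taken from Valmi; the point is only that failed runs contribute to the resolution percentage but not to the amount due.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TicketRun:
    ticket_id: str
    resolved: bool  # True only if the agent actually resolved the ticket


def resolution_percentage(runs: list[TicketRun]) -> float:
    """Share of runs that ended in a resolved ticket, as a percentage."""
    if not runs:
        return 0.0
    return 100 * sum(r.resolved for r in runs) / len(runs)


def amount_due_cents(runs: list[TicketRun], price_per_resolution_cents: int) -> int:
    """Bill only the resolved tickets; failed runs are free to the buyer."""
    return sum(price_per_resolution_cents for r in runs if r.resolved)


# Made-up sample: four runs, one of which failed.
runs = [
    TicketRun("T-1", True),
    TicketRun("T-2", False),
    TicketRun("T-3", True),
    TicketRun("T-4", True),
]

print(f"Resolution: {resolution_percentage(runs):.0f}%")
print(f"Amount due: {amount_due_cents(runs, price_per_resolution_cents=150)} cents")
```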

The Bottom Line

To sum it up, we have built Valmi, and made open-source SDKs and free-to-deploy packages available, to address this imperative for your AI agents. It handles outcome-billing, which provides proof of value, and cost tracking, which exposes margin contraction. It lets you simulate and set prices for your AI agents and quickly onboard your customers. We understand that hybrid models combining seat-based, activity-based, and outcome-based pricing will be required while billing models are in transition. Valmi supports all of these as well. Try it out for free or deploy it on your premises.
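
For readers wondering what a hybrid model amounts to in practice, here is one possible sketch. `HybridPlan` and `invoice_total_cents` are hypothetical names, the prices are invented, and nothing here reflects how Valmi itself represents plans.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridPlan:
    seat_price_cents: int      # flat charge per human seat
    activity_price_cents: int  # charge per unit of activity, e.g. per agent run
    outcome_price_cents: int   # charge per successful outcome


def invoice_total_cents(plan: HybridPlan, seats: int, activities: int, outcomes: int) -> int:
    """Sum the seat-based, activity-based, and outcome-based components."""
    return (
        seats * plan.seat_price_cents
        + activities * plan.activity_price_cents
        + outcomes * plan.outcome_price_cents
    )


# ASSUMPTION: illustrative prices only.
plan = HybridPlan(seat_price_cents=5_000, activity_price_cents=2, outcome_price_cents=150)
total = invoice_total_cents(plan, seats=3, activities=1_000, outcomes=200)
print(total)  # 47000
```

Setting any one price to zero recovers a pure model, which is one way to move a customer gradually from seats toward outcomes.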

