From AI Chaos to Cost Control: Why Usage Data Is the Missing Link in Your AI Strategy

If you do not understand AI usage, you do not understand what you are paying for.

Do you know how much AI your business is consuming? It’s easy to lose track of hard costs and opportunity costs in the chaotic AI model market where capabilities are increasing, specialization is taking root, and cost spreads between models are expanding. Having clear visibility into which models are used, how much, and for what purpose will fast become a necessary business control for any firm consuming AI services.

“With AI monetization, you’ve got to balance the cost you transfer onto your customers. You want customers to use more AI, but you need to be fully aware of increasing costs in the back end,” said Stephen Hateley, Director, Partner and Alliance Marketing, for Digital Route.

Controlling these growing AI costs starts with having a clear view of one’s usage of AI models, whether by token or compute resource consumption.

AI consumers need to match model to function and cost

The risk that AI consumption costs will rise without being optimized is growing. AI models that provide similar functionality are often priced very differently per token. (For businesses leasing compute rather than paying by token, models also consume substantially more or less compute depending on their parameter count.)

Across providers, models vary in their strengths and capabilities. Overlaps in capabilities are common, but specialization is also increasing. Because these models come with large variations in cost, AI consumers and resellers need to manage and optimize cost and usage across them by understanding what they are using, when, how much, and why.

Comparing AI model costs

The table below examines 18 AI models that are popular with developers. Note how they vary significantly in cost per million input and output tokens. Generally, a developer consumes fewer input tokens (prompts, data files, audio, images and video, for example) than output tokens, which take the form of lines of code, text, images, audio, and video.

Source: Google Gemini & respective provider documentation. Note: Input token costs normalized to Cache Miss values. Published volume discounts excluded.

On average, the cost per million output tokens for the models in this table is nearly 6 times the cost per million input tokens. Across models, token prices can differ by 30x or more.

Watch out for costlier models with less capability

Many online reviews and benchmarks compare Moonshot’s Kimi K2 against Alibaba’s Qwen 3, OpenAI’s GPT-4.1, xAI’s Grok 4, DeepSeek V3, and Anthropic’s Claude 4 Sonnet. Developers debate the relative strengths and weaknesses of each. Output token prices across this group span a 30x multiple, from $0.50 per million output tokens for Grok 4 Fast to $15.00 for Claude 4 Sonnet.

Narrowing the focus, Kimi K2’s capabilities have been compared favorably to OpenAI’s GPT-5 Codex. The two models have different strengths: Kimi K2 may be better for agentic multi-step reasoning and tool orchestration, whereas GPT-5 Codex may be superior in raw coding performance. Even so, the cost per million output tokens differs by a factor of 4 (Kimi K2 at $2.50 versus GPT-5 Codex at $10.00).

Arguably a business could pay substantially more for what is considered by some to be a less capable model. Such is the chaos inherent to the fast-moving AI model market.
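To make the arithmetic behind these multiples explicit, here is a minimal sketch that uses only the output token prices quoted above. The function names and the 200 million token workload are illustrative assumptions, not figures from any provider.

```python
# Output token prices in USD per million tokens, as quoted in this article.
OUTPUT_PRICE_PER_MILLION = {
    "Grok 4 Fast": 0.50,
    "Kimi K2": 2.50,
    "GPT-5 Codex": 10.00,
    "Claude 4 Sonnet": 15.00,
}


def price_multiple(cheaper: str, pricier: str) -> float:
    """How many times more the pricier model costs per output token."""
    return OUTPUT_PRICE_PER_MILLION[pricier] / OUTPUT_PRICE_PER_MILLION[cheaper]


def output_cost(model: str, output_tokens: int) -> float:
    """Cost in USD of generating `output_tokens` tokens with `model`."""
    return OUTPUT_PRICE_PER_MILLION[model] * output_tokens / 1_000_000


if __name__ == "__main__":
    print(price_multiple("Grok 4 Fast", "Claude 4 Sonnet"))
    print(price_multiple("Kimi K2", "GPT-5 Codex"))
    # Hypothetical workload: 200 million output tokens on each model.
    for name in OUTPUT_PRICE_PER_MILLION:
        print(name, output_cost(name, 200_000_000))
```

Running the same workload through each price point shows how quickly a per-token multiple turns into a budget line once volume scales.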

Optimize cost within a single provider

Even within a single provider’s lineup, such as OpenAI’s, matching the model to the purpose matters for cost. OpenAI’s GPT‑5 mini is a highly capable model for developers. At $2.00 per million output tokens, it looks like a bargain next to the $10.00 per million output tokens for its big brother, GPT‑5 standard.

GPT‑5 standard offers more in terms of complex reasoning and agentic tasks. But if an organization sticks to GPT‑5 due to an arbitrary policy and avoids GPT‑5 mini where its lighter-weight capabilities are appropriate, then an extra $8.00 per million output tokens will be spent for no good reason.
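As a rough sketch of what that premium means at volume, the snippet below multiplies the $8.00 gap by the number of output tokens that a lighter model could have handled. The two prices come from the text above; the 300 million token figure is purely hypothetical.

```python
# Prices in USD per million output tokens, as quoted above.
GPT5_PRICE = 10.00
GPT5_MINI_PRICE = 2.00


def avoidable_spend(light_task_output_tokens: int) -> float:
    """Extra USD spent by sending light tasks to GPT-5 instead of GPT-5 mini."""
    premium_per_million = GPT5_PRICE - GPT5_MINI_PRICE
    return premium_per_million * light_task_output_tokens / 1_000_000


if __name__ == "__main__":
    # Hypothetical: 300 million output tokens per month went to tasks
    # that the lighter model could have handled.
    print(avoidable_spend(300_000_000))
```

The calculation is trivial, but it only works if the organization can actually measure how many tokens went to which kind of task, which is the visibility problem this article describes.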

Knowing which AI models are being used, how they are being used, who is using them, what they cost, and what they will cost as usage scales really matters.

Cut through the chaos with a clear view of usage data

Two trends are converging: digital service providers of all types are offering more services with usage-based monetization models, and they are incorporating AI into those services. Together, these trends are driving more demand for usage data management to control and optimize consumption costs.

“In the last two years, we have seen an explosion in the need for usage data management because of AI,” said Hateley, adding, “we are seeing changes in both token and GPU usage.”

For enterprises consuming and monetizing AI, not only is cost control dependent on usage data management, but also all the internal reporting, cost accounting, and strategic decision-making that goes with it. “You have data coming in on the usage side of IT services, you have bills coming in, and then you need to be able to report back on revenue recognition, reconciliation, and partner settlement,” Hateley explained.

Most monetization models rely on usage measurement

Even if an organization isn’t billing its customers on a usage basis, visibility into usage is necessary to support data-driven decision making across the business. “Different business models like subscription, flat rate, tokenized, or outcome-based are all dependent on usage data,” Hateley said.

  • If a business is charging on a usage basis, Hateley explained, then “it’s just usage-based billing. It’s collected, processed, metered and billed.”
  • If the business is monetizing through an outcome-based model, usage visibility remains necessary. “What’s it based on, an SLA? You still need the usage data to give evidence of the usage that is part of the outcome process,” Hateley explained.
  • For subscription billers, the rule still applies. “You want to understand what customers are using in terms of features and when they are using them. That feeds their data-driven decision making,” Hateley said.
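The “collected, processed, metered and billed” flow for usage-based billing can be sketched in a few lines. This is an illustrative outline only: the customers, model names, event data, and resale rates are all hypothetical, and a production system would add validation, auditing, and persistence.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical collected usage events: (customer, model, output_tokens).
EVENTS = [
    ("acme", "model-a", 1_200_000),
    ("acme", "model-b", 300_000),
    ("globex", "model-a", 500_000),
    ("acme", "model-a", 800_000),
]

# Hypothetical resale rates in USD per million output tokens.
RATE_PER_MILLION = {"model-a": Decimal("3.00"), "model-b": Decimal("12.00")}


def meter(events):
    """Aggregate raw events into token totals per (customer, model)."""
    totals = defaultdict(int)
    for customer, model, tokens in events:
        totals[(customer, model)] += tokens
    return dict(totals)


def bill(metered):
    """Rate the metered totals and sum them into one amount per customer."""
    invoices = defaultdict(Decimal)
    for (customer, model), tokens in metered.items():
        invoices[customer] += RATE_PER_MILLION[model] * tokens / 1_000_000
    return {c: amount.quantize(Decimal("0.01")) for c, amount in invoices.items()}


if __name__ == "__main__":
    print(bill(meter(EVENTS)))
```

`Decimal` is used for the monetary amounts so that rated charges do not pick up binary floating-point rounding. The same metered totals could also feed subscription or outcome-based reporting without being billed directly.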

Usage data management helps solve the problem

The ability to collect, visualize, and normalize accurate usage data in real time is becoming a must-have capability for businesses consuming and reselling AI services at volume. This applies to anyone from gaming and video streaming providers to cloud service providers, manufacturing, transportation, logistics, and even electric vehicle charging, Hateley explained.

Usage data is the foundational record of how both internal users and customers are consuming services, whether underpinned by AI or not. If AI is under the covers, gaining accurate and detailed visibility into which AI models are being used for what purposes also becomes a requirement.

“Usage data tells you everything about how customers are consuming services. That data needs to be treated in a specific way. If you want to use it for financial teams, you need full visibility into where that data comes from to bill and audit accurately,” said Hateley.

Accuracy is crucial to the process. Businesses must “make sure they aren’t dropping any data. Dropped data is missed revenue and that’s unacceptable,” Hateley said.
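One simple way to check for dropped data is to reconcile the records a source emitted against the records that actually reached billing. The sketch below assumes each usage record carries a unique ID, which is an assumption made for illustration; the record IDs and token counts are hypothetical.

```python
def reconcile(emitted: dict, billed_ids: set):
    """Return the IDs of dropped records and the token volume left unbilled.

    `emitted` maps record ID -> token count as reported by the source.
    `billed_ids` is the set of record IDs that reached the billing system.
    """
    dropped = sorted(rid for rid in emitted if rid not in billed_ids)
    unbilled_tokens = sum(emitted[rid] for rid in dropped)
    return dropped, unbilled_tokens


if __name__ == "__main__":
    # Hypothetical data: four records emitted, only three billed.
    emitted = {"r1": 1_000, "r2": 2_500, "r3": 400, "r4": 7_000}
    billed_ids = {"r1", "r3", "r4"}
    print(reconcile(emitted, billed_ids))
```

Any non-empty result points to usage that was consumed but never charged, which is the missed revenue Hateley warns about.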

Usage data is also needed beyond finance and billing. “That data can serve operations teams” and “product management can use that data to understand who uses services, how they are being used, and then tailor specific bundles and services, answering ‘what’s the next best product we can give these people’ based on insights from that usage data. It comes down to the usage data giving you such rich information,” explained Hateley.

Digital Route focuses on meeting these usage data needs

Gaining a clear, accurate view of usage data requires a platform specialized to the purpose. This is Digital Route’s specialty. The company has a long history of collecting, cleansing, normalizing and delivering complex usage data to support organization-wide needs for businesses.

This includes the company’s decades-long roots in billing mediation for telecom operators. These network operators and their customers generate massive data volumes, face stringent compliance requirements, and need granular views into their real-time data for everything from billing and reporting to troubleshooting and customer experience. Data accuracy and completeness are critical to these tasks.

Similar needs have now crossed far beyond telco lines. Digital Route’s latest launch, UsageCloud™, combines its deep expertise in telecom networks with its advanced methods in collecting and managing enterprise usage data to meet the increasing demand and need for access to it.

Monitor and assure data in real time

With more businesses adopting digital services and usage-based, or usage-sensitive, business models, mastering real-time usage data has become “a critical step in financial processes” for a much broader range of businesses, said Hateley.

These businesses “need to aggregate and pull massive volumes together and do it at speed,” Hateley explained. This data does not come from a data lake. It is collected in real time, so any business must be “spot on with your collection, correction and aggregation. If you don’t address those problems in mediation, you are just passing problems down to other systems and processes,” Hateley said.
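The collection, correction, and aggregation steps Hateley describes can be illustrated with a toy mediation pass. This is a sketch under stated assumptions, not a description of any vendor's product: the record fields, the duplicate rule (same ID), and the correction rule (trim and lowercase model names) are all hypothetical.

```python
from collections import Counter

# Hypothetical raw records as they might arrive from different sources.
RAW = [
    {"id": "r1", "user": "alice", "model": "GPT-5 ", "tokens": "1200"},
    {"id": "r2", "user": "bob", "model": "gpt-5", "tokens": "300"},
    {"id": "r1", "user": "alice", "model": "GPT-5 ", "tokens": "1200"},  # duplicate
    {"id": "r3", "user": "alice", "model": "gpt-5-mini", "tokens": "n/a"},  # malformed
]


def mediate(raw_records):
    """Deduplicate, correct, and aggregate raw usage records.

    Returns (token totals per (user, model), quarantined records).
    Malformed records are quarantined for review rather than silently dropped.
    """
    seen_ids = set()
    totals = Counter()
    quarantined = []
    for rec in raw_records:
        if rec["id"] in seen_ids:  # collection: skip duplicate deliveries
            continue
        seen_ids.add(rec["id"])
        try:
            tokens = int(rec["tokens"])
        except ValueError:
            quarantined.append(rec)
            continue
        model = rec["model"].strip().lower()  # correction: normalize names
        totals[(rec["user"], model)] += tokens  # aggregation
    return dict(totals), quarantined


if __name__ == "__main__":
    print(mediate(RAW))
```

Handling duplicates and malformed records at this stage is what keeps downstream billing and reporting systems from inheriting the problem.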

Letting a lack of visibility into real-time usage data persist is problematic for businesses that generate, consume, and pay for their data volumes in various ways. It’s an area that can generate waste and propagate cost and decision-making problems across organizations.

While many organizations may assume they have their real-time data flows under control, “it is surprising to hear how many don’t know what they don’t know,” Hateley noted. “We ask how well they know their usage, and they don’t know who is consuming what and when.”

“Any scenario where there is usage and consumption data being collected, you need to know who uses what, where, when, why, and what it costs. That’s what mediation is for. It’s not just billing mediation, it’s all these other use cases,” Hateley said.

No company wants its best day in business to be its last. In the AI era, with token and compute consumption prone to explode, that possibility grows. As more specialized models evolve and proliferate, the need for accurate, real-time usage data to bill, price, settle with partners, institute business controls, and support data-driven decisions at all levels of the business is fast becoming a common requirement.

Edward Finegold
Ed is an independent telco business strategist focused on monetization, customer experience and business support systems. At different times Ed has been a contributing research analyst with the TM Forum, Director of Content Strategy for Netcracker, Chief Sales Officer at Validas, and Editor in Chief at Billing World and OSS Today.