The AI Billing Problem
You have built an AI product. Users send prompts, your backend calls an LLM, and you return results. Now you need to charge for it. This sounds simple until you confront the reality:
- Token counts vary wildly between requests (a simple question might use 200 tokens; a document analysis might use 50,000)
- Costs differ by model (GPT-4o vs Claude vs Llama 3 have very different per-token pricing)
- Cached tokens cost less than fresh tokens, and you need to track both
- Customers expect transparency about what they are being charged for
- You need margins, but pricing too high loses customers and pricing too low loses money
Approach 1: DIY Token Metering
The brute-force approach is to build your own usage tracking system.
What You Need to Build
- A database table to store every usage event (customer, tokens, model, cost)
- An aggregation layer to compute per-customer usage per billing period
- A pricing engine to apply your margin and compute the charge
- Integration with a payment processor to actually collect payment
- A customer-facing dashboard showing usage and estimated charges
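As a rough sketch of the first three pieces, assuming an in-memory array in place of a database table, calendar-month billing periods, and hypothetical names (`UsageEvent`, `aggregateUsage`, `chargeFor`), the core of a DIY system might look like this:

```typescript
// Hypothetical shape of one stored usage event (one row in the usage table).
interface UsageEvent {
  customerId: string;
  model: string;
  tokens: number;
  costUsd: number;   // what you paid the LLM provider for this request
  timestamp: string; // ISO 8601, e.g. "2025-01-15T10:00:00Z"
}

interface PeriodUsage {
  tokens: number;
  costUsd: number;
}

// Billing period key, assuming calendar-month periods in UTC ("YYYY-MM").
function periodKey(timestamp: string): string {
  return new Date(timestamp).toISOString().slice(0, 7);
}

// Aggregation layer: total tokens and cost per customer per billing period.
function aggregateUsage(events: UsageEvent[]): Record<string, PeriodUsage> {
  const totals: Record<string, PeriodUsage> = {};
  events.forEach((e) => {
    const key = e.customerId + "|" + periodKey(e.timestamp);
    const current = totals[key] || { tokens: 0, costUsd: 0 };
    totals[key] = {
      tokens: current.tokens + e.tokens,
      costUsd: current.costUsd + e.costUsd,
    };
  });
  return totals;
}

// Pricing engine: apply a markup (1.0 means 100%) to your own cost,
// rounded to whole cents.
function chargeFor(usage: PeriodUsage, markup: number): number {
  return Math.round(usage.costUsd * (1 + markup) * 100) / 100;
}
```

Payment collection and the customer dashboard are left out here; they sit on top of these totals.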
The DIY Trap
Most teams that go this route discover three painful realities:
- Token counts from different providers use different schemas. OpenAI returns usage.prompt_tokens and usage.completion_tokens. Anthropic returns usage.input_tokens and usage.output_tokens. Google returns usageMetadata.promptTokenCount. You end up building a normalization layer.
- Billing disputes require audit trails. When a customer questions a charge, you need to show them exactly which requests generated which costs. This means storing detailed logs, not just aggregates.
- Cost tracking is separate from billing. You need to know your own costs (what you paid the LLM provider) separately from what you charge customers. This requires a second layer of accounting.
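To make the normalization layer concrete, here is a minimal sketch. The field names follow the provider schemas listed above; `candidatesTokenCount` is Google's output-token counterpart to `promptTokenCount` and is not mentioned in the text. The `ProviderResponse` union and `normalizeUsage` are hypothetical names chosen for illustration.

```typescript
// Raw usage payloads, using the field names each provider returns.
interface OpenAIUsage { prompt_tokens: number; completion_tokens: number; }
interface AnthropicUsage { input_tokens: number; output_tokens: number; }
interface GoogleUsageMetadata { promptTokenCount: number; candidatesTokenCount: number; }

// Hypothetical tagged union so the normalizer knows which schema it received.
type ProviderResponse =
  | { provider: "openai"; usage: OpenAIUsage }
  | { provider: "anthropic"; usage: AnthropicUsage }
  | { provider: "google"; usageMetadata: GoogleUsageMetadata };

// One common shape for everything downstream (metering, billing, audit logs).
interface NormalizedUsage {
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

function toNormalized(inputTokens: number, outputTokens: number): NormalizedUsage {
  return { inputTokens, outputTokens, totalTokens: inputTokens + outputTokens };
}

function normalizeUsage(res: ProviderResponse): NormalizedUsage {
  switch (res.provider) {
    case "openai":
      return toNormalized(res.usage.prompt_tokens, res.usage.completion_tokens);
    case "anthropic":
      return toNormalized(res.usage.input_tokens, res.usage.output_tokens);
    case "google":
      return toNormalized(
        res.usageMetadata.promptTokenCount,
        res.usageMetadata.candidatesTokenCount
      );
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a provider is added
      // to the union without a matching case.
      const unreachable: never = res;
      throw new Error("Unknown provider: " + JSON.stringify(unreachable));
    }
  }
}
```

Cached-token fields also differ between providers and would need the same treatment; they are omitted here to keep the sketch short.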
Approach 2: Managed Usage-Based Billing with Macropay
Macropay provides a complete usage-based billing infrastructure purpose-built for AI applications. Instead of building the entire stack yourself, you integrate the @macropay/ingestion SDK and let Macropay handle metering, aggregation, and billing.
Step 1: Install the SDK
Step 2: Set Up the LLM Ingestion Strategy
The SDK wraps your LLM calls (for example, calls made through the Vercel AI SDK in TypeScript) and automatically captures token usage, model information, and cost data:
- Input and output token counts
- Cached token counts
- Model name and provider
- Your cost per request
- Customer attribution
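To picture what one captured event might contain, here is a hypothetical event shape built from the fields listed above. This is not the SDK's actual payload or API; `LlmUsageEvent`, `CallUsage`, and `buildUsageEvent` are illustrative names, and only the event name `llm-usage` and the `metadata.totalTokens` property are taken from the meter configuration described in this guide. The sketch assumes cached tokens are a subset of input tokens, so they are not added to the total a second time.

```typescript
// Hypothetical event shape; the real SDK payload may differ.
interface LlmUsageEvent {
  name: "llm-usage";
  customerId: string; // customer attribution
  metadata: {
    provider: string;
    model: string;
    inputTokens: number;
    outputTokens: number;
    cachedTokens: number; // assumed to be a subset of inputTokens
    totalTokens: number;
    costUsd: number;      // your cost for this request
  };
}

interface CallUsage {
  inputTokens: number;
  outputTokens: number;
  cachedTokens: number;
}

// Assemble one usage event from the numbers reported for a single LLM call.
function buildUsageEvent(
  customerId: string,
  provider: string,
  model: string,
  usage: CallUsage,
  costUsd: number
): LlmUsageEvent {
  return {
    name: "llm-usage",
    customerId,
    metadata: {
      provider,
      model,
      inputTokens: usage.inputTokens,
      outputTokens: usage.outputTokens,
      cachedTokens: usage.cachedTokens,
      totalTokens: usage.inputTokens + usage.outputTokens,
      costUsd,
    },
  };
}
```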
Step 3: Create a Meter in the Macropay Dashboard
Meters aggregate your raw usage events into billable quantities. For AI billing, you typically create a meter that sums totalTokens from your ingestion events.
Go to Usage Based Billing in your dashboard
Navigate to the Meters section and click “Create Meter.”
Configure the meter
- Name: “LLM Tokens”
- Filter: Events named llm-usage
- Aggregation: Sum of metadata.totalTokens
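Conceptually, the meter configured above is a filter followed by a sum. The sketch below models that behavior locally. It is an illustration of the idea, not Macropay's implementation, and `IngestedEvent`, `MeterConfig`, and `meterQuantity` are hypothetical names.

```typescript
// Hypothetical local model of an ingested event.
interface IngestedEvent {
  name: string;
  customerId: string;
  metadata: Record<string, number | string>;
}

// Mirrors the dashboard settings: a name, an event filter, and a property to sum.
interface MeterConfig {
  name: string;
  filterEventName: string;
  sumProperty: string;
}

const llmTokensMeter: MeterConfig = {
  name: "LLM Tokens",
  filterEventName: "llm-usage",
  sumProperty: "totalTokens",
};

// Billable quantity for one customer: keep matching events, then sum the property.
function meterQuantity(
  meter: MeterConfig,
  events: IngestedEvent[],
  customerId: string
): number {
  return events
    .filter((e) => e.name === meter.filterEventName && e.customerId === customerId)
    .reduce((sum, e) => {
      const value = e.metadata[meter.sumProperty];
      // Ignore events where the property is missing or not numeric.
      return typeof value === "number" ? sum + value : sum;
    }, 0);
}
```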
Step 4: Customers See Usage in Real Time
Customers can view their current usage and estimated charges in the Customer Portal, updated in real time as events are ingested.
Margin Management Strategies
Knowing what to charge is as important as knowing how to charge. Here are the most common margin strategies for AI products.
Flat Percentage Markup
The simplest approach: add a fixed percentage to your LLM costs.
| Your Cost (per 1M tokens) | Markup | Customer Price | Your Margin |
|---|---|---|---|
| $2.50 (GPT-4o input) | 100% | $5.00 | $2.50 |
| $10.00 (GPT-4o output) | 100% | $20.00 | $10.00 |
| $0.15 (GPT-4o-mini input) | 200% | $0.45 | $0.30 |
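The arithmetic behind the table is customer price = cost × (1 + markup), with margin = customer price − cost. A minimal sketch, where `applyMarkup` and `round4` are hypothetical helper names and results are rounded to four decimal places to avoid floating-point noise:

```typescript
interface MarkupResult {
  customerPrice: number; // USD per 1M tokens
  margin: number;        // USD per 1M tokens
}

// Round to 4 decimal places so values such as 0.15 * 3 come out as 0.45.
function round4(value: number): number {
  return Math.round(value * 1e4) / 1e4;
}

// markupPercent is a percentage of cost: 100 doubles the price, 200 triples it.
function applyMarkup(costPerMillion: number, markupPercent: number): MarkupResult {
  const customerPrice = round4(costPerMillion * (1 + markupPercent / 100));
  return {
    customerPrice,
    margin: round4(customerPrice - costPerMillion),
  };
}

// The three rows from the table above.
const gpt4oInput = applyMarkup(2.5, 100);
const gpt4oOutput = applyMarkup(10, 100);
const gpt4oMiniInput = applyMarkup(0.15, 200);
```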
Tiered Pricing
Charge less per token at higher volumes to incentivize usage.
| Tier | Tokens/Month | Price per 1K Tokens |
|---|---|---|
| Starter | 0 - 100K | $0.02 |
| Growth | 100K - 1M | $0.015 |
| Scale | 1M - 10M | $0.01 |
| Enterprise | 10M+ | Custom |
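The table does not say whether the lower rate applies only to tokens inside each tier (graduated pricing) or to all tokens once a threshold is crossed (volume pricing). The sketch below assumes graduated pricing. The function name `graduatedCharge` is hypothetical, and Enterprise volumes throw an error because their price is custom.

```typescript
interface Tier {
  upTo: number;       // inclusive upper bound of the tier, in tokens per month
  pricePer1K: number; // USD per 1,000 tokens within this tier
}

// Starter, Growth, and Scale from the table above.
const tiers: Tier[] = [
  { upTo: 100000, pricePer1K: 0.02 },
  { upTo: 1000000, pricePer1K: 0.015 },
  { upTo: 10000000, pricePer1K: 0.01 },
];

// Graduated pricing (an assumption): each token is billed at the rate of the
// tier it falls into. The result is rounded to whole cents.
function graduatedCharge(tokens: number): number {
  if (tokens > 10000000) {
    throw new Error("Enterprise volume: custom pricing");
  }
  let remaining = tokens;
  let lowerBound = 0;
  let total = 0;
  tiers.forEach((tier) => {
    if (remaining <= 0) {
      return;
    }
    const inTier = Math.min(remaining, tier.upTo - lowerBound);
    total += (inTier / 1000) * tier.pricePer1K;
    remaining -= inTier;
    lowerBound = tier.upTo;
  });
  return Math.round(total * 100) / 100;
}
```

Under these assumptions, 150,000 tokens cost $2.00 for the first 100,000 plus $0.75 for the next 50,000. With volume pricing, all 150,000 tokens would instead be billed at $0.015 per 1K, for $2.25.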
Credit-Based Model
Sell credits upfront and deduct as customers use tokens. This provides predictable revenue and simpler customer-facing pricing.
Real-World Example: Building an AI Writing Tool
Let us put it all together with a concrete example.
Product: An AI writing assistant that helps users draft blog posts, emails, and marketing copy.
Pricing model: Monthly subscription with included credits plus overage billing.
Product Setup in Macropay
Plan 1: Writer ($29/mo)
- Includes 50,000 tokens/month (Meter Credits benefit)
- Overage: $0.02 per 1,000 tokens
Plan 2
- Includes 250,000 tokens/month
- Overage: $0.015 per 1,000 tokens
- 5 team seats included
Plan 3
- Includes 1,000,000 tokens/month
- Overage: $0.01 per 1,000 tokens
- Unlimited team seats
Implementation
With ingestion in place, the remaining work is handled by Macropay:
- Tracking token usage per customer
- Applying included credits from their subscription tier
- Calculating overage charges at the correct tier rate
- Generating invoices at the end of each billing period
- Collecting payment automatically
- Showing customers their usage in the Customer Portal
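To make the included-credits and overage steps concrete, here is a sketch of the invoice arithmetic for the Writer plan, using only the figures given for that plan. It assumes overage is billed proportionally per token, not in whole 1,000-token blocks. `Plan`, `InvoiceSummary`, and `invoiceFor` are hypothetical names, not part of any Macropay API.

```typescript
interface Plan {
  name: string;
  monthlyPriceUsd: number;
  includedTokens: number;   // tokens covered by the subscription each month
  overagePer1KUsd: number;  // price per 1,000 tokens beyond the included amount
}

const writerPlan: Plan = {
  name: "Writer",
  monthlyPriceUsd: 29,
  includedTokens: 50000,
  overagePer1KUsd: 0.02,
};

interface InvoiceSummary {
  overageTokens: number;
  overageUsd: number;
  totalUsd: number;
}

// Apply included credits first, then bill any remaining tokens at the overage rate.
function invoiceFor(plan: Plan, tokensUsed: number): InvoiceSummary {
  const overageTokens = Math.max(0, tokensUsed - plan.includedTokens);
  const overageUsd =
    Math.round((overageTokens / 1000) * plan.overagePer1KUsd * 100) / 100;
  const totalUsd = Math.round((plan.monthlyPriceUsd + overageUsd) * 100) / 100;
  return { overageTokens, overageUsd, totalUsd };
}
```

A Writer customer who uses 80,000 tokens in a month exceeds the included amount by 30,000 tokens, which adds $0.60 of overage to the $29 subscription.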
Cost Insights: Know Your Margins
Macropay’s Cost Insights feature lets you track your upstream LLM costs alongside customer revenue. This gives you real-time visibility into:
- Gross margin per customer: Are your heaviest users still profitable?
- Cost trends over time: Are upstream price changes eroding your margins?
- Model efficiency: Which models deliver the best margin for your use case?
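The underlying calculation for the first question is gross margin = (revenue − upstream cost) / revenue. A minimal sketch of that calculation, independent of how Cost Insights computes it, with hypothetical names `CustomerFinancials`, `grossMarginPercent`, and `unprofitableCustomers`:

```typescript
interface CustomerFinancials {
  customerId: string;
  revenueUsd: number; // what the customer paid in the period
  llmCostUsd: number; // what you paid LLM providers for their usage
}

// Gross margin as a percentage, rounded to one decimal place.
// Returns null when there is no revenue, since the ratio is undefined.
function grossMarginPercent(c: CustomerFinancials): number | null {
  if (c.revenueUsd === 0) {
    return null;
  }
  return Math.round(((c.revenueUsd - c.llmCostUsd) / c.revenueUsd) * 1000) / 10;
}

// Customers whose upstream cost exceeds what they pay.
function unprofitableCustomers(customers: CustomerFinancials[]): string[] {
  return customers
    .filter((c) => c.llmCostUsd > c.revenueUsd)
    .map((c) => c.customerId);
}
```

A customer paying $29 who generates $14.50 of upstream cost has a 50% gross margin; one who generates $40 of cost has a negative margin and would be flagged.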
Getting Started
Create Your Account
Sign up and configure your first metered product in under 10 minutes.
Usage Based Billing Docs
Detailed documentation on events, meters, and metered pricing.
LLM Ingestion Strategy
SDK reference for the LLM ingestion strategy.
Cost Insights
Track your upstream costs and monitor margins in real time.