AI/LLM Billing Setup

If you ship an AI feature, your costs scale with every prompt — but your revenue usually doesn’t. This guide closes that gap. You’ll wire up a meter that counts tokens for you, attach it to a product, and let Macropay turn raw usage into invoices automatically. As your Merchant of Record, Macropay also collects and remits sales tax and VAT on every charge and absorbs PCI scope, so the only thing left for you to build is the AI itself. By the end you’ll have:

Automatic token metering — no manual counting, no usage cron jobs.
Usage-based invoices — customers pay for exactly what they consume.
Live margin visibility — billed revenue vs. upstream model cost, per customer.

Before you start

You’ll need:

A Macropay account with an organization.
An Organization Access Token for API calls.
A model provider key (OpenAI, Anthropic, etc.).

Build against the Sandbox first. It mirrors production but never moves real money, so you can replay token events freely while you tune pricing.
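Before wiring anything up, it can help to fail fast when a credential is missing. The sketch below is one possible pre-flight check, not part of the Macropay SDK. `MACROPAY_ACCESS_TOKEN` is the variable the snippets in this guide read, and `OPENAI_API_KEY` is the variable the OpenAI SDKs look for by default. The `missing_settings` helper is a hypothetical name introduced for illustration.

```python
import os
from typing import Mapping

# MACROPAY_ACCESS_TOKEN: read by the ingestion snippets in this guide.
# OPENAI_API_KEY: default variable used by the OpenAI SDKs.
REQUIRED_VARS = ("MACROPAY_ACCESS_TOKEN", "OPENAI_API_KEY")


def missing_settings(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment:
print(missing_settings({"MACROPAY_ACCESS_TOKEN": "sandbox-token-placeholder"}))
```

Calling `missing_settings()` with no argument checks the real environment. Swap in another provider's key name if you are not using OpenAI.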

How the pieces fit together

Three objects do the work. Once they exist, your app only emits events:

Object	Role
Meter	Aggregates ingested events into a billable number per customer per period.
Metered price	Maps that number to money (e.g. `$0.02` per 1k tokens).
Product	The subscription customers sign up for; carries the metered price.

Install the SDK

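Pick the command that matches your stack: npm or pnpm for the TypeScript examples, pip for the Python ones.

npm install @macropayments/ingestion @macropayments/sdk ai @ai-sdk/openai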

pnpm add @macropayments/ingestion @macropayments/sdk ai @ai-sdk/openai

pip install macropay pydantic-ai

Create a meter

In the dashboard, open Usage Based Billing → Meters and click Create Meter. For a support copilot that bills on tokens:

Name: copilot-tokens
Filter: events named copilot-tokens
Aggregation: sum over the totalTokens property

The meter now tallies every matching event, grouped by customer and billing period.
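To make the filter and aggregation settings concrete, here is a rough local model of what such a meter computes. It is a sketch for intuition only, not Macropay's implementation. The `UsageEvent` shape and the `aggregate` function are hypothetical, and the example assumes all events fall inside a single billing period.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """Hypothetical event shape; the real payload is built by the ingestion library."""
    name: str
    customer_id: str
    total_tokens: int


def aggregate(events: list[UsageEvent], meter_name: str) -> dict[str, int]:
    """Filter events by name, then sum total_tokens per customer."""
    totals: dict[str, int] = defaultdict(int)
    for event in events:
        if event.name == meter_name:  # the meter's filter
            totals[event.customer_id] += event.total_tokens  # the meter's sum
    return dict(totals)


events = [
    UsageEvent("copilot-tokens", "cus_a", 1200),
    UsageEvent("copilot-tokens", "cus_b", 300),
    UsageEvent("copilot-tokens", "cus_a", 800),
    UsageEvent("search-queries", "cus_a", 1),  # different name, ignored by the filter
]
print(aggregate(events, "copilot-tokens"))  # {'cus_a': 2000, 'cus_b': 300}
```

Events with any other name never reach the total, which is why the event name your app emits must match the meter's filter exactly.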

Attach a metered price

Go to Products, create a new subscription product (or edit one), and add a Metered Price:

Meter: copilot-tokens
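Unit price: your rate per 1,000 tokens, say $0.02. The meter counts individual tokens, so if the price field is per meter unit, enter the per-token equivalent ($0.00002).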
Billing interval: Monthly

Anyone subscribed to this product is now billed on real consumption, settled at period end.
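As a sanity check on your pricing, you can reproduce the expected charge by hand. The sketch below assumes the example rate of $0.02 per 1,000 tokens and rounds to whole cents with half-up rounding. The rounding rule and the `period_charge` name are assumptions for illustration; Macropay's own rounding may differ.

```python
from decimal import Decimal, ROUND_HALF_UP

PRICE_PER_1K_TOKENS = Decimal("0.02")  # example rate from this guide, in USD


def period_charge(total_tokens: int, price_per_1k: Decimal = PRICE_PER_1K_TOKENS) -> Decimal:
    """Convert a period's token total into a USD amount, rounded to whole cents."""
    raw = Decimal(total_tokens) / Decimal(1000) * price_per_1k
    # Assumed rounding rule: half-up to the nearest cent.
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


print(period_charge(250_000))  # 5.00
print(period_charge(1_234))    # 0.02
```

Using `Decimal` rather than floats keeps the arithmetic exact, which matters once you compare these estimates against real invoices.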

Wrap your model

The @macropayments/ingestion library reads token counts from any @ai-sdk/* model and sends them to the meter for you, so you never build event payloads by hand. The Python package does the same for pydantic-ai results.

import { Ingestion } from "@macropayments/ingestion";
import { LLMStrategy } from "@macropayments/ingestion/strategies/LLM";
import { openai } from "@ai-sdk/openai";

export const llmIngestion = Ingestion({
  accessToken: process.env.MACROPAY_ACCESS_TOKEN!,
})
  .strategy(new LLMStrategy(openai("gpt-4o")))
  .ingest("copilot-tokens");

import os
from macropay.ingestion import Ingestion
from macropay.ingestion.strategies import PydanticAIStrategy

ingestion = Ingestion(os.environ["MACROPAY_ACCESS_TOKEN"])
strategy = ingestion.strategy(PydanticAIStrategy, "copilot-tokens")

Call it from your route

Pass customerId so usage lands on the right account. Every generateText / streamText call emits an event automatically.

import { generateText } from "ai";
import { llmIngestion } from "@/lib/llm";

export async function POST(req: Request) {
  const { prompt, customerId }: { prompt: string; customerId: string } =
    await req.json();

  // Attribute every token to this customer's meter
  const model = llmIngestion.client({ customerId });

  const { text } = await generateText({
    model,
    system: "You are a customer-support copilot.",
    prompt,
  });

  return Response.json({ text });
}

from fastapi import FastAPI
from pydantic import BaseModel
from pydantic_ai import Agent
from llm import strategy

app = FastAPI()
# pydantic-ai model names take a provider prefix
agent = Agent("openai:gpt-4o")

class ChatRequest(BaseModel):
    prompt: str
    customer_id: str

@app.post("/api/chat")
async def chat(req: ChatRequest):
    # Await the async run: run_sync() cannot be used inside a running event loop
    result = await agent.run(req.prompt)

    # Forward token usage to the meter for billing
    strategy.ingest(req.customer_id, result)

    return {"text": result.output}

Watch it land

Events show up within seconds. Open Usage Based Billing → Meters, pick copilot-tokens, and you’ll see per-customer totals and the running aggregate for the current period. Customers see their own projected charges in the customer portal — no support ticket required.

Billing runs itself

At the close of each period, Macropay:

Reads each customer’s aggregated meter value.
Multiplies it by your metered price.
Charges the payment method on file.
Issues a tax-compliant invoice as Merchant of Record.

That last step matters: because Macropay is the seller of record, the correct sales tax or VAT is calculated, collected, and remitted for you, and it’s our name on the customer’s statement — not yours. No billing cron, no tax engine to maintain.

Protect your margin

Token revenue only helps if it beats token cost. Attach a .cost() callback and Macropay records your upstream spend next to the billed amount:

src/lib/llm.ts

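import { Ingestion } from "@macropayments/ingestion";
import { LLMStrategy } from "@macropayments/ingestion/strategies/LLM";
import { openai } from "@ai-sdk/openai";

export const llmIngestion = Ingestion({
  accessToken: process.env.MACROPAY_ACCESS_TOKEN!,
})
  .strategy(new LLMStrategy(openai("gpt-4o")))
  .cost((ctx) => ({
    // gpt-4o: $2.50 / 1M input, $10 / 1M output. Amounts are in cents,
    // so that is 0.00025 cents per input token and 0.001 per output token.
    // Math.ceil rounds each call up to a whole cent, which overstates tiny calls.
    amount: Math.ceil(ctx.inputTokens * 0.00025 + ctx.outputTokens * 0.001),
    currency: "usd",
  }))
  .ingest("copilot-tokens");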

Open Cost Insights to see margin per customer at a glance, and catch the power user who’s quietly running you into the red.
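To see how the two numbers relate, here is a back-of-the-envelope margin calculation using only the rates quoted in this guide: $0.02 billed per 1,000 tokens, and gpt-4o costing $2.50 per 1M input tokens and $10 per 1M output tokens. It is a sketch for checking your own pricing, not how Cost Insights computes its figures; the `margin` function is a hypothetical name.

```python
from decimal import Decimal

# Rates quoted in this guide, in USD.
PRICE_PER_1K_TOKENS = Decimal("0.02")  # what the customer pays
INPUT_COST_PER_1M = Decimal("2.50")    # gpt-4o input tokens
OUTPUT_COST_PER_1M = Decimal("10")     # gpt-4o output tokens


def margin(input_tokens: int, output_tokens: int) -> dict[str, Decimal]:
    """Return billed revenue, upstream cost, and the difference for one customer."""
    revenue = Decimal(input_tokens + output_tokens) / 1000 * PRICE_PER_1K_TOKENS
    cost = (
        Decimal(input_tokens) * INPUT_COST_PER_1M
        + Decimal(output_tokens) * OUTPUT_COST_PER_1M
    ) / 1_000_000
    return {"revenue": revenue, "cost": cost, "margin": revenue - cost}


result = margin(input_tokens=600_000, output_tokens=200_000)
print(result["revenue"], result["cost"], result["margin"])
```

For 600,000 input and 200,000 output tokens, that works out to $16.00 billed against $3.50 of model spend, a $12.50 margin. Output tokens cost four times as much as input tokens at these rates, so output-heavy customers narrow the margin fastest.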

Selling an autonomous agent rather than a chat box? Bill on outcomes — a resolved ticket, a booked meeting, a closed deal — instead of raw tokens. The Signals API ingests activity and outcome signals and emits value receipts that certify the ROI behind each charge.

Where to go next

Usage-based billing: the full model behind meters, events, and aggregation.
Ingestion strategies: stream, S3, delta-time, and custom event sources.
Prepaid credits: grant credits and draw them down against any meter.
Cost insights: track upstream model spend and per-customer margin.
