Autonomous frameworks like OpenClaw generate 10–100× the traffic of human users—continuously, at enterprise scale. UsageTap wraps your LLM calls so every request is entitlement-checked, metered, rate-limited, and billed. One line of SDK code. Zero custom billing infrastructure.
10–100× more requests: Agentic workloads generate orders-of-magnitude more API calls than human-driven sessions.
$0.03 → $3 per interaction: When your backend fans out to paid LLM inference, margin can erode to zero at agent scale.
∞ burst risk: Without per-tier rate limits, a single runaway agent can consume your entire capacity.
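To make the burst risk concrete, here is a minimal per-tier token bucket, a sketch of the kind of rate limiting that contains a runaway agent. The tier names and non-free limits are hypothetical, not UsageTap's actual values:

```typescript
// Illustrative only: a per-tier token bucket. The "free" burst limit matches
// the Free Sandbox tier (10 req/s); the other limits are hypothetical.
type Tier = "free" | "prepaid" | "payg";

const BURST_LIMITS: Record<Tier, number> = {
  free: 10,     // req/s burst, Free Sandbox
  prepaid: 100, // hypothetical
  payg: 50,     // hypothetical
};

class TokenBucket {
  private tokens: number;
  private last: number;
  constructor(private readonly ratePerSec: number, now = Date.now()) {
    this.tokens = ratePerSec; // start with a full burst allowance
    this.last = now;
  }
  // Returns true if a request may proceed, false if it should get a 429.
  tryAcquire(now = Date.now()): boolean {
    const elapsed = (now - this.last) / 1000;
    this.tokens = Math.min(this.ratePerSec, this.tokens + elapsed * this.ratePerSec);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

const bucket = new TokenBucket(BURST_LIMITS.free, 0);
// At t=0, the first 10 requests pass; the 11th is throttled until tokens refill.
const results = Array.from({ length: 11 }, () => bucket.tryAcquire(0));
```

A runaway agent hitting this bucket gets throttled after its burst allowance, instead of consuming everyone's capacity.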
One line of code. wrapOpenAI or wrapFetch intercepts every vendor call, checks entitlements, selects the right model, and records usage—automatically.
Configure rate limits, token budgets, model access, and tool permissions per plan in the UsageTap dashboard. No code changes required when you adjust a plan.
Usage flows to Stripe in real time. Forecasting flags anomalies before overages surprise your customers. You keep healthy unit economics at any traffic volume.
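The per-plan controls described above could be captured in a policy object along these lines. This is an illustrative shape only, not the actual dashboard schema; every field name and limit here is hypothetical:

```typescript
// Hypothetical shape of a per-plan policy. Field names and values are
// illustrative, not the real UsageTap dashboard schema.
interface PlanPolicy {
  rateLimit: { callsPerDay: number; burstPerSec: number };
  tokenBudget: { maxInputTokens: number; maxOutputTokens: number };
  modelAccess: string[];     // models this tier may use
  toolPermissions: string[]; // tools this tier may invoke
}

const freeSandbox: PlanPolicy = {
  rateLimit: { callsPerDay: 1_000, burstPerSec: 10 },             // matches the Free Sandbox tier
  tokenBudget: { maxInputTokens: 8_000, maxOutputTokens: 2_000 }, // hypothetical
  modelAccess: ["gpt-5-mini"],                                    // hypothetical: budget model only
  toolPermissions: [],                                            // hypothetical: no tools on free tier
};
```

Because the policy lives server-side, changing a tier's limits in the dashboard takes effect on the next request with no redeploy.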
wrapOpenAI gives you the most power in the fewest lines—entitlement gating, model selection, token metering, and billing all happen automatically. wrapFetch does the same at the transport layer for any OpenAI-compatible endpoint. Or go fully explicit with raw HTTP.
import OpenAI from "openai";
import { UsageTapClient } from "@usagetap/sdk";
import { wrapOpenAI } from "@usagetap/sdk/openai";
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
const openai = wrapOpenAI(new OpenAI(), usageTap, {
defaultContext: {
customerId: agent.ownerId,
feature: "agent.run",
requested: { premium: true, search: true },
tags: ["agentic", agent.name],
},
});
// Every call is now automatically:
// ✓ Entitlement-checked (premium vs standard, tool access)
// ✓ Model-selected based on the customer's tier
// ✓ Token-metered (input, output, reasoning, cached)
// ✓ Rate-limited per plan policy
// ✓ Billed to Stripe
const response = await openai.responses.create({
model: "gpt-5", // downgraded to gpt-5-mini if tier requires
input: agentTask.prompt,
tools: [{ type: "web_search" }], // gated: only allowed tiers
});
// That's it. Usage is already recorded. No call_begin/call_end needed.
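To make the transport-layer option concrete, here is a minimal sketch of the pattern a wrapper like wrapFetch follows: intercept the outbound call, gate it on an entitlement, forward it, and record usage. This illustrates the pattern only; it is not the SDK's actual implementation, which also handles model selection, token metering, and billing:

```typescript
// Illustrative pattern only: transport-level gating and metering.
// Types and callbacks here are hypothetical stand-ins, not the SDK API.
type Fetch = (url: string, init?: { body?: string }) => Promise<{ status: number }>;

function wrapTransport(
  inner: Fetch,
  isEntitled: (url: string) => boolean,
  record: (url: string, status: number) => void,
): Fetch {
  return async (url, init) => {
    if (!isEntitled(url)) return { status: 403 }; // entitlement gate
    const res = await inner(url, init);           // forward to the vendor
    record(url, res.status);                      // meter the call
    return res;
  };
}

// Fake transport for demonstration; a real wrapper would wrap global fetch.
const recorded: string[] = [];
const fakeVendor: Fetch = async () => ({ status: 200 });
const wrapped = wrapTransport(
  fakeVendor,
  (url) => url.includes("/v1/"), // hypothetical entitlement rule
  (url) => recorded.push(url),
);
```

Because the interception happens at the transport layer, the same wrapper works for any OpenAI-compatible endpoint without touching call sites.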
Combine pricing with throttling policy so agent traffic is both billable and operationally safe. Configure these in the dashboard—no redeployment.
Free Sandbox
1,000 calls/day · 10 req/s burst
Hard rate limit. Requests above the cap get 429 until the daily window resets. No overage charges.
Best for: Developer evaluation and early agent integrations.
Prepaid Credits
$500 → 25M agent calls
Agents consume prepaid credits first. Account-level throttle protects capacity while budget burns down predictably.
Best for: Procurement-led teams that need committed budget before production.
Pure PAYG
$0 commit · per-call metering
Charge for successful calls only, metered monthly. Tiered rate limits by API key prevent unbounded burst.
Best for: Uncertain traffic profiles from autonomous agent workflows.
Hybrid: Prepaid + PAYG
$2,000 includes 100M calls + overage
Overage rate drops after the prepaid threshold so high-volume agent customers keep scaling while your unit economics stay healthy.
Best for: Enterprise agent deployments with baseline + spike demand.
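On the hard-limited tiers, client code should expect 429s and back off rather than hammer the cap. A simple sketch, honoring a Retry-After hint when the server sends one and otherwise falling back to capped exponential delay; the retry policy and numbers are illustrative, not SDK defaults:

```typescript
// Illustrative client-side backoff for 429 responses from a hard-limited
// tier. Delays and attempt counts are hypothetical, not SDK behavior.
function backoffMs(attempt: number, retryAfterSec?: number): number {
  if (retryAfterSec !== undefined) return retryAfterSec * 1000; // server hint wins
  const base = 500 * 2 ** attempt; // 500ms, 1s, 2s, ...
  return Math.min(base, 30_000);   // cap at 30s
}

async function callWithRetry(
  doCall: () => Promise<{ status: number; retryAfterSec?: number }>,
  maxAttempts = 5,
): Promise<{ status: number }> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await doCall();
    if (res.status !== 429) return res;
    await new Promise((r) => setTimeout(r, backoffMs(attempt, res.retryAfterSec)));
  }
  return { status: 429 }; // daily window still exhausted; surface to the caller
}
```

For the Free Sandbox's daily cap, retries within the same window will keep returning 429, so agents should treat a persistent 429 as a signal to pause until the window resets rather than retry indefinitely.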