Quickstart Guide
Add usage tracking to your AI app in under 10 minutes.
1 Get your API key
Create an API key in Configure and keep it server-side only.
Security Notice
Never expose USAGETAP_API_KEY in the browser or commit it to git.
2 Install the SDK
Install the package with your preferred manager:
npm install @usagetap/sdk
# or
pnpm add @usagetap/sdk
# or
yarn add @usagetap/sdk
SDK enforces server-only usage (throws if imported client-side).
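As a rough illustration of how such a guard can work (the SDK's actual check may differ), a module can fail fast when it detects a browser environment at load time. The function name `assertServerOnly` is hypothetical:

```typescript
// Sketch of a server-only guard: throw as soon as the module is
// evaluated in a browser-like environment (window + document present).
function assertServerOnly(moduleName: string): void {
  const g = globalThis as { window?: unknown; document?: unknown };
  if (typeof g.window !== 'undefined' && typeof g.document !== 'undefined') {
    throw new Error(
      `${moduleName} is server-only and must not be imported in the browser`,
    );
  }
}

// Called once at module load time, e.g.:
assertServerOnly('@usagetap/sdk'); // no-op in Node, throws in a browser
```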
3 Configure environment variables
Add a .env.local file:
# .env.local
USAGETAP_API_KEY=ut_example_server_key
OPENAI_API_KEY=sk-proj-example-openai-key
See .env.example for more options.
4 Add tracking
Pick one approach below:
Option A: withUsage + entitlements (recommended)
Call UsageTap first, then choose the OpenAI model, reasoning effort, and search tools from the returned allowances.
// app/api/chat/route.ts
import OpenAI from 'openai';
import { UsageTapClient } from '@usagetap/sdk';
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });
function selectCapabilities(allowed: {
standard?: boolean;
premium?: boolean;
reasoningLevel?: 'LOW' | 'MEDIUM' | 'HIGH' | null;
search?: boolean;
}) {
const tier = allowed.premium ? 'premium' : 'standard';
const model = tier === 'premium' ? 'gpt5' : 'gpt5-mini';
const reasoningEffort =
allowed.reasoningLevel === 'HIGH'
? 'high'
: allowed.reasoningLevel === 'MEDIUM'
? 'medium'
: allowed.reasoningLevel === 'LOW'
? 'low'
: undefined;
return {
model,
reasoning: reasoningEffort ? { effort: reasoningEffort } : undefined,
tools: allowed.search ? [{ type: 'web_search' as const }] : undefined,
};
}
export async function POST(req: Request) {
const { messages } = await req.json();
const prompt = messages
.map((m: { role: string; content: string }) => `${m.role}: ${m.content}`)
.join('\n');
const completion = await usageTap.withUsage(
{
customerId: 'quickstart-user',
feature: 'chat.send',
requested: { standard: true, premium: true, search: true, reasoningLevel: 'HIGH' },
},
async ({ begin, setUsage }) => {
const { model, reasoning, tools } = selectCapabilities(begin.data.allowed);
const response = await openai.responses.create({
model,
input: prompt,
reasoning,
tools,
});
setUsage({
modelUsed: model,
inputTokens: response.usage?.input_tokens ?? response.usage?.prompt_tokens ?? 0,
responseTokens: response.usage?.output_tokens ?? response.usage?.completion_tokens ?? 0,
reasoningTokens: reasoning ? response.usage?.reasoning_tokens ?? 0 : 0,
searches: tools?.length ? response.usage?.web_search_queries ?? 0 : 0,
});
return response;
},
);
return new Response(completion.output_text ?? '', {
headers: { 'Content-Type': 'text/plain' },
});
}
Option B: wrapOpenAI (zero-boilerplate)
wrapOpenAI automatically applies entitlement-aware defaults. It picks gpt5 (premium) or gpt5-mini (standard) based on begin.data.allowed.
// app/api/chat/route.ts - Minimal wrapOpenAI
import OpenAI from 'openai';
import { UsageTapClient, wrapOpenAI } from '@usagetap/sdk';
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });
const ai = wrapOpenAI(openai, usageTap, {
defaultContext: {
customerId: 'quickstart-user',
feature: 'chat.send',
requested: { standard: true, premium: true, search: true, reasoningLevel: 'HIGH' },
},
});
export async function POST(req: Request) {
const { messages } = await req.json();
// wrapOpenAI automatically selects gpt5 (premium) or gpt5-mini (standard)
// based on begin.data.allowed
const completion = await ai.chat.completions.create(
{ messages },
{
usageTap: {
requested: { standard: true, premium: true, search: true, reasoningLevel: 'MEDIUM' },
},
},
);
return new Response(completion.choices[0]?.message?.content ?? '', {
headers: { 'Content-Type': 'text/plain' },
});
}
Replace quickstart-user with a real authenticated user id.
5 Test locally
Start dev server and send a test request:
curl -X POST http://localhost:3000/api/chat \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"Hello!"}]}'
Look for the model's reply in the response body and UsageTap activity in your server logs.
Verify in dashboard
- Real-time token & cost metrics
- Per-customer feature usage
- Quota and plan attribution
- Error vs success breakdown
Success!
You're now tracking AI usage with UsageTap.
Next steps
📚 SDK Documentation
Explore wrapOpenAI, Express middleware, React hooks, streaming helpers, checkUsage(), and advanced overrides.
🤖 LLM Prompt Kits
Copy-paste instructions for Cursor / Copilot to scaffold integrations fast.
📊 Usage Plans
Configure quota tiers and feature flags for customers.
💬 Get Help
Questions? Reach out via support or community.
Check usage without creating a call
Need to display current quota status or plan details without tracking a vendor call? Use checkUsage():
const usageStatus = await usageTap.checkUsage({ customerId: "cust_123" });
console.log("Meters:", usageStatus.data.meters);
console.log("Allowed:", usageStatus.data.allowed);
console.log("Plan:", usageStatus.data.plan);
console.log("Balances:", usageStatus.data.balances);
Returns the same rich snapshot as call_begin (meters, entitlements, subscription details, plan info, balances) but without creating a call record. Perfect for dashboard widgets, pre-flight checks, or displaying quota status.
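For instance, a pre-flight check can decide whether to let a customer send another message before any vendor call is made. This sketch assumes the `allowed` and `meters` shapes from the snapshot; the helper name `canSendMessage` and the `standardCalls` meter key are illustrative:

```typescript
// Shapes mirroring the snapshot's allowed/meters fields (illustrative).
type MeterSnapshot = {
  remaining: number | null;
  limit: number | null;
  unlimited: boolean;
};

type UsageSnapshot = {
  allowed: { standard?: boolean; premium?: boolean };
  meters: Record<string, MeterSnapshot>;
};

// Pre-flight check: can this customer still make a standard call?
function canSendMessage(snapshot: UsageSnapshot): boolean {
  if (!snapshot.allowed.standard) return false;
  const calls = snapshot.meters['standardCalls'];
  if (!calls) return true; // no meter configured for this plan
  return calls.unlimited || (calls.remaining ?? 0) > 0;
}
```

You would feed it `usageStatus.data` from `checkUsage()` and, say, disable the send button when it returns false.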
`call_begin` envelope (live response)
UsageTap now responds with the canonical { result, data, correlationId } envelope. The SDK sends the required Accept: application/vnd.usagetap.v1+json header for you:
{
"result": {
"status": "ACCEPTED",
"code": "CALL_BEGIN_SUCCESS",
"timestamp": "2025-10-04T18:21:37.482Z"
},
"data": {
"callId": "call_123",
"startTime": "2025-10-04T18:21:37.482Z",
"policy": "DOWNGRADE",
"newCustomer": false,
"canceled": false,
"allowed": {
"standard": true,
"premium": true,
"audio": false,
"image": false,
"search": true,
"reasoningLevel": "MEDIUM"
},
"entitlementHints": {
"suggestedModelTier": "standard",
"reasoningLevel": "MEDIUM",
"policy": "DOWNGRADE",
"downgrade": {
"reason": "PREMIUM_QUOTA_EXHAUSTED",
"fallbackTier": "standard"
}
},
"meters": {
"standardCalls": {
"remaining": 12,
"limit": 20,
"used": 8,
"unlimited": false,
"ratio": 0.6
},
"premiumCalls": {
"remaining": null,
"limit": null,
"used": null,
"unlimited": true,
"ratio": null
},
"standardTokens": {
"remaining": 800,
"limit": 1000,
"used": 200,
"unlimited": false,
"ratio": 0.8
}
},
"remainingRatios": {
"standardCalls": 0.6,
"standardTokens": 0.8
},
"subscription": {
"id": "sub_123",
"usagePlanVersionId": "plan_2025_01",
"planName": "Pro",
"planVersion": "2025-01",
"limitType": "DOWNGRADE",
"reasoningLevel": "MEDIUM",
"lastReplenishedAt": "2025-10-04T00:00:00.000Z",
"nextReplenishAt": "2025-11-04T00:00:00.000Z",
"subscriptionVersion": 14
},
"models": {
"standard": ["gpt5-mini"],
"premium": ["gpt5"]
},
"idempotency": {
"key": "call_123",
"source": "derived"
}
},
"correlationId": "corr_abc123"
}
The envelope now includes richer metadata: entitlementHints summarizes the recommended model tier and downgrade rationale, meters provides per-counter snapshots with remaining quotas and ratios, remainingRatios offers compact lookups, subscription contains plan identity and replenishment timestamps, and models surfaces vendor hints (standard vs premium model shortlists). data.idempotency.key always matches callId. When you omit idempotency in the request, the backend derives a deterministic hash from organization, customer, feature, and requested entitlements.
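The backend's exact hash inputs and algorithm are not documented, but the idea can be sketched as hashing a canonical serialization of those four fields, so identical requests always map to the same key. SHA-256 over sorted JSON is an assumption here, purely for illustration:

```typescript
import { createHash } from 'node:crypto';

// Illustrative only: a deterministic key from org + customer + feature
// + requested entitlements. Key order is normalized so that logically
// equal `requested` objects serialize identically.
function deriveIdempotencyKey(input: {
  organizationId: string;
  customerId: string;
  feature: string;
  requested: Record<string, unknown>;
}): string {
  const canonical = JSON.stringify({
    org: input.organizationId,
    customer: input.customerId,
    feature: input.feature,
    requested: Object.fromEntries(
      Object.entries(input.requested).sort(([a], [b]) => a.localeCompare(b)),
    ),
  });
  return createHash('sha256').update(canonical).digest('hex');
}
```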
Unified `/call` shortcut
Prefer a one-shot API flow without the SDK? Call the public `/call` endpoint: UsageTap runs begin → optional vendor call → end for you and returns both envelopes in one response.
const baseUrl = process.env.USAGETAP_BASE_URL ?? "https://api.usagetap.com";
const res = await fetch(`${baseUrl}/call`, {
method: 'POST',
headers: {
Authorization: 'Bearer ' + process.env.USAGETAP_API_KEY,
Accept: 'application/vnd.usagetap.v1+json',
'Content-Type': 'application/json',
},
body: JSON.stringify({
customerId: 'cust_demo',
requested: { standard: true },
feature: 'chat.completions',
vendor: {
url: 'https://api.openai.com/v1/chat/completions',
method: 'POST',
headers: {
Authorization: 'Bearer ' + process.env.OPENAI_API_KEY,
'Content-Type': 'application/json',
},
body: {
model: 'gpt-4o-mini',
messages: [{ role: 'user', content: 'Hi!' }],
},
responseType: 'json',
},
}),
});
const envelope = await res.json();
if (envelope.result.code !== 'CALL_SUCCESS') {
console.error('UsageTap call failed', envelope);
}
Omit the vendor block if you just want begin → end with your own usage numbers. Vendor errors downgrade to `CALL_VENDOR_WARNING`, but UsageTap still records the end-of-call telemetry so quotas stay accurate.
`call_end` success response
When you finalize usage, the envelope mirrors the same structure and surfaces a `metered` summary derived from the Dynamo counters:
{
"result": {
"status": "ACCEPTED",
"code": "CALL_END_SUCCESS",
"timestamp": "2025-10-04T18:21:52.103Z"
},
"data": {
"callId": "call_123",
"costUSD": 0,
"metered": {
"tokens": 768,
"calls": 1,
"searches": 1
}
},
"correlationId": "corr_abc123"
}
metered is derived from the raw Dynamo deltas and reports the amounts consumed. Additional meters (audio seconds, reasoning tokens, balances) will populate in later phases without breaking the contract. BLOCK policy violations still return HTTP 429 with an error envelope.
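A caller therefore sees three outcomes: success, a BLOCK-policy rejection (HTTP 429), or some other error envelope. A minimal sketch of that branching, assuming the envelope shape shown above (the specific error codes are not documented here):

```typescript
// Minimal envelope shape, matching the { result, data, correlationId }
// examples in this guide.
type Envelope = { result: { status: string; code: string } };

// Classify a response: 429 means the BLOCK policy rejected the call
// (quota exhausted); otherwise trust the envelope's result.status.
function describeCallOutcome(
  httpStatus: number,
  envelope: Envelope,
): 'ok' | 'blocked' | 'error' {
  if (httpStatus === 429) return 'blocked';
  return envelope.result.status === 'ACCEPTED' ? 'ok' : 'error';
}
```

On 'blocked' you might surface an upgrade prompt to the customer instead of retrying.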
Premium detection: UsageTap automatically classifies calls as premium when the model's output token price exceeds $4 per million. You can override this by passing isPremium: true or isPremium: false in your call_end request.
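The rule above reduces to a small pure function. The price table below is illustrative only (not official vendor pricing), and the helper name is an assumption:

```typescript
// Documented threshold: output-token price above $4 per million → premium.
const PREMIUM_OUTPUT_PRICE_THRESHOLD = 4;

// Illustrative prices per million output tokens — substitute your
// vendor's real pricing.
const OUTPUT_PRICE_PER_MILLION: Record<string, number> = {
  gpt5: 10,
  'gpt5-mini': 0.6,
};

// An explicit isPremium override (as in the call_end request) wins
// over the automatic price-based rule.
function classifyPremium(model: string, isPremium?: boolean): boolean {
  if (isPremium !== undefined) return isPremium;
  return (OUTPUT_PRICE_PER_MILLION[model] ?? 0) > PREMIUM_OUTPUT_PRICE_THRESHOLD;
}
```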