UsageTap Integration Guide for AI Coding Tools
Quick Start for LLM Agents: This guide provides everything needed to integrate UsageTap into applications in a single pass. Copy the relevant code snippet for your stack, replace the placeholders, and you're done.
Table of Contents
- Overview
- Installation
- One-Shot Integration Examples
- Idempotency Best Practices
- API Reference
- Error Handling
Overview
UsageTap tracks LLM API usage and enforces quotas automatically. The flow is simple:
- Begin Call: Request entitlements for a customer/feature
- Execute LLM Call: Use the allowed capabilities from step 1
- End Call: Report actual usage back to UsageTap
The SDK handles retries, idempotency, and automatic usage tracking for you.
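The three steps above can be sketched end to end. This is a minimal sketch against a structural client interface, using the beginCall/endCall field names documented in the API Reference section of this guide; TapLike and trackedCall are illustrative names, not SDK exports.

```typescript
// Minimal structural types for the two calls. Field names follow the API
// Reference in this guide; the real SDK exports richer types.
interface BeginResult {
  data: { callId: string; allowed: { premium: boolean } };
}
interface UsageReport {
  callId: string;
  modelUsed: string;
  inputTokens: number;
  responseTokens: number;
  error?: { code: string; message: string };
}
interface TapLike {
  beginCall(req: { customerId: string; feature: string }): Promise<BeginResult>;
  endCall(req: UsageReport): Promise<void>;
}

// Begin → execute the LLM call → end, reporting an error if the call throws.
async function trackedCall(
  tap: TapLike,
  customerId: string,
  run: (model: string) => Promise<{ text: string; inputTokens: number; responseTokens: number }>
): Promise<string> {
  const begin = await tap.beginCall({ customerId, feature: "chat.completions" });
  // Select the model tier from the entitlements granted in step 1.
  const model = begin.data.allowed.premium ? "gpt-4o" : "gpt-4o-mini";
  try {
    const out = await run(model);
    await tap.endCall({
      callId: begin.data.callId,
      modelUsed: model,
      inputTokens: out.inputTokens,
      responseTokens: out.responseTokens,
    });
    return out.text;
  } catch (err) {
    // Still close the call so the hold is released, reporting the failure.
    await tap.endCall({
      callId: begin.data.callId,
      modelUsed: model,
      inputTokens: 0,
      responseTokens: 0,
      error: { code: "LLM_ERROR", message: String(err) },
    });
    throw err;
  }
}
```

In practice you rarely write this by hand: withUsage and the wrapped OpenAI client shown below implement this lifecycle for you.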
Installation
npm install @usagetap/sdk openai
Environment Variables Required:
USAGETAP_API_KEY=your_api_key_here
USAGETAP_BASE_URL=https://api.usagetap.com/
OPENAI_API_KEY=sk-...
One-Shot Integration Examples
Next.js App Router
File: app/api/chat/route.ts
import { NextRequest } from "next/server";
import OpenAI from "openai";
import { UsageTapClient } from "@usagetap/sdk";
import { wrapOpenAI, toNextResponse } from "@usagetap/sdk/openai";
// Initialize clients
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY!,
});
// Wrap OpenAI client with UsageTap tracking
const ai = wrapOpenAI(openai, usageTap, {
defaultContext: {
// Set defaults that apply to all calls
feature: "chat.completions",
requested: {
standard: true,
premium: true,
search: true,
reasoningLevel: "HIGH",
},
},
});
export async function POST(req: NextRequest) {
try {
// Get user ID from your auth system
const userId = req.headers.get("x-user-id") || "anonymous";
const { messages } = await req.json();
// Make streaming LLM call with automatic usage tracking
const stream = await ai.chat.completions.create(
{
messages,
stream: true,
// model is optional - UsageTap selects based on entitlements
},
{
usageTap: {
customerId: userId,
// Generate unique idempotency key for this request
idempotencyKey: crypto.randomUUID(),
},
}
);
// Return streaming response
return toNextResponse(stream, { mode: "text" });
} catch (error) {
console.error("Chat error:", error);
return new Response(
JSON.stringify({ error: "Failed to process chat request" }),
{ status: 500, headers: { "Content-Type": "application/json" } }
);
}
}
Key Features:
- ✅ Automatic model selection based on customer's plan
- ✅ Usage tracking with zero boilerplate
- ✅ Idempotent requests with idempotencyKey
- ✅ Streaming responses
Express.js Server
File: server.ts
import express from "express";
import OpenAI from "openai";
import { UsageTapClient } from "@usagetap/sdk";
import { withUsage } from "@usagetap/sdk/express";
const app = express();
app.use(express.json());
// Initialize UsageTap client
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
// Attach UsageTap context to all requests
app.use(
withUsage(usageTap, (req) => {
// Extract customer ID from your auth middleware
return req.user?.id || "anonymous";
})
);
// Chat endpoint with automatic tracking
app.post("/api/chat", async (req, res) => {
try {
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });
// Get UsageTap-wrapped OpenAI client from request
const ai = req.usageTap!.openai(openai, {
feature: "chat.assistant",
requested: {
standard: true,
premium: true,
search: true,
reasoningLevel: "HIGH",
},
});
const stream = await ai.chat.completions.create(
{
messages: req.body.messages,
stream: true,
},
{
usageTap: {
// Auto-generate idempotency key
idempotencyKey: crypto.randomUUID(),
},
}
);
// Pipe stream to response and finalize usage automatically
req.usageTap!.pipeToResponse(stream, res);
} catch (error) {
console.error("Chat error:", error);
res.status(500).json({ error: "Failed to process chat request" });
}
});
app.listen(3000, () => {
console.log("Server running on http://localhost:3000");
});
Key Features:
- ✅ Middleware attaches UsageTap to every request
- ✅ Extract customer ID once from auth system
- ✅ Automatic usage tracking and streaming
Node.js Script
File: generate-summary.ts
import OpenAI from "openai";
import { UsageTapClient } from "@usagetap/sdk";
import crypto from "crypto";
// Initialize clients
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
});
const openai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY!,
});
async function generateSummary(customerId: string, text: string) {
// High-level helper that handles begin → call → end automatically
const completion = await usageTap.withUsage(
{
customerId,
feature: "summarization",
requested: {
standard: true,
premium: true,
search: false,
reasoningLevel: "LOW",
},
// Generate idempotency key for safe retries
idempotencyKey: crypto.randomUUID(),
},
async ({ begin, setUsage }) => {
// Select model based on what customer is allowed
const model = begin.data.allowed.premium ? "gpt-4o" : "gpt-4o-mini";
// Make LLM call
const response = await openai.chat.completions.create({
model,
messages: [
{
role: "system",
content: "Summarize the following text concisely.",
},
{
role: "user",
content: text,
},
],
});
// Report usage back to UsageTap
setUsage({
modelUsed: model,
inputTokens: response.usage?.prompt_tokens ?? 0,
responseTokens: response.usage?.completion_tokens ?? 0,
});
return response.choices[0].message.content;
}
);
return completion;
}
// Example usage
generateSummary("customer_123", "Long document text here...")
.then((summary) => console.log("Summary:", summary))
.catch((error) => console.error("Error:", error));
Key Features:
- ✅ withUsage handles the entire lifecycle
- ✅ Automatic error handling and usage reporting
- ✅ Works in any Node.js environment
React Chat UI
File: components/Chat.tsx
import { useChatWithUsage } from "@usagetap/sdk/react";
interface ChatProps {
userId: string;
}
export function Chat({ userId }: ChatProps) {
const { messages, input, setInput, handleSubmit, isLoading, error } =
useChatWithUsage({
api: "/api/chat", // Your Next.js API route
customerId: userId,
feature: "chat.assistant",
});
return (
<div className="chat-container">
{/* Messages display */}
<div className="messages">
{messages.map((m) => (
<div key={m.id} className={`message ${m.role}`}>
<strong>{m.role}:</strong>
<p>{m.content}</p>
</div>
))}
</div>
{/* Error display */}
{error && (
<div className="error">
Error: {error.message}
</div>
)}
{/* Input form */}
<form onSubmit={handleSubmit} className="input-form">
<input
value={input}
onChange={(e) => setInput(e.target.value)}
placeholder="Type a message..."
disabled={isLoading}
/>
<button type="submit" disabled={isLoading}>
{isLoading ? "Sending..." : "Send"}
</button>
</form>
</div>
);
}
Key Features:
- ✅ React hook manages entire chat state
- ✅ Automatic usage tracking through API endpoint
- ✅ Loading and error states included
Idempotency Best Practices
Idempotency ensures safe retries without duplicate charges. UsageTap supports three approaches:
1. Explicit Idempotency Keys (Recommended)
Generate a unique key per logical operation:
import crypto from "crypto";
await usageTap.beginCall({
customerId: "cust_123",
feature: "chat.completions",
// Generate unique key for this request
idempotencyKey: crypto.randomUUID(),
requested: {
standard: true,
premium: true,
},
});
When to use:
- API endpoints that may be retried by clients
- Background jobs that might restart
- Critical operations requiring duplicate prevention
2. Deterministic Keys
Use request-specific data to generate consistent keys:
import crypto from "crypto";
function generateIdempotencyKey(
userId: string,
sessionId: string,
messageId: string
): string {
const data = `${userId}:${sessionId}:${messageId}`;
return crypto.createHash("sha256").update(data).digest("hex");
}
await usageTap.beginCall({
customerId: userId,
feature: "chat.send",
idempotencyKey: generateIdempotencyKey(userId, sessionId, messageId),
requested: { standard: true, premium: true },
});
When to use:
- Multi-step workflows where steps might retry
- Distributed systems with at-least-once delivery
- Request IDs already exist in your system
3. Auto-Generated Keys (Default)
Omit idempotencyKey and let UsageTap derive one:
await usageTap.beginCall({
customerId: "cust_123",
feature: "chat.completions",
// No idempotencyKey - UsageTap generates deterministically
requested: { standard: true, premium: true },
});
How it works:
- UsageTap hashes orgId + customerId + feature + requested entitlements
- Same inputs return the same callId
- Great for bulk operations with natural deduplication
When to use:
- Simple scripts without explicit request IDs
- Internal tools where deduplication by inputs is acceptable
- Testing and development
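The derivation described above can be modeled with a short sketch. This is a hypothetical reconstruction — the server-side hash is an implementation detail you never need to reproduce — but it shows why identical inputs deduplicate to the same callId:

```typescript
import { createHash } from "crypto";

// Hypothetical model of the auto-key derivation: hash the call's identifying
// inputs so that the same inputs always map to the same key.
function deriveAutoKey(
  orgId: string,
  customerId: string,
  feature: string,
  requested: Record<string, boolean | string>
): string {
  // Sort entitlement keys so property order never changes the hash.
  const entitlements = Object.keys(requested)
    .sort()
    .map((k) => `${k}=${requested[k]}`)
    .join("&");
  return createHash("sha256")
    .update(`${orgId}:${customerId}:${feature}:${entitlements}`)
    .digest("hex");
}
```

The practical consequence: two auto-keyed beginCall requests with identical inputs count as one call, so pass an explicit idempotencyKey whenever distinct requests can share the same customer, feature, and entitlements.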
4. Header-Based Idempotency
For raw HTTP requests, use the Idempotency-Key header:
const response = await fetch(`${baseUrl}/call_begin`, {
method: "POST",
headers: {
"Authorization": `Bearer ${apiKey}`,
"Content-Type": "application/json",
"Idempotency-Key": crypto.randomUUID(),
"Accept": "application/vnd.usagetap.v1+json",
},
body: JSON.stringify({
customerId: "cust_123",
feature: "chat.completions",
requested: { standard: true, premium: true },
holdUsd: 0.05,
}),
});
Best Practice Summary
| Scenario | Recommended Approach | Example |
|---|---|---|
| API endpoints | Explicit crypto.randomUUID() | idempotencyKey: crypto.randomUUID() |
| Background jobs | Deterministic from job ID | idempotencyKey: `job_${jobId}` |
| Webhooks | Use webhook event ID | idempotencyKey: event.id |
| Bulk operations | Auto-generated (omit key) | No idempotencyKey field |
| Testing | Fixed string for reproducibility | idempotencyKey: "test-scenario-1" |
API Reference
Core Methods
beginCall(request, options?)
Start a usage tracking session and get entitlements.
Request:
{
customerId: string; // Required: Your customer's ID
feature?: string; // Optional: Feature being accessed
requested?: { // Optional: Requested capabilities
standard?: boolean; // Access to standard models
premium?: boolean; // Access to premium models
search?: boolean; // Web search capability
reasoningLevel?: "NONE" | "LOW" | "MEDIUM" | "HIGH";
};
idempotencyKey?: string; // Optional: Unique key for safe retries
tags?: string[]; // Optional: Tags for analytics
}
Response:
{
result: { status: "ACCEPTED", code: string, timestamp: string },
data: {
callId: string; // Use this in endCall()
allowed: { // What customer can actually use
standard: boolean;
premium: boolean;
search: boolean;
reasoningLevel: "NONE" | "LOW" | "MEDIUM" | "HIGH";
};
entitlementHints: {
suggestedModelTier: "premium" | "standard" | "none";
policy: "NONE" | "BLOCK" | "DOWNGRADE";
};
subscription: {
planName: string;
limitType: string;
// ... more subscription details
};
meters: { // Current usage levels
[meterName: string]: {
remaining: number | null;
limit: number | null;
used: number;
unlimited: boolean;
ratio: number | null;
};
};
},
correlationId: string;
}
endCall(request, options?)
Report actual usage for a call.
Request:
{
callId: string; // Required: From beginCall response
modelUsed?: string; // Model identifier (e.g., "gpt-4o")
inputTokens?: number; // Prompt tokens
responseTokens?: number; // Completion tokens
reasoningTokens?: number; // Reasoning tokens (o1 models)
searches?: number; // Number of web searches
audioSeconds?: number; // Audio processing time
error?: { // If call failed
code: string;
message: string;
};
}
Response:
{
result: { status: "ACCEPTED", code: string, timestamp: string },
data: {
callId: string;
costUSD: number; // Calculated cost
metered: { // Usage that was counted
tokens: number;
calls: number;
searches: number;
};
balances: { // Remaining quotas
tokensRemaining: number;
searchesRemaining: number;
};
},
correlationId: string;
}
withUsage(request, handler, options?)
High-level helper that handles entire lifecycle:
const result = await usageTap.withUsage(
{
customerId: "cust_123",
feature: "chat.send",
idempotencyKey: crypto.randomUUID(),
requested: { standard: true, premium: true },
},
async ({ begin, setUsage, setError }) => {
// Your LLM call here
const response = await openai.chat.completions.create({
model: begin.data.allowed.premium ? "gpt-4o" : "gpt-4o-mini",
messages: [{ role: "user", content: "Hello" }],
});
// Report usage
setUsage({
modelUsed: response.model,
inputTokens: response.usage?.prompt_tokens ?? 0,
responseTokens: response.usage?.completion_tokens ?? 0,
});
return response.choices[0].message.content;
}
);
Automatic behavior:
- Calls beginCall before the handler
- Calls endCall after the handler (even if it throws)
- Captures errors and reports them
- Returns the handler result
createCustomer(request, options?)
Ensure a customer subscription exists before making calls.
Request:
{
customerId: string; // Required: Your customer's ID
customerFriendlyName?: string; // Optional but strongly recommended: display name (aka customerName)
customerEmail?: string; // Optional but strongly recommended: enables billing notifications
stripeCustomerId?: string; // Link to Stripe customer
}
Response:
{
result: { status: "ACCEPTED", ... },
data: {
customerId: string;
newCustomer: boolean; // true if just created
allowed: { ... }; // Current entitlements
subscription: { ... }; // Subscription details
plan: { ... }; // Active plan info
},
correlationId: string;
}
Idempotent: Safe to call multiple times. Returns newCustomer: false if customer already exists.
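Because the endpoint is idempotent, an ensure-on-every-login pattern is safe. A minimal sketch against a structural client interface (ensureCustomer and CustomerClient are illustrative names, not SDK exports):

```typescript
// Structural view of the createCustomer call, following the request and
// response shapes in this guide.
interface CustomerClient {
  createCustomer(req: {
    customerId: string;
    customerFriendlyName?: string;
    customerEmail?: string;
  }): Promise<{ data: { customerId: string; newCustomer: boolean } }>;
}

// Run this before a user's first beginCall (e.g. on login or signup).
// Repeat calls are harmless: only the first returns newCustomer: true.
async function ensureCustomer(
  client: CustomerClient,
  customerId: string,
  name?: string,
  email?: string
): Promise<boolean> {
  const res = await client.createCustomer({
    customerId,
    customerFriendlyName: name,
    customerEmail: email,
  });
  return res.data.newCustomer; // true only on first creation
}
```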
changePlan(request, options?)
Switch a customer to a different usage plan.
Request:
{
customerId: string; // Required
planId: string; // Required: Target plan ID
strategy?: "IMMEDIATE_RESET" | "IMMEDIATE_PRORATED" | "AT_NEXT_REPLENISH";
}
Strategy options:
- IMMEDIATE_RESET: Switch now, reset usage to zero
- IMMEDIATE_PRORATED: Switch now, prorate existing usage
- AT_NEXT_REPLENISH: Schedule the change for the next billing cycle (default)
checkUsage(request, options?)
Query current usage status without creating a call.
Request:
{
customerId: string; // Required
}
Response:
Same structure as beginCall but without creating a call record. Use for dashboard widgets or pre-flight checks.
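The meters object returned here (and by beginCall) can drive such a pre-flight check. A sketch with illustrative thresholds — the 90% warning cutoff is an assumption for the example, not part of the API:

```typescript
// Shape of one meter from checkUsage/beginCall responses (see API Reference).
interface Meter {
  remaining: number | null;
  limit: number | null;
  used: number;
  unlimited: boolean;
  ratio: number | null;
}

// Pre-flight gate: block when any metered quota is exhausted, warn when one
// is nearly so. The 0.9 threshold is an example choice.
function gateFromMeters(meters: Record<string, Meter>): "ok" | "warn" | "block" {
  let status: "ok" | "warn" | "block" = "ok";
  for (const meter of Object.values(meters)) {
    if (meter.unlimited) continue;
    if (meter.remaining !== null && meter.remaining <= 0) return "block";
    if (meter.ratio !== null && meter.ratio >= 0.9) status = "warn";
  }
  return status;
}
```

A dashboard widget can surface "warn" as an upgrade prompt; a "block" result lets you short-circuit before paying for an LLM call that UsageTap would reject anyway.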
Error Handling
All SDK methods throw UsageTapError on failure:
import { UsageTapError, isUsageTapError } from "@usagetap/sdk";
try {
await usageTap.beginCall({
customerId: "cust_123",
idempotencyKey: crypto.randomUUID(),
});
} catch (error) {
if (isUsageTapError(error)) {
console.error("UsageTap error:", {
code: error.code, // Error code (e.g., "USAGETAP_RATE_LIMITED")
message: error.message, // Human-readable message
retryable: error.retryable, // Whether retry might succeed
correlationId: error.correlationId, // For support
details: error.details, // Additional context
});
if (error.retryable) {
// Retry with exponential backoff
} else {
// Permanent error - handle gracefully
}
} else {
// Non-UsageTap error
console.error("Unexpected error:", error);
}
}
Common Error Codes:
- USAGETAP_AUTH_ERROR: Invalid API key
- USAGETAP_RATE_LIMITED: Too many requests (retryable)
- USAGETAP_BAD_REQUEST: Invalid request parameters
- USAGETAP_SERVER_ERROR: Server issue (retryable)
- USAGETAP_NETWORK_ERROR: Network failure (retryable)
Automatic Retries: The SDK retries transient errors automatically with exponential backoff. Configure retry behavior:
const usageTap = new UsageTapClient({
apiKey: process.env.USAGETAP_API_KEY!,
baseUrl: process.env.USAGETAP_BASE_URL!,
retries: {
maxAttempts: 3, // Default: 3
baseDelayMs: 250, // Default: 250ms
maxDelayMs: 5000, // Default: 5000ms
jitterRatio: 0.2, // Default: 0.2 (20% jitter)
},
});
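If you add your own retry around higher-level operations (on top of the SDK's built-in retries), the same backoff shape applies. A sketch, with jitter omitted for brevity; isRetryable stands in for isUsageTapError(err) && err.retryable:

```typescript
// Application-level retry with exponential backoff, mirroring the SDK's
// retry parameters above. Usually the SDK's built-in retries are enough;
// use this only when wrapping larger units of work.
async function withBackoff<T>(
  fn: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxAttempts = 3,
  baseDelayMs = 250,
  maxDelayMs = 5000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Give up on permanent errors or when attempts are exhausted.
      if (!isRetryable(err) || attempt === maxAttempts - 1) throw err;
      // Exponential backoff: 250ms, 500ms, 1000ms, ... capped at maxDelayMs.
      const delay = Math.min(baseDelayMs * 2 ** attempt, maxDelayMs);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError; // unreachable; satisfies the type checker
}
```

Reuse the same idempotencyKey across these retries so a retried beginCall never double-counts.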
Quick Reference: Entitlement Mapping
How to use begin.data.allowed to configure LLM calls:
const begin = await usageTap.beginCall({
customerId: "cust_123",
idempotencyKey: crypto.randomUUID(),
requested: {
standard: true,
premium: true,
search: true,
reasoningLevel: "HIGH",
},
});
// Select model tier
const model = begin.data.allowed.premium ? "gpt-4o" : "gpt-4o-mini";
// Configure reasoning effort (for o1 models)
const reasoningEffort = (() => {
switch (begin.data.allowed.reasoningLevel) {
case "HIGH": return "high";
case "MEDIUM": return "medium";
case "LOW": return "low";
default: return undefined;
}
})();
// Enable web search if allowed
const tools = begin.data.allowed.search
? [{ type: "web_search" }]
: undefined;
// Make the call
const response = await openai.chat.completions.create({
model,
messages: [...],
reasoning_effort: reasoningEffort,
tools,
});
Migration from Direct OpenAI Usage
Before (Direct OpenAI):
const response = await openai.chat.completions.create({
model: "gpt-4o",
messages: [{ role: "user", content: "Hello" }],
});
After (With UsageTap):
const completion = await usageTap.withUsage(
{
customerId: userId,
feature: "chat.send",
idempotencyKey: crypto.randomUUID(),
requested: { standard: true, premium: true },
},
async ({ begin, setUsage }) => {
const model = begin.data.allowed.premium ? "gpt-4o" : "gpt-4o-mini";
const response = await openai.chat.completions.create({
model,
messages: [{ role: "user", content: "Hello" }],
});
setUsage({
modelUsed: model,
inputTokens: response.usage?.prompt_tokens ?? 0,
responseTokens: response.usage?.completion_tokens ?? 0,
});
return response.choices[0].message.content;
}
);
Changes Required:
- Wrap the call in withUsage()
- Get the customer ID from your auth system
- Add idempotencyKey: crypto.randomUUID()
- Use entitlements to select the model
- Report usage with setUsage()
That's it! 🎉
Support
- Documentation: https://usagetap.com/docs
- API Reference: https://usagetap.com/api
- GitHub: https://github.com/usagetap/sdk
- Email: support@usagetap.com