UsageTap Integration Guide for AI Coding Tools

Quick Start for LLM Agents: This guide provides everything needed to integrate UsageTap into applications in a single pass. Copy the relevant code snippet for your stack, replace the placeholders, and you're done.

Overview

UsageTap tracks LLM API usage and enforces quotas automatically. The flow is simple:

  1. Begin Call: Request entitlements for a customer/feature
  2. Execute LLM Call: Use the allowed capabilities from step 1
  3. End Call: Report actual usage back to UsageTap

The SDK handles retries, idempotency, and automatic usage tracking for you.
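The three-step flow above can be sketched with the low-level `beginCall`/`endCall` methods (shapes follow the API Reference later in this guide). This is a minimal sketch, not the recommended integration: the clients are passed in as parameters purely to keep the function self-contained, and the higher-level wrappers shown below handle most of this for you.

```typescript
import crypto from "crypto";

// Sketch of the begin → execute → end lifecycle. The `usageTap` and
// `openai` arguments are initialized clients (see Installation below);
// they are parameters here only so the flow is easy to follow and test.
export async function trackedCompletion(
  usageTap: any,
  openai: any,
  customerId: string,
  prompt: string
): Promise<string | null> {
  // 1. Begin: request entitlements for this customer/feature
  const begin = await usageTap.beginCall({
    customerId,
    feature: "chat.completions",
    idempotencyKey: crypto.randomUUID(),
    requested: { standard: true, premium: true },
  });

  // 2. Execute: pick a model from the allowed capabilities
  const model = begin.data.allowed.premium ? "gpt-4o" : "gpt-4o-mini";
  const response = await openai.chat.completions.create({
    model,
    messages: [{ role: "user", content: prompt }],
  });

  // 3. End: report actual usage back to UsageTap
  await usageTap.endCall({
    callId: begin.data.callId,
    modelUsed: model,
    inputTokens: response.usage?.prompt_tokens ?? 0,
    responseTokens: response.usage?.completion_tokens ?? 0,
  });

  return response.choices[0].message.content;
}
```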


Installation

npm install @usagetap/sdk openai

Environment Variables Required:

USAGETAP_API_KEY=your_api_key_here
USAGETAP_BASE_URL=https://api.usagetap.com/
OPENAI_API_KEY=sk-...

One-Shot Integration Examples

Next.js App Router

File: app/api/chat/route.ts

import { NextRequest } from "next/server";
import OpenAI from "openai";
import { UsageTapClient } from "@usagetap/sdk";
import { wrapOpenAI, toNextResponse } from "@usagetap/sdk/openai";

// Initialize clients
const usageTap = new UsageTapClient({
  apiKey: process.env.USAGETAP_API_KEY!,
  baseUrl: process.env.USAGETAP_BASE_URL!,
});

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY!,
});

// Wrap OpenAI client with UsageTap tracking
const ai = wrapOpenAI(openai, usageTap, {
  defaultContext: {
    // Set defaults that apply to all calls
    feature: "chat.completions",
    requested: {
      standard: true,
      premium: true,
      search: true,
      reasoningLevel: "HIGH",
    },
  },
});

export async function POST(req: NextRequest) {
  try {
    // Get user ID from your auth system
    const userId = req.headers.get("x-user-id") || "anonymous";
    
    const { messages } = await req.json();

    // Make streaming LLM call with automatic usage tracking
    const stream = await ai.chat.completions.create(
      {
        messages,
        stream: true,
        // model is optional - UsageTap selects based on entitlements
      },
      {
        usageTap: {
          customerId: userId,
          // Generate unique idempotency key for this request
          idempotencyKey: crypto.randomUUID(),
        },
      }
    );

    // Return streaming response
    return toNextResponse(stream, { mode: "text" });
  } catch (error) {
    console.error("Chat error:", error);
    return new Response(
      JSON.stringify({ error: "Failed to process chat request" }),
      { status: 500, headers: { "Content-Type": "application/json" } }
    );
  }
}

Key Features:

  • ✅ Automatic model selection based on the customer's plan
  • ✅ Usage tracking with zero boilerplate
  • ✅ Idempotent requests with idempotencyKey
  • ✅ Streaming responses

Express.js Server

File: server.ts

import express from "express";
import OpenAI from "openai";
import { UsageTapClient } from "@usagetap/sdk";
import { withUsage } from "@usagetap/sdk/express";

const app = express();
app.use(express.json());

// Initialize UsageTap client
const usageTap = new UsageTapClient({
  apiKey: process.env.USAGETAP_API_KEY!,
  baseUrl: process.env.USAGETAP_BASE_URL!,
});

// Attach UsageTap context to all requests
app.use(
  withUsage(usageTap, (req) => {
    // Extract customer ID from your auth middleware
    return req.user?.id || "anonymous";
  })
);

// Chat endpoint with automatic tracking
app.post("/api/chat", async (req, res) => {
  try {
    const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY! });
    
    // Get UsageTap-wrapped OpenAI client from request
    const ai = req.usageTap!.openai(openai, {
      feature: "chat.assistant",
      requested: {
        standard: true,
        premium: true,
        search: true,
        reasoningLevel: "HIGH",
      },
    });

    const stream = await ai.chat.completions.create(
      {
        messages: req.body.messages,
        stream: true,
      },
      {
        usageTap: {
          // Auto-generate idempotency key
          idempotencyKey: crypto.randomUUID(),
        },
      }
    );

    // Pipe stream to response and finalize usage automatically
    req.usageTap!.pipeToResponse(stream, res);
  } catch (error) {
    console.error("Chat error:", error);
    res.status(500).json({ error: "Failed to process chat request" });
  }
});

app.listen(3000, () => {
  console.log("Server running on http://localhost:3000");
});

Key Features:

  • ✅ Middleware attaches UsageTap to every request
  • ✅ Extract customer ID once from auth system
  • ✅ Automatic usage tracking and streaming

Node.js Script

File: generate-summary.ts

import OpenAI from "openai";
import { UsageTapClient } from "@usagetap/sdk";
import crypto from "crypto";

// Initialize clients
const usageTap = new UsageTapClient({
  apiKey: process.env.USAGETAP_API_KEY!,
  baseUrl: process.env.USAGETAP_BASE_URL!,
});

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY!,
});

async function generateSummary(customerId: string, text: string) {
  // High-level helper that handles begin → call → end automatically
  const completion = await usageTap.withUsage(
    {
      customerId,
      feature: "summarization",
      requested: {
        standard: true,
        premium: true,
        search: false,
        reasoningLevel: "LOW",
      },
      // Generate idempotency key for safe retries
      idempotencyKey: crypto.randomUUID(),
    },
    async ({ begin, setUsage }) => {
      // Select model based on what customer is allowed
      const model = begin.data.allowed.premium ? "gpt-4o" : "gpt-4o-mini";

      // Make LLM call
      const response = await openai.chat.completions.create({
        model,
        messages: [
          {
            role: "system",
            content: "Summarize the following text concisely.",
          },
          {
            role: "user",
            content: text,
          },
        ],
      });

      // Report usage back to UsageTap
      setUsage({
        modelUsed: model,
        inputTokens: response.usage?.prompt_tokens ?? 0,
        responseTokens: response.usage?.completion_tokens ?? 0,
      });

      return response.choices[0].message.content;
    }
  );

  return completion;
}

// Example usage
generateSummary("customer_123", "Long document text here...")
  .then((summary) => console.log("Summary:", summary))
  .catch((error) => console.error("Error:", error));

Key Features:

  • ✅ withUsage handles the entire lifecycle
  • ✅ Automatic error handling and usage reporting
  • ✅ Works in any Node.js environment

React Chat UI

File: components/Chat.tsx

import { useChatWithUsage } from "@usagetap/sdk/react";

interface ChatProps {
  userId: string;
}

export function Chat({ userId }: ChatProps) {
  const { messages, input, setInput, handleSubmit, isLoading, error } =
    useChatWithUsage({
      api: "/api/chat", // Your Next.js API route
      customerId: userId,
      feature: "chat.assistant",
    });

  return (
    <div className="chat-container">
      {/* Messages display */}
      <div className="messages">
        {messages.map((m) => (
          <div key={m.id} className={`message ${m.role}`}>
            <strong>{m.role}:</strong>
            <p>{m.content}</p>
          </div>
        ))}
      </div>

      {/* Error display */}
      {error && (
        <div className="error">
          Error: {error.message}
        </div>
      )}

      {/* Input form */}
      <form onSubmit={handleSubmit} className="input-form">
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Type a message..."
          disabled={isLoading}
        />
        <button type="submit" disabled={isLoading}>
          {isLoading ? "Sending..." : "Send"}
        </button>
      </form>
    </div>
  );
}

Key Features:

  • ✅ React hook manages entire chat state
  • ✅ Automatic usage tracking through API endpoint
  • ✅ Loading and error states included

Idempotency Best Practices

Idempotency ensures safe retries without duplicate charges. UsageTap supports four approaches:

1. Explicit Idempotency Keys (Recommended)

Generate a unique key per logical operation:

import crypto from "crypto";

await usageTap.beginCall({
  customerId: "cust_123",
  feature: "chat.completions",
  // Generate unique key for this request
  idempotencyKey: crypto.randomUUID(),
  requested: {
    standard: true,
    premium: true,
  },
});

When to use:

  • API endpoints that may be retried by clients
  • Background jobs that might restart
  • Critical operations requiring duplicate prevention

2. Deterministic Keys

Use request-specific data to generate consistent keys:

import crypto from "crypto";

function generateIdempotencyKey(
  userId: string,
  sessionId: string,
  messageId: string
): string {
  const data = `${userId}:${sessionId}:${messageId}`;
  return crypto.createHash("sha256").update(data).digest("hex");
}

await usageTap.beginCall({
  customerId: userId,
  feature: "chat.send",
  idempotencyKey: generateIdempotencyKey(userId, sessionId, messageId),
  requested: { standard: true, premium: true },
});

When to use:

  • Multi-step workflows where steps might retry
  • Distributed systems with at-least-once delivery
  • Systems where request IDs already exist

3. Auto-Generated Keys (Default)

Omit idempotencyKey and let UsageTap derive one:

await usageTap.beginCall({
  customerId: "cust_123",
  feature: "chat.completions",
  // No idempotencyKey - UsageTap generates deterministically
  requested: { standard: true, premium: true },
});

How it works:

  • UsageTap hashes: orgId + customerId + feature + requested entitlements
  • Same inputs = same callId returned
  • Great for bulk operations with natural deduplication

When to use:

  • Simple scripts without explicit request IDs
  • Internal tools where deduplication by inputs is acceptable
  • Testing and development
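A sketch of the deduplication behavior described above, assuming the key derivation is stable across calls (feature name is illustrative):

```typescript
// Two beginCall requests with identical inputs and no idempotencyKey:
// UsageTap derives the same key from customerId + feature + requested
// entitlements, so both return the same callId.
const first = await usageTap.beginCall({
  customerId: "cust_123",
  feature: "nightly.report",
  requested: { standard: true },
});

const second = await usageTap.beginCall({
  customerId: "cust_123",
  feature: "nightly.report",
  requested: { standard: true },
});

// first.data.callId === second.data.callId per the derivation rule above
```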

4. Header-Based Idempotency

For raw HTTP requests, use the Idempotency-Key header:

const response = await fetch(`${baseUrl}/call_begin`, {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json",
    "Idempotency-Key": crypto.randomUUID(),
    "Accept": "application/vnd.usagetap.v1+json",
  },
  body: JSON.stringify({
    customerId: "cust_123",
    feature: "chat.completions",
    requested: { standard: true, premium: true },
    holdUsd: 0.05,
  }),
});

Best Practice Summary

Scenario          Recommended Approach               Example
API endpoints     Explicit crypto.randomUUID()       idempotencyKey: crypto.randomUUID()
Background jobs   Deterministic from job ID          idempotencyKey: `job_${jobId}`
Webhooks          Use the webhook event ID           idempotencyKey: event.id
Bulk operations   Auto-generated (omit key)          No idempotencyKey field
Testing           Fixed string for reproducibility   idempotencyKey: "test-scenario-1"

API Reference

Core Methods

beginCall(request, options?)

Start a usage tracking session and get entitlements.

Request:

{
  customerId: string;           // Required: Your customer's ID
  feature?: string;             // Optional: Feature being accessed
  requested?: {                 // Optional: Requested capabilities
    standard?: boolean;         // Access to standard models
    premium?: boolean;          // Access to premium models
    search?: boolean;           // Web search capability
    reasoningLevel?: "NONE" | "LOW" | "MEDIUM" | "HIGH";
  };
  idempotencyKey?: string;      // Optional: Unique key for safe retries
  tags?: string[];              // Optional: Tags for analytics
}

Response:

{
  result: { status: "ACCEPTED", code: string, timestamp: string },
  data: {
    callId: string;             // Use this in endCall()
    allowed: {                  // What customer can actually use
      standard: boolean;
      premium: boolean;
      search: boolean;
      reasoningLevel: "NONE" | "LOW" | "MEDIUM" | "HIGH";
    };
    entitlementHints: {
      suggestedModelTier: "premium" | "standard" | "none";
      policy: "NONE" | "BLOCK" | "DOWNGRADE";
    };
    subscription: {
      planName: string;
      limitType: string;
      // ... more subscription details
    };
    meters: {                   // Current usage levels
      [meterName: string]: {
        remaining: number | null;
        limit: number | null;
        used: number;
        unlimited: boolean;
        ratio: number | null;
      };
    };
  },
  correlationId: string;
}

endCall(request, options?)

Report actual usage for a call.

Request:

{
  callId: string;               // Required: From beginCall response
  modelUsed?: string;           // Model identifier (e.g., "gpt-4o")
  inputTokens?: number;         // Prompt tokens
  responseTokens?: number;      // Completion tokens
  reasoningTokens?: number;     // Reasoning tokens (o1 models)
  searches?: number;            // Number of web searches
  audioSeconds?: number;        // Audio processing time
  error?: {                     // If call failed
    code: string;
    message: string;
  };
}

Response:

{
  result: { status: "ACCEPTED", code: string, timestamp: string },
  data: {
    callId: string;
    costUSD: number;            // Calculated cost
    metered: {                  // Usage that was counted
      tokens: number;
      calls: number;
      searches: number;
    };
    balances: {                 // Remaining quotas
      tokensRemaining: number;
      searchesRemaining: number;
    };
  },
  correlationId: string;
}
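If the LLM call itself fails, you can still close out the call by reporting the failure through the `error` field shown in the request shape above. A sketch, assuming `begin` is the result of an earlier `beginCall` (the error code string here is illustrative, not a fixed enum):

```typescript
try {
  // ... LLM call configured from begin.data.allowed ...
} catch (err) {
  // Close the call and attribute the failure instead of reporting tokens
  await usageTap.endCall({
    callId: begin.data.callId,
    error: {
      code: "UPSTREAM_ERROR", // illustrative code
      message: err instanceof Error ? err.message : String(err),
    },
  });
  throw err;
}
```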

withUsage(request, handler, options?)

High-level helper that handles entire lifecycle:

const result = await usageTap.withUsage(
  {
    customerId: "cust_123",
    feature: "chat.send",
    idempotencyKey: crypto.randomUUID(),
    requested: { standard: true, premium: true },
  },
  async ({ begin, setUsage, setError }) => {
    // Your LLM call here
    const response = await openai.chat.completions.create({
      model: begin.data.allowed.premium ? "gpt-4o" : "gpt-4o-mini",
      messages: [{ role: "user", content: "Hello" }],
    });

    // Report usage
    setUsage({
      modelUsed: response.model,
      inputTokens: response.usage?.prompt_tokens ?? 0,
      responseTokens: response.usage?.completion_tokens ?? 0,
    });

    return response.choices[0].message.content;
  }
);

Automatic behavior:

  • Calls beginCall before handler
  • Calls endCall after handler (even if it throws)
  • Captures errors and reports them
  • Returns handler result

createCustomer(request, options?)

Ensure a customer subscription exists before making calls.

Request:

{
  customerId: string;           // Required: Your customer's ID
  customerFriendlyName?: string; // Optional but strongly recommended: display name (aka customerName)
  customerEmail?: string;        // Optional but strongly recommended: used for billing notifications
  stripeCustomerId?: string;    // Link to Stripe customer
}

Response:

{
  result: { status: "ACCEPTED", ... },
  data: {
    customerId: string;
    newCustomer: boolean;       // true if just created
    allowed: { ... };           // Current entitlements
    subscription: { ... };      // Subscription details
    plan: { ... };              // Active plan info
  },
  correlationId: string;
}

Idempotent: Safe to call multiple times. Returns newCustomer: false if customer already exists.
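A typical call, for example during user signup (field values are illustrative):

```typescript
const created = await usageTap.createCustomer({
  customerId: "cust_123",
  customerFriendlyName: "Acme Corp",     // strongly recommended
  customerEmail: "billing@acme.example", // strongly recommended
});

// Safe to call on every login/signup; branch on newCustomer if needed
if (created.data.newCustomer) {
  console.log("Provisioned new subscription for", created.data.customerId);
}
```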

changePlan(request, options?)

Switch a customer to a different usage plan.

Request:

{
  customerId: string;           // Required
  planId: string;               // Required: Target plan ID
  strategy?: "IMMEDIATE_RESET" | "IMMEDIATE_PRORATED" | "AT_NEXT_REPLENISH";
}

Strategy options:

  • IMMEDIATE_RESET: Switch now, reset usage to zero
  • IMMEDIATE_PRORATED: Switch now, prorate existing usage
  • AT_NEXT_REPLENISH: Schedule change for next billing cycle (default)
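For example, an immediate upgrade that carries prorated usage forward (the plan ID is illustrative):

```typescript
await usageTap.changePlan({
  customerId: "cust_123",
  planId: "plan_pro_monthly",     // illustrative plan ID
  strategy: "IMMEDIATE_PRORATED", // switch now, prorate existing usage
});
```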

checkUsage(request, options?)

Query current usage status without creating a call.

Request:

{
  customerId: string;           // Required
}

Response: Same structure as beginCall but without creating a call record. Use for dashboard widgets or pre-flight checks.
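For example, a pre-flight check that gates an expensive job on remaining quota (the meter name is illustrative; use the meter names defined by your plan):

```typescript
const status = await usageTap.checkUsage({ customerId: "cust_123" });

// Read a meter from the response without creating a call record
const tokens = status.data.meters["tokens"]; // illustrative meter name
if (tokens && !tokens.unlimited && (tokens.remaining ?? 0) <= 0) {
  throw new Error("Token quota exhausted for this billing period");
}
```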


Error Handling

All SDK methods throw UsageTapError on failure:

import { UsageTapError, isUsageTapError } from "@usagetap/sdk";

try {
  await usageTap.beginCall({
    customerId: "cust_123",
    idempotencyKey: crypto.randomUUID(),
  });
} catch (error) {
  if (isUsageTapError(error)) {
    console.error("UsageTap error:", {
      code: error.code,          // Error code (e.g., "USAGETAP_RATE_LIMITED")
      message: error.message,    // Human-readable message
      retryable: error.retryable, // Whether retry might succeed
      correlationId: error.correlationId, // For support
      details: error.details,    // Additional context
    });

    if (error.retryable) {
      // Retry with exponential backoff
    } else {
      // Permanent error - handle gracefully
    }
  } else {
    // Non-UsageTap error
    console.error("Unexpected error:", error);
  }
}

Common Error Codes:

  • USAGETAP_AUTH_ERROR: Invalid API key
  • USAGETAP_RATE_LIMITED: Too many requests (retryable)
  • USAGETAP_BAD_REQUEST: Invalid request parameters
  • USAGETAP_SERVER_ERROR: Server issue (retryable)
  • USAGETAP_NETWORK_ERROR: Network failure (retryable)

Automatic Retries: The SDK retries transient errors automatically with exponential backoff. Configure retry behavior:

const usageTap = new UsageTapClient({
  apiKey: process.env.USAGETAP_API_KEY!,
  baseUrl: process.env.USAGETAP_BASE_URL!,
  retries: {
    maxAttempts: 3,       // Default: 3
    baseDelayMs: 250,     // Default: 250ms
    maxDelayMs: 5000,     // Default: 5000ms
    jitterRatio: 0.2,     // Default: 0.2 (20% jitter)
  },
});
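The SDK's exact backoff formula is internal, but the settings above map onto a conventional exponential-backoff-with-jitter schedule. A plausible sketch of how such parameters typically combine:

```typescript
// Sketch of an exponential backoff delay with jitter, using the same
// parameter names as the retry config above. This is a conventional
// interpretation of those settings, not the SDK's internal formula.
function backoffDelayMs(
  attempt: number, // 1-based attempt number
  baseDelayMs = 250,
  maxDelayMs = 5000,
  jitterRatio = 0.2,
  random: () => number = Math.random
): number {
  // Double the base delay on each attempt, capped at maxDelayMs
  const capped = Math.min(maxDelayMs, baseDelayMs * 2 ** (attempt - 1));
  // Spread the delay by ±jitterRatio to avoid retry thundering herds
  const jitter = 1 + jitterRatio * (2 * random() - 1);
  return Math.round(capped * jitter);
}
```

With the defaults, attempts wait roughly 250ms, 500ms, 1000ms, and so on (±20%), never exceeding 5000ms.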

Quick Reference: Entitlement Mapping

How to use begin.data.allowed to configure LLM calls:

const begin = await usageTap.beginCall({
  customerId: "cust_123",
  idempotencyKey: crypto.randomUUID(),
  requested: {
    standard: true,
    premium: true,
    search: true,
    reasoningLevel: "HIGH",
  },
});

// Select model tier
const model = begin.data.allowed.premium ? "gpt-4o" : "gpt-4o-mini";

// Configure reasoning effort (for o1 models)
const reasoningEffort = (() => {
  switch (begin.data.allowed.reasoningLevel) {
    case "HIGH": return "high";
    case "MEDIUM": return "medium";
    case "LOW": return "low";
    default: return undefined;
  }
})();

// Enable web search if allowed
const tools = begin.data.allowed.search
  ? [{ type: "web_search" }]
  : undefined;

// Make the call
const response = await openai.chat.completions.create({
  model,
  messages: [...],
  reasoning: reasoningEffort ? { effort: reasoningEffort } : undefined,
  tools,
});

Migration from Direct OpenAI Usage

Before (Direct OpenAI):

const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello" }],
});

After (With UsageTap):

const completion = await usageTap.withUsage(
  {
    customerId: userId,
    feature: "chat.send",
    idempotencyKey: crypto.randomUUID(),
    requested: { standard: true, premium: true },
  },
  async ({ begin, setUsage }) => {
    const model = begin.data.allowed.premium ? "gpt-4o" : "gpt-4o-mini";
    
    const response = await openai.chat.completions.create({
      model,
      messages: [{ role: "user", content: "Hello" }],
    });

    setUsage({
      modelUsed: model,
      inputTokens: response.usage?.prompt_tokens ?? 0,
      responseTokens: response.usage?.completion_tokens ?? 0,
    });

    return response.choices[0].message.content;
  }
);

Changes Required:

  1. Wrap call in withUsage()
  2. Get customer ID from auth system
  3. Add idempotencyKey: crypto.randomUUID()
  4. Use entitlements to select model
  5. Report usage with setUsage()

That's it! 🎉


Support