
Challenge 2.2: Usage Tracking

What does a single LLM call cost — and how do you find out?

(Diagram: generateText returns a result with a usage object — promptTokens, completionTokens, totalTokens — which feeds into the cost calculation.)

Every generateText call returns a usage object that tells you exactly how many tokens were consumed. Using the provider’s prices, you can calculate the cost per call.

Without Usage Tracking: You have no idea what your AI application costs. At the end of the month the provider bill arrives and you can't trace which endpoint consumed how much. Costs can explode without warning.

With Usage Tracking: You know after every call what it cost. You can set budgets, trigger alerts on overruns, and optimize strategically — e.g. identify the most expensive endpoint and switch it to a cheaper model.

Every generateText and streamText call automatically returns a usage object with three fields:

import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

const result = await generateText({
  model: anthropic('claude-sonnet-4-5-20250514'),
  system: 'You are a helpful assistant.',
  prompt: 'What is TypeScript?',
});

console.log(result.usage);
// → {
//   promptTokens: 22,      ← What you sent (system + prompt)
//   completionTokens: 85,  ← What the LLM generated
//   totalTokens: 107       ← Sum
// }

promptTokens includes everything sent to the LLM: system, prompt, messages, tool definitions. completionTokens is the LLM’s response. The distinction matters because input and output are priced differently.

The formula is simple — tokens divided by 1 million, times price per 1M tokens:

interface ModelPricing {
  inputPerMillion: number;  // Price per 1M input tokens in USD
  outputPerMillion: number; // Price per 1M output tokens in USD
}

// Prices of common models (as of March 2026)
const PRICING: Record<string, ModelPricing> = {
  'claude-sonnet-4-5-20250514': { inputPerMillion: 3.0, outputPerMillion: 15.0 },
  'gpt-4o': { inputPerMillion: 2.5, outputPerMillion: 10.0 },
  'gemini-2.5-flash': { inputPerMillion: 0.15, outputPerMillion: 0.60 },
};

function calculateCost( // ← Reusable function
  usage: { promptTokens: number; completionTokens: number },
  modelId: string,
): { inputCost: number; outputCost: number; totalCost: number } {
  const pricing = PRICING[modelId];
  if (!pricing) throw new Error(`Unknown model: ${modelId}`);
  const inputCost = (usage.promptTokens / 1_000_000) * pricing.inputPerMillion;
  const outputCost = (usage.completionTokens / 1_000_000) * pricing.outputPerMillion;
  return {
    inputCost,
    outputCost,
    totalCost: inputCost + outputCost, // ← Total cost per call
  };
}

Example: 22 prompt tokens + 85 completion tokens with Claude Sonnet:

  • Input: 22 / 1,000,000 * $3.00 = $0.000066
  • Output: 85 / 1,000,000 * $15.00 = $0.001275
  • Total: $0.001341 per call
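As a sanity check, the same arithmetic in plain TypeScript (no API call needed):

```typescript
// Reproducing the worked example above with Claude Sonnet's prices
// ($3.00 / 1M input tokens, $15.00 / 1M output tokens).
const promptTokens = 22;
const completionTokens = 85;

const inputCost = (promptTokens / 1_000_000) * 3.0;
const outputCost = (completionTokens / 1_000_000) * 15.0;

console.log(inputCost.toFixed(6));                // → 0.000066
console.log(outputCost.toFixed(6));               // → 0.001275
console.log((inputCost + outputCost).toFixed(6)); // → 0.001341
```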

Starting with AI SDK v6, some providers return extended token details. You’ll find these in result.usage or in providerMetadata:

const result = await generateText({
  model: anthropic('claude-sonnet-4-5-20250514'),
  system: 'You are a helpful assistant for TypeScript questions.',
  prompt: 'Explain generics in TypeScript.',
});

// Standard usage (always available)
console.log('Prompt Tokens:', result.usage.promptTokens);
console.log('Completion Tokens:', result.usage.completionTokens);

// Extended details (provider-dependent)
// For Anthropic:
// - cacheReadTokens: Tokens read from cache (cheaper!)
// - cacheCreationTokens: Tokens written to cache
// For OpenAI reasoning models:
// - reasoningTokens: Tokens used for internal reasoning

These extended details are important for cost optimization. If cacheReadTokens > 0, you pay less for those tokens — you’ll learn about that in Challenge 2.4 (Prompt Caching).
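As a rough sketch of why cacheReadTokens matter: the example below assumes cache reads are billed at about 10% of the normal input rate and that cached tokens are counted inside promptTokens. Both the 0.1 multiplier and the counting behavior are assumptions to verify against your provider's pricing page, not guarantees of the SDK:

```typescript
// Hypothetical cache-aware variant of the input-cost calculation.
// ASSUMPTION: cache reads billed at 10% of the input rate, and
// cacheReadTokens are a subset of promptTokens (provider-dependent!).
const INPUT_PER_MILLION = 3.0;     // Claude Sonnet input price (USD / 1M tokens)
const CACHE_READ_MULTIPLIER = 0.1; // assumed discount factor for cached tokens

function inputCostWithCache(promptTokens: number, cacheReadTokens: number): number {
  const freshTokens = promptTokens - cacheReadTokens; // billed at the full rate
  const freshCost = (freshTokens / 1_000_000) * INPUT_PER_MILLION;
  const cachedCost = (cacheReadTokens / 1_000_000) * INPUT_PER_MILLION * CACHE_READ_MULTIPLIER;
  return freshCost + cachedCost;
}

// 10,000 prompt tokens, 8,000 of them served from cache:
console.log(inputCostWithCache(10_000, 8_000).toFixed(6)); // → 0.008400
// Same prompt without a cache hit:
console.log(inputCostWithCache(10_000, 0).toFixed(6));     // → 0.030000
```

With a high cache-hit rate the input cost drops dramatically, which is exactly the lever Challenge 2.4 explores.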

Layer 4: Session tracking across multiple calls


For an application with multiple calls, you need a running total. With generateText, the result is available directly after the await — you simply access result.usage:

import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

const sessionTokens = { prompt: 0, completion: 0, total: 0 };
let sessionCost = 0;

async function trackedGenerate(prompt: string) {
  const modelId = 'claude-sonnet-4-5-20250514';
  const result = await generateText({
    model: anthropic(modelId),
    prompt,
  });

  // After the await: read usage directly from result.usage
  const { usage } = result;
  sessionTokens.prompt += usage.promptTokens;
  sessionTokens.completion += usage.completionTokens;
  sessionTokens.total += usage.totalTokens;

  const cost = calculateCost(usage, modelId);
  sessionCost += cost.totalCost;
  console.log(`[Track] Call: ${cost.totalCost.toFixed(6)} USD | Session: ${sessionCost.toFixed(6)} USD`);
  return result;
}

// Multiple calls — session costs accumulate
await trackedGenerate('What is TypeScript?');
await trackedGenerate('Explain async/await.');
await trackedGenerate('What are generics?');

console.log(`\nSession total: ${sessionTokens.total} tokens, ${sessionCost.toFixed(6)} USD`);

Note: streamText has an onFinish callback that fires when the stream ends. With generateText you don’t need a callback — the result is immediately available after the await.

Task: Build a Cost Calculator — make a generateText call, read usage, and calculate the costs based on the model pricing table.

Create a file challenge-2-2.ts:

import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

// TODO 1: Define a PRICING table with at least 2 models
//         Tip: { inputPerMillion: number, outputPerMillion: number }

// TODO 2: Implement calculateCost(usage, modelId)
//         Formula: tokens / 1_000_000 * pricePerMillion

// TODO 3: Make a generateText call
// const result = await generateText({
//   model: anthropic('claude-sonnet-4-5-20250514'),
//   prompt: 'Explain in 3 sentences what machine learning is.',
// });

// TODO 4: Calculate and log the costs
// console.log('Usage:', result.usage);
// const cost = calculateCost(result.usage, 'claude-sonnet-4-5-20250514');
// console.log('Input cost:', cost.inputCost.toFixed(6), 'USD');
// console.log('Output cost:', cost.outputCost.toFixed(6), 'USD');
// console.log('Total:', cost.totalCost.toFixed(6), 'USD');

Checklist:

  • PRICING table with at least 2 models
  • calculateCost computes input and output costs separately
  • result.usage is correctly passed to calculateCost
  • Costs are logged with 6 decimal places (USD range for individual calls)
Solution:
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

interface ModelPricing {
  inputPerMillion: number;
  outputPerMillion: number;
}

const PRICING: Record<string, ModelPricing> = {
  'claude-sonnet-4-5-20250514': { inputPerMillion: 3.0, outputPerMillion: 15.0 },
  'gpt-4o': { inputPerMillion: 2.5, outputPerMillion: 10.0 },
};

function calculateCost(
  usage: { promptTokens: number; completionTokens: number },
  modelId: string,
): { inputCost: number; outputCost: number; totalCost: number } {
  const pricing = PRICING[modelId];
  if (!pricing) throw new Error(`Unknown model: ${modelId}`);
  const inputCost = (usage.promptTokens / 1_000_000) * pricing.inputPerMillion;
  const outputCost = (usage.completionTokens / 1_000_000) * pricing.outputPerMillion;
  return { inputCost, outputCost, totalCost: inputCost + outputCost };
}

const result = await generateText({
  model: anthropic('claude-sonnet-4-5-20250514'),
  prompt: 'Explain in 3 sentences what machine learning is.',
});

console.log('Usage:', result.usage);
const cost = calculateCost(result.usage, 'claude-sonnet-4-5-20250514');
console.log('Input cost:', cost.inputCost.toFixed(6), 'USD');
console.log('Output cost:', cost.outputCost.toFixed(6), 'USD');
console.log('Total:', cost.totalCost.toFixed(6), 'USD');

Run with:

npx tsx challenge-2-2.ts

Expected output (approximate):

Usage: { promptTokens: 20, completionTokens: 78, totalTokens: 98 }
Input cost: 0.000060 USD
Output cost: 0.001170 USD
Total: 0.001230 USD

Explanation: The function separates input and output costs because they are priced differently. With Claude Sonnet, output costs 5x more than input — so it pays to control response length (e.g. “Answer in at most 3 sentences”).
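The 5x asymmetry is easy to see in numbers. Comparing a 78-token answer with a 300-token answer to the same 20-token prompt:

```typescript
// Output is 5x pricier than input for Claude Sonnet ($15 vs $3 per 1M tokens),
// so trimming the response saves far more than trimming the prompt.
const cost = (promptTokens: number, completionTokens: number) =>
  (promptTokens / 1_000_000) * 3.0 + (completionTokens / 1_000_000) * 15.0;

console.log(cost(20, 300).toFixed(6)); // unconstrained answer → 0.004560
console.log(cost(20, 78).toFixed(6));  // "at most 3 sentences" → 0.001230
```

The shorter answer costs well under half as much, and the entire saving comes from the output side.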

(Diagram: the task flows through selectModel to a model; generateText returns result.usage; calculateCost produces the cost log.)

Exercise: Extend the selectModel function from Level 1.2 with automatic cost tracking. Every call should log its costs.

  1. Copy the selectModel function from your challenge-1-2.ts into the current file (or re-implement it)
  2. Wrap the generateText call in a trackedGenerate function
  3. After each await generateText(...), read costs from result.usage and log them
  4. Test with different tasks — compare costs between Flash and Pro models

Optional Stretch Goal: Build session tracking that sums cumulative costs across multiple calls and outputs a summary at the end (“3 calls, 450 tokens, $0.003420 USD”).

Part of AI Learning — free courses from prompt to production. Jan on LinkedIn