
Level 2: LLM Fundamentals — Briefing

Tokens, Context Windows, and Caching — understand what happens inside the LLM and why your costs are what they are. In this level you’ll learn how LLMs break text into tokens, how to track token usage and calculate costs, why Context Windows are limited, and how Prompt Caching can reduce your costs by up to 90%.

What you'll learn in Level 2:
  • What tokens are and how tokenization works — Subword Units, Token IDs, why German requires more tokens than English
  • How to track token usage and calculate costs — result.usage, price per 1M tokens, cost formulas
  • Why Context Windows are limited — what counts toward it, what happens when exceeded, strategies for a full window
  • How Prompt Caching can drastically reduce costs — Prefix Matching, Cache Tokens, provider support
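The cost formula behind these building blocks is simple arithmetic: tokens divided by one million, times the price per 1M tokens, summed for input and output. A minimal sketch — the `promptTokens`/`completionTokens` field names mirror the usage shape the AI SDK returns, and the prices are purely illustrative placeholders, not any provider's real price sheet:

```typescript
// Hypothetical per-1M-token prices in USD — check your provider's pricing page.
const PRICE_PER_1M = { input: 3.0, output: 15.0 };

type Usage = { promptTokens: number; completionTokens: number };

// Cost of a single call: input and output tokens are billed at different rates.
function callCost(usage: Usage): number {
  const inputCost = (usage.promptTokens / 1_000_000) * PRICE_PER_1M.input;
  const outputCost = (usage.completionTokens / 1_000_000) * PRICE_PER_1M.output;
  return inputCost + outputCost;
}

// 2,000 input + 500 output tokens:
// 2000/1M × $3 + 500/1M × $15 = $0.006 + $0.0075 = $0.0135
```

Note that output tokens are typically several times more expensive than input tokens, which is why long generated answers dominate the bill even when the prompt is short.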

In Level 1 you learned how to generate text with the AI SDK. But you only saw result.usage as a number — without understanding what a token is, why the numbers are what they are, and how to control them.

The concrete problem: Without understanding tokens you can't predict costs. Without Context Window knowledge, your conversation breaks the moment the window overflows. Without caching you pay full price for the same System Prompt on every call. This level gives you the technical foundations you need for cost-efficient AI applications.
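To see where the "up to 90%" savings from caching comes from, consider how cache reads change the input side of the bill. This sketch assumes cached input tokens are billed at 10% of the normal input price (the cache-read rate Anthropic documents — verify the factor for your provider, and note that writing to the cache usually carries a small surcharge):

```typescript
// Input cost in USD for one call, splitting tokens into fresh vs. cache-read.
// cacheReadFactor = 0.1 is an assumption (Anthropic-style cache-read pricing).
function inputCostWithCache(
  totalInputTokens: number,
  cachedTokens: number,
  pricePer1M: number,
  cacheReadFactor = 0.1,
): number {
  const fresh = totalInputTokens - cachedTokens;
  return (fresh * pricePer1M + cachedTokens * pricePer1M * cacheReadFactor) / 1_000_000;
}

// A 10,000-token System Prompt at $3/1M input tokens:
// uncached: $0.03 per call — fully cached: $0.003 per call, i.e. 90% less.
```

The saving only materializes when the cached prefix is byte-identical across calls (Prefix Matching), which is why a stable System Prompt should come first in the request.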

Prerequisites:
  • Level 1 completed — generateText, streamText, result.usage must be solid
  • Basic understanding of API costs — You know that LLM calls cost money
  • Project directory: Continue working in the same project directory as Level 1 — all required packages (ai, @ai-sdk/anthropic, tsx) are already installed

Skip hint: You already know what Subword Tokenization is, understand the difference between input and output tokens, and have worked with Prompt Caching? Jump straight to the Boss Fight and test your knowledge.

Build a Token Budget Calculator: A tool that counts tokens in the System Prompt, User Message, and expected output, checks whether everything fits into the Context Window, calculates the expected costs, and tracks the cache hit rate across multiple calls. All four building blocks from this level in one project.
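The pieces of the Boss Fight can be sketched roughly as follows. All names here are hypothetical, the "~4 characters per token" estimate is a crude heuristic (real counts come from the provider's tokenizer or from result.usage after a call), and the 200K window is just one common example:

```typescript
const CONTEXT_WINDOW = 200_000; // example window size — check your model's limit

// Crude token estimate: ~4 characters per token for English text.
// German text tends to need more tokens per word, so treat this as a floor.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

interface Budget {
  systemTokens: number;
  userTokens: number;
  expectedOutputTokens: number;
  fits: boolean; // does system + user + expected output fit the window?
}

function checkBudget(system: string, user: string, expectedOutputTokens: number): Budget {
  const systemTokens = estimateTokens(system);
  const userTokens = estimateTokens(user);
  const total = systemTokens + userTokens + expectedOutputTokens;
  return { systemTokens, userTokens, expectedOutputTokens, fits: total <= CONTEXT_WINDOW };
}

// Cache hit rate across calls: cached input tokens / total input tokens.
class CacheTracker {
  private cached = 0;
  private total = 0;
  record(inputTokens: number, cachedTokens: number): void {
    this.total += inputTokens;
    this.cached += cachedTokens;
  }
  hitRate(): number {
    return this.total === 0 ? 0 : this.cached / this.total;
  }
}
```

In the real project you would feed `record()` with the token counts the API reports per call instead of estimates — the estimator is only for the pre-flight check before you send the request.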

Part of AI Learning — free courses from prompt to production. Jan on LinkedIn