Synthesis: Technical Literacy
The Big Picture
You’ve worked through five lessons: how prompt engineering steers AI behavior, how RAG gives models access to proprietary data, when fine-tuning is justified, how to choose the right model, and how to run AI features economically.
Individually, these are technical tools. Together, they form an optimization hierarchy: Lesson 1 is the fastest, cheapest lever. Lesson 2 extends the model’s knowledge. Lesson 3 changes its behavior. Lesson 4 sets the quality ceiling and cost floor. Lesson 5 runs across all layers — caching, routing, batching, and monitoring apply regardless of which techniques you use.
The PM who reaches for fine-tuning before exhausting prompting wastes weeks and thousands of dollars. The PM who uses a frontier model for every request wastes money on quality users don’t need.
Connections
1. The Optimization Hierarchy
Prompt engineering (L1) is always the first step. If output quality isn’t right, improve the prompt first. If the model needs current data, add RAG (L2). If consistent behavior is needed and prompting can’t achieve it, consider fine-tuning (L3). Model choice (L4) determines how high quality can go and how low costs can fall. Cost optimization (L5) runs in parallel with everything.
For you as a PM: This path isn’t a suggestion — it’s cost protection. Each step is an order of magnitude more expensive and less flexible than the previous one.
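The hierarchy above can be sketched as a decision helper. This is a minimal illustration of the ordering, not a real tool — the parameter names and return strings are invented for this sketch:

```python
def next_step(quality_ok: bool, needs_current_data: bool,
              needs_consistent_behavior: bool, prompting_exhausted: bool) -> str:
    """Walk the optimization hierarchy top-down (L1 -> L2 -> L3).

    All parameter names are illustrative; the ordering is the point:
    each later step costs roughly an order of magnitude more and is
    harder to undo than the one before it.
    """
    if not quality_ok and not prompting_exhausted:
        return "L1: improve the prompt"          # hours, no lock-in
    if needs_current_data:
        return "L2: add RAG"                     # days, moderate lock-in
    if needs_consistent_behavior and prompting_exhausted:
        return "L3: consider fine-tuning"        # weeks, high lock-in
    return "L4/L5: revisit model choice and cost levers"
```

The point of encoding it this way: fine-tuning is only reachable after prompting has been genuinely exhausted, which is exactly the cost protection described above.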
2. The Build-vs-Configure Spectrum
Each lesson represents a different point on the spectrum between configuration and development. Prompt engineering (hours, changeable anytime, no lock-in) sits at one end. Fine-tuning (days to weeks, retraining to change, high lock-in) at the other. RAG and model selection fall in between.
For you as a PM: Prefer configuration over building. The more deeply baked a decision is, the harder it is to change. Start flexible, lock down only when measurements justify the investment.
3. Quality Is Not a Single Number
“Quality” means something different in every lesson: prompt quality measures format, tone, and accuracy. RAG quality measures retrieval precision and answer faithfulness. Fine-tuning quality measures behavioral consistency. Model selection quality measures task-specific performance. Cost/quality measures the minimum users value.
For you as a PM: Define “good enough” concretely and measurably before optimization work begins. “Make it better” is not a product requirement.
4. Connections to Earlier Chapters
Prompt engineering (L1) directly applies the token, context window, and temperature concepts from Chapter 01. RAG (L2) is the primary hallucination mitigation strategy from the Foundations. Model selection (L4) builds on the ML landscape understanding from Chapter 01. Cost/quality tradeoffs (L5) are central to AI product strategy from Chapter 02. And prompt quality determines what the UX design from Chapter 03 can show the user.
5. The Technical Decision Map
When facing a technical AI decision, use this routing:
| Problem | Relevant lesson | First action |
|---|---|---|
| AI output quality isn’t good enough | Prompt Engineering (L1) | Improve the prompt |
| Model doesn’t know about our data | RAG (L2) | Build a retrieval pipeline |
| Model’s tone/style isn’t right | Fine-Tuning (L3) | Try system prompt first; fine-tune only if prompt fails |
| Which model should we use? | Model Selection (L4) | Run blind evaluation on 50+ representative queries |
| AI costs are too high | Cost/Quality (L5) | Implement model routing as the highest-leverage fix |
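The last row’s “model routing” fix can be sketched in a few lines. The model names and the complexity heuristic below are placeholders — production routers typically use a trained classifier or a cheap LLM call, not keyword matching:

```python
# Hypothetical two-tier router: simple requests go to a cheap model,
# complex ones to a frontier model. Names are illustrative only.
CHEAP_MODEL = "small-model"
FRONTIER_MODEL = "frontier-model"

def route(query: str) -> str:
    # Stand-in heuristic for the idea; a real system would use a
    # classifier trained on labeled request data.
    complex_markers = ("error", "debug", "why", "troubleshoot")
    if len(query) > 200 or any(m in query.lower() for m in complex_markers):
        return FRONTIER_MODEL
    return CHEAP_MODEL
```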
The Meta-Insight
Technical literacy for PMs doesn’t mean writing code. It means asking the right questions: “Did we optimize the prompt before discussing fine-tuning?” and “What’s the minimum quality users value — and what does it cost?” Whoever asks these questions makes better decisions than the person who memorizes every benchmark.
Your Technical Literacy Checklist
What you should now be able to do:
- Choose the right prompting technique for the task complexity (zero-shot through self-consistency) — Lesson 1
- Assess whether a feature needs RAG, fine-tuning, or better prompts — Lessons 1, 2, 3
- Understand a RAG pipeline architecturally and identify chunking quality as the main lever — Lesson 2
- Make fine-tuning decisions using the 6-question matrix — Lesson 3
- Select models based on task, cost, and latency rather than leaderboard ranking — Lesson 4
- Apply multi-model routing as an architectural pattern — Lessons 4, 5
- Calculate AI feature costs and prioritize the six optimization levers — Lesson 5
- Define “good enough” before optimization begins — Lesson 5
If any of these feel uncertain, go back to the relevant lesson. These technical foundations determine whether your AI feature is production-ready — or remains an expensive experiment.
Continue with: AI Evaluation
You know the technology. Chapter 5 shows how to measure AI quality and make ship/no-ship decisions.
Self-Assessment
Three scenarios combining multiple concepts from this chapter. Think through your answer before reading the solution.
Scenario 1: The Hallucinating Support Bot
Your AI support bot answers customer questions about your products. Quality is inconsistent: the model sometimes invents features that don’t exist. Your engineering lead proposes fine-tuning on your support data. What’s your move?
Solution
Before fine-tuning (Lesson 3) even enters the conversation, work through the optimization hierarchy top-down. Hallucinated features suggest the model lacks access to current product data — that’s a RAG problem (Lesson 2), not a behavior problem. First, improve the system prompt with clear instructions like “Only answer based on the provided documents” (Lesson 1), then build a RAG pipeline over your product documentation. Fine-tuning would only be justified if prompting + RAG fail to solve the problem.
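The “only answer based on the provided documents” pattern can be sketched as follows. The prompt wording, the `build_messages` helper, and the message format are illustrative assumptions, not a specific vendor’s API:

```python
# Hedged sketch of a grounded system prompt for a RAG pipeline.
# Retrieval itself is out of scope here; `documents` stands in for
# whatever the retrieval step returns.
SYSTEM_PROMPT = (
    "You are a support assistant. Answer ONLY based on the documents "
    "below. If the answer is not in the documents, say you don't know."
)

def build_messages(question: str, documents: list[str]) -> list[dict]:
    context = "\n\n".join(documents)
    return [
        {"role": "system", "content": f"{SYSTEM_PROMPT}\n\nDocuments:\n{context}"},
        {"role": "user", "content": question},
    ]
```

The grounding instruction alone often cuts invented features noticeably; the retrieved documents then give the model something true to be grounded in.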
Scenario 2: The Expensive Default
Your AI feature uses GPT-4o for every request — from simple FAQ answers to complex troubleshooting. Monthly API costs have climbed to $50,000, and leadership wants them cut in half without hurting user satisfaction. What do you propose?
Solution
This is a textbook case for multi-model routing (Lessons 4 + 5). Analyze your request distribution: simple FAQ queries (likely 60-70% of volume) can be handled by a cheaper model, while complex diagnostics stay on the frontier model. Combine this with caching for frequently asked questions (Lesson 5). The key is to define “good enough” per request category first (Lesson 5) and validate with a blind evaluation on 50+ queries (Lesson 4) that the smaller model is sufficient for simple cases.
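A back-of-envelope check shows why routing alone can hit the target. The 65% simple-query share and the 5% price ratio below are assumed for illustration — plug in your own measured numbers:

```python
# Estimated monthly cost after routing, with ASSUMED inputs.
monthly_cost = 50_000          # current spend, all traffic on the frontier model
simple_share = 0.65            # fraction of requests routable to a cheap model
cheap_price_ratio = 0.05       # cheap model at ~5% of the frontier price

routed_cost = (monthly_cost * (1 - simple_share)                    # still frontier
               + monthly_cost * simple_share * cheap_price_ratio)   # now cheap
savings = 1 - routed_cost / monthly_cost
print(f"${routed_cost:,.0f}/month, {savings:.0%} saved")
```

Under these assumptions the spend drops to roughly $19,000/month, comfortably past the 50% target — before caching is even added.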
Scenario 3: The Tone Problem
Your product is a learning platform. The product team wants the AI to respond like an encouraging tutor — not neutral and clinical. After three weeks of prompt iteration, engineering reports: “The tone is right 70% of the time, but 30% of responses revert to the default style.” Is fine-tuning worth it?
Solution
Three lessons come into play here. First, check whether few-shot prompting with 3-5 examples of the desired style reduces the 30% outliers (Lesson 1). If not, this is a legitimate fine-tuning candidate (Lesson 3) — consistent style behavior is exactly the use case fine-tuning excels at, and three weeks of prompt iteration show that prompting has hit its limits. But before fine-tuning, check model selection (Lesson 4): some models follow style instructions better than others. Switching models costs hours; fine-tuning costs weeks.
Sources: Building on Lessons 1-5. IBM RAG vs Fine-Tuning vs Prompt Engineering, a16z LLMflation, DAIR.AI Prompt Engineering Guide, Artificial Analysis LLM Leaderboard, Pinecone RAG Architecture Guide