
Synthesis: Agentic AI

You’ve worked through four lessons: how multi-agent systems distribute tasks to specialists (Lesson 1), how tool use and MCP make agents capable of action (Lesson 2), which autonomy levels exist and how to choose them (Lesson 3), and how human-in-the-loop (HITL) patterns build human oversight into the architecture (Lesson 4).

Individually, these are technical concepts and design patterns. Together, they form a model for the central question in agentic AI product management: How much should our AI do before checking with a human — and how do we earn the right to do more?

Tool use (Lesson 2) makes individual agents capable of action. Multi-agent systems (Lesson 1) emerge when a single agent hits the limits of its tools. The decision “add a new tool vs. create a new agent” is one of the most critical architecture choices — too many tools overwhelm a single agent, too many agents create coordination overhead.

For you as a PM: Start with one agent and good tools. Only split into multiple agents when the single agent demonstrably fails (typically once the tool set exceeds roughly 15 tools) or when tasks clearly require different trust boundaries.

Autonomy levels (Lesson 3) define HOW MUCH the agent may do. HITL patterns (Lesson 4) define HOW human oversight is implemented. L3 (Consultant) typically maps to Approval Gates, L4 (Approver) to Escalation Triggers, L5 (Observer) to Checkpoint Audits.

For you as a PM: Determine the autonomy level first — the HITL pattern follows from it. Not the other way around.
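The level-to-pattern mapping above can be sketched as a simple lookup. The level and pattern names follow Lessons 3 and 4; the code structure itself is an illustrative assumption, not an implementation from the course:

```python
# Illustrative sketch: autonomy level (Lesson 3) determines the HITL
# pattern (Lesson 4), never the other way around.

AUTONOMY_TO_HITL = {
    "L3_consultant": "approval_gate",       # human approves before the action
    "L4_approver":   "escalation_trigger",  # agent acts, escalates edge cases
    "L5_observer":   "checkpoint_audit",    # human reviews after the fact
}

def hitl_pattern(autonomy_level: str) -> str:
    """Derive the HITL pattern from the autonomy level."""
    return AUTONOMY_TO_HITL[autonomy_level]

print(hitl_pattern("L3_consultant"))  # approval_gate
```

The direction of the lookup encodes the PM guidance: the autonomy level is the input, the oversight pattern is the output.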

Multi-agent systems (Lesson 1) multiply HITL complexity (Lesson 4). With a single agent, the question “when does it ask a human?” is straightforward. With 4 agents working in parallel, each needs its own escalation criteria, and the orchestrator must decide whether an escalation pauses the entire pipeline or just one branch.

For you as a PM: More agents mean exponentially more HITL design decisions. This is a hidden cost driver of multi-agent architectures.
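The orchestrator decision described above (does an escalation pause the whole pipeline or just one branch?) can be made concrete in a small sketch. All class and field names here are hypothetical:

```python
from dataclasses import dataclass

# Hypothetical sketch: each agent branch carries its own escalation
# criteria; the orchestrator decides the blast radius of an escalation.

@dataclass
class AgentBranch:
    name: str
    escalate_on: set            # per-agent escalation criteria
    paused: bool = False

@dataclass
class Orchestrator:
    branches: list
    pause_whole_pipeline: bool = False  # the design decision from the text

    def handle(self, branch: AgentBranch, event: str) -> None:
        if event not in branch.escalate_on:
            return
        if self.pause_whole_pipeline:
            for b in self.branches:
                b.paused = True         # one escalation stops everything
        else:
            branch.paused = True        # only the affected branch waits

billing = AgentBranch("billing", {"refund_over_limit"})
support = AgentBranch("support", {"legal_threat"})
orch = Orchestrator([billing, support], pause_whole_pipeline=False)
orch.handle(billing, "refund_over_limit")
print(billing.paused, support.paused)  # True False
```

Note that this one boolean (`pause_whole_pipeline`) has to be decided per pipeline, and with four parallel agents there are four sets of `escalate_on` criteria to design, which is exactly the hidden cost driver mentioned above.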

MCP (Lesson 2) standardizes WHICH tools are available. Autonomy levels (Lesson 3) determine HOW FREELY the agent uses those tools. An agent with MCP access to a database at L2 shows query results. The same agent at L4 executes writes and notifies the user afterward.

For you as a PM: Tool access and autonomy level are two independent dimensions that together define an agent’s capabilities and risks.
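The two independent dimensions can be combined in a single decision function. The L2/L4 semantics follow the database example above; tool names and the scope table are illustrative assumptions:

```python
# Sketch of the two independent dimensions: which tools the agent can
# reach (via MCP) and how freely it may use them (autonomy level).
# Tool names and scopes are assumed for illustration.

TOOL_SCOPES = {"db.query": "read", "db.write": "write"}

def decide(tool: str, autonomy_level: int) -> str:
    scope = TOOL_SCOPES[tool]
    if scope == "read":
        return "execute"                 # reads are shown at any level here
    if autonomy_level >= 4:
        return "execute_then_notify"     # L4: act, inform the user afterward
    return "show_proposal"               # L2: present the result, human decides

print(decide("db.write", 2))  # show_proposal
print(decide("db.write", 4))  # execute_then_notify
```

The same tool access yields different behavior at different levels, which is why the two dimensions must be set separately when assessing capability and risk.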

An often overlooked aspect: Agentic workflows consume 10-100x more tokens per task than single-call inference. A 5-step agent with tool calls can consume 50,000+ tokens per task. The token economics from Chapter 4 need to be recalculated for agents — multi-agent systems multiply this effect further.
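The recalculation is back-of-the-envelope arithmetic. Using the figures from the text (a 5-step agent at 50,000+ tokens per task versus a single call) and an assumed placeholder price, not a real rate:

```python
# Token economics sketch. Token counts come from the text above;
# the price is an ASSUMED blended rate, not a real provider price.

single_call_tokens = 1_000     # typical single-call inference
agent_task_tokens = 50_000     # 5-step agent with tool calls (from the text)
price_per_1k_tokens = 0.01     # assumed blended price, USD

def cost(tokens: int) -> float:
    return tokens / 1_000 * price_per_1k_tokens

print(f"single call: ${cost(single_call_tokens):.2f}")             # $0.01
print(f"agent task:  ${cost(agent_task_tokens):.2f}")              # $0.50
print(f"multiplier:  {agent_task_tokens // single_call_tokens}x")  # 50x
```

A multi-agent system repeats this per agent, so the multiplier compounds before any orchestration overhead is counted.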

Trust is the central product variable in agentic AI. Every step up the trust gradient — from tool calls to multi-agent coordination to full autonomy — requires evidence, reversibility, transparency, and control. Products that start at the right trust level for their domain and provide mechanisms to move up and down will win.

What you should now be able to do:

  • Decide when multi-agent architecture is justified and choose the right orchestration pattern — Lesson 1
  • Curate a tool set for an AI agent and use MCP as the integration standard — Lesson 2
  • Determine the right autonomy level per user segment and task type — Lesson 3
  • Choose the right HITL pattern and optimize it with metrics over time — Lesson 4
  • Answer the trust gradient question: “How much may the AI do before it asks?” — All lessons

If any of these feel uncertain, go back to the relevant lesson. These concepts form the foundation for any AI product that goes beyond pure text generation.

You build AI systems. Chapter 7 shows how to make them responsible and compliant.

Three scenarios combining multiple concepts from this chapter. Think through your answer before revealing the solution.

Your recruiting agent is supposed to pre-screen applications and present a shortlist to hiring managers. The engineering team built an agent that uses MCP to access the ATS (Applicant Tracking System), scores applications, and forwards top candidates directly to interview scheduling — without hiring manager approval. The agent works flawlessly from a technical standpoint. What’s the problem?

Solution

The autonomy level is wrong (Lesson 3). Recruiting decisions have high consequences and are hard to reverse — this requires L3 (Consultant) at most, not L4 or L5. The agent should present a shortlist and wait for approval, not schedule interviews on its own. The appropriate HITL pattern (Lesson 4) is an Approval Gate: the agent prepares the decision, the human makes it. The fact that the agent has MCP write access to the ATS (Lesson 2) makes things worse — tool access and autonomy level are two independent dimensions, and both are set too high here.

Your customer success agent now has 22 tools: CRM, ticketing, billing, knowledge base, email, calendar, Slack, and more. Error rates are climbing — the agent increasingly picks the wrong tool for the job. Your engineering team proposes rebuilding as a multi-agent system. Is that the right call?

Solution

The diagnosis is correct: 22 tools overwhelm a single agent (Lesson 1 identifies 15 tools as the threshold). But before switching to multi-agent, first check whether all 22 tools are truly needed — often a reduction to core tools plus better prompts for tool selection is enough. If multi-agent is justified, group tools by trust boundaries (Lesson 1): e.g., one agent for read-only queries (CRM, knowledge base), one agent for actions with customer impact (ticketing, billing). Keep in mind the HITL multiplier (Lesson 4): each agent needs its own escalation criteria, and you must define whether an escalation pauses the entire pipeline or just one branch.

Your AI product is a coding assistant that uses MCP to access the repository, CI pipeline, and issue tracker. For experienced senior developers, it works well at L4 (Approver) — they briefly review changes and merge. But junior developers complain that the agent changes too much and they lose track. How do you solve this?

Solution

This is a case for different autonomy levels per user segment (Lesson 3). Instead of a uniform level, you need an autonomy gradient: seniors stay at L4 (Approver), juniors start at L2 (Collaborator) — the agent suggests changes and explains them rather than implementing directly. The HITL patterns (Lesson 4) change accordingly: seniors get Checkpoint Audits, juniors need Approval Gates at every step. The key: build a mechanism that lets juniors graduate to higher levels over time based on usage data — that’s the trust gradient in practice.
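The graduation mechanism in this solution could take the form of a simple rule over usage data. The thresholds, field names, and minimum sample size here are assumptions for illustration:

```python
# Illustrative graduation rule for the autonomy gradient: a junior moves
# from L2 toward L4 once usage data shows a track record. All thresholds
# are assumed values.

def next_autonomy_level(current: int, accepted: int, rejected: int) -> int:
    total = accepted + rejected
    if total < 50:                        # not enough evidence yet
        return current
    acceptance_rate = accepted / total
    if acceptance_rate >= 0.9 and current < 4:
        return current + 1                # earn the next level on the gradient
    if acceptance_rate < 0.5 and current > 2:
        return current - 1                # trust can also move down
    return current

print(next_autonomy_level(2, accepted=95, rejected=5))   # 3
print(next_autonomy_level(3, accepted=20, rejected=30))  # 2
```

The downgrade path matters as much as the upgrade path: the trust gradient from the synthesis runs in both directions.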


Sources: Building on Lessons 1-4. Anthropic MCP Docs (2024-2025), Feng et al. — Levels of Autonomy (2025), MIT AI Agent Index (2025), Martin Fowler — Humans and Agents (2025)

Part of AI Learning — free courses from prompt to production.