Putting It All Together
A single API call is useful. Chaining multiple calls together — where each step’s output becomes the next step’s input — is where the real leverage is.
This is what “compound workflows” means. And it’s the difference between using AI to answer questions and using AI to do work.
Why Chaining Matters
Consider what a single prompt can do: you describe a topic, Claude writes a summary. That’s one step.
Now consider what a chain can do:
- You name a topic
- Claude breaks it into sub-questions worth researching
- Each sub-question triggers a web search
- Each search result is read and summarized
- The summaries are synthesized into a structured report
- The report is saved to your notes app
That’s not something you could get from one prompt. It requires multiple models, multiple tools, and real-world actions — connected together so the output of each step feeds the next.
The good news: you don’t have to build this from scratch. The patterns are well-established, and Claude Code can build most of it for you once you understand what you’re asking for.
Case Study 1: Personal Research Assistant
What it does: User describes a topic. The system researches it, reads sources, and delivers a structured summary — automatically.
The flow:
User describes topic
  ↓
Claude breaks it into 3-5 search queries
  ↓
Web search runs for each query (in parallel)
  ↓
Claude reads the top results from each search
  ↓
Claude synthesizes findings into a structured report
  ↓
Report is saved to Notion

Technologies:
- Claude API (Sonnet for synthesis, Haiku for the parallel search-reading steps)
- Firecrawl (reads web pages and returns clean text)
- Notion API or MCP (saves the final output)
The “aha moment”: At step 3, you’re running 3-5 searches simultaneously. Each search returns several articles. Each article gets read and summarized independently. Claude then synthesizes all of that into something coherent. You couldn’t replicate this workflow manually in under an hour — the system can do it in under two minutes.
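The parallel fan-out at step 3 can be sketched with a thread pool. Here `summarize_article` is a hypothetical placeholder standing in for a real fetch-and-summarize call (e.g. Firecrawl plus a Haiku request); the orchestration pattern is the point.

```python
from concurrent.futures import ThreadPoolExecutor

def summarize_article(url: str) -> str:
    """Placeholder for a cheap-model call that fetches one article
    and summarizes it. Stubbed here so the sketch runs offline."""
    return f"summary of {url}"

def fan_out(urls: list[str], max_workers: int = 5) -> list[str]:
    """Summarize many articles concurrently.
    pool.map preserves input order, so results line up with urls."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(summarize_article, urls))

summaries = fan_out([f"https://example.com/article-{i}" for i in range(4)])
```

Because each article is summarized independently, a single slow or failed fetch doesn’t block the others, and the synthesis step receives all summaries at once.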
Where humans stay in the loop: The query breakdown step is a good checkpoint. Before running the searches, the system shows you the planned queries and asks for confirmation. This prevents the entire workflow from running on a misunderstanding of what you wanted.
Key cost consideration: Use Haiku for the individual article-reading steps (you might run 15-20 of these). Switch to Sonnet only for the synthesis step. This keeps costs low on the high-volume parts.
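That routing decision can live in one small lookup so every stage declares which model it needs. The stage names and model identifiers below are illustrative, not real Anthropic model IDs — check the current model list before wiring this up.

```python
# Illustrative stage-to-model routing. Model names here are
# placeholders; substitute the current Claude model IDs.
STAGE_MODELS = {
    "read_article": "claude-haiku",   # high-volume, simple task: cheap model
    "synthesize":   "claude-sonnet",  # one-off step where quality matters
}

def pick_model(stage: str) -> str:
    """Return the model for a pipeline stage, defaulting to the workhorse."""
    return STAGE_MODELS.get(stage, "claude-sonnet")
```

With 15-20 article-reading calls per run, routing those to the cheap model is where nearly all of the savings comes from.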
Case Study 2: Automated Code Review Pipeline
What it does: A developer pushes code. Before it reaches the team for human review, Claude reviews the diff, flags issues, and posts structured comments directly on the pull request.
The flow:
Developer pushes code to GitHub
  ↓
GitHub webhook fires (triggers your pipeline)
  ↓
Your code fetches the diff from the GitHub API
  ↓
Claude reads the diff + relevant context (file history, project conventions)
  ↓
Claude reviews for: bugs, security issues, style violations, missing tests
  ↓
Claude posts inline comments on the PR via the GitHub API
  ↓
Claude posts a top-level summary comment

Technologies:
- GitHub Webhooks (triggers the pipeline on every push)
- GitHub API (fetches diffs, posts comments)
- Claude API (Sonnet or Opus for the review)
Key insight: AI code review has a specific superpower — it doesn’t get fatigued. After a long day of staring at screens, humans miss things. Claude reviews every PR with the same attention. It’s particularly good at catching security issues (SQL injection patterns, hardcoded secrets, missing input validation) and inconsistencies with the rest of the codebase.
Where humans stay in the loop: Always. Claude’s comments are advisory. The human reviewer decides what to act on. The goal is to surface issues faster, not to replace judgment.
What makes this compound: The pipeline isn’t just “ask Claude to review code.” It fetches context from the git history (how has this file changed before?), pulls in the team’s style guide (what are the project conventions?), and formats output specifically for the GitHub API (inline comments on specific lines). Each step adds information that makes the final output more useful.
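The context-assembly step can be sketched as a prompt builder: parse the unified diff for changed files, then combine the diff with the team’s conventions. This is a minimal sketch, assuming the diff arrives as a standard unified-diff string from the GitHub API; the actual webhook handling and comment posting are omitted.

```python
def changed_files(diff: str) -> list[str]:
    """Extract file paths from a unified diff's '+++ b/<path>' headers."""
    prefix = "+++ b/"
    return [line[len(prefix):] for line in diff.splitlines()
            if line.startswith(prefix)]

def build_review_prompt(diff: str, style_guide: str) -> str:
    """Assemble the context-rich review prompt described above."""
    files = ", ".join(changed_files(diff))
    return (
        f"Project conventions:\n{style_guide}\n\n"
        f"Files changed: {files}\n\n"
        "Review this diff for bugs, security issues, style violations, "
        f"and missing tests:\n{diff}"
    )

sample_diff = (
    "diff --git a/app.py b/app.py\n"
    "--- a/app.py\n"
    "+++ b/app.py\n"
    "@@ -1 +1 @@\n"
    "-old\n"
    "+new\n"
)
prompt = build_review_prompt(sample_diff, "Use type hints everywhere.")
```

Each piece of added context (conventions, file list, the diff itself) is exactly what makes the final review more useful than a bare “review this code” prompt.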
Case Study 3: Multi-Agent Project Builder
What it does: User describes a project they want built. Multiple specialized agents collaborate — each doing what it’s best at — and the final product is delivered without the user doing the implementation work.
The flow:
User describes the project
  ↓
Planner agent: researches approach, writes implementation plan
  ↓
Security agent: reviews the plan for risks and vulnerabilities
  ↓
[Human checkpoint: approve or revise the plan]
  ↓
Developer agent: implements the project based on the approved plan
  ↓
Reviewer agent: checks code quality, tests, and documentation
  ↓
Coordinator agent: summarizes what was built and next steps

Technologies:
- Claude Code (orchestrates the agents, runs the code)
- Custom agent definitions (each agent has a CLAUDE.md with its role and instructions)
- Shared memory system (each agent reads context from previous agents)
Key insight: This pattern has been used to build real projects, including documentation sites like this one. Specialized agents working in sequence produce results that are faster and more thorough than any single agent or human working alone. Specialization matters — a Security agent is prompted specifically to find problems, which means it finds things a general-purpose agent would miss.
Where humans stay in the loop: At the checkpoint after the plan is written. Before any code gets written, the human approves the approach. This single gate catches the most expensive mistakes — it’s far cheaper to revise a plan than to revise an implementation.
What makes this compound: Each agent doesn’t start from scratch. The Developer agent reads the Planner’s research notes and the Security agent’s risk assessment. Memory flows through the chain. The final output reflects the thinking of all five previous steps.
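The memory flow can be sketched as a loop where each agent receives everything written before it. `run_agent` is a hypothetical stand-in for invoking one specialized agent (via Claude Code or the API); the accumulation pattern is what matters.

```python
def run_agent(role: str, context: str) -> str:
    """Placeholder for one specialized agent invocation.
    Stubbed: returns a canned note mentioning how much prior
    context it received."""
    return f"[{role}] notes based on {len(context)} chars of prior context"

def run_pipeline(task: str, roles: list[str]) -> dict[str, str]:
    """Run agents in sequence; each reads all earlier outputs."""
    memory: dict[str, str] = {"task": task}
    for role in roles:
        prior = "\n".join(memory.values())  # everything written so far
        memory[role] = run_agent(role, prior)
    return memory

result = run_pipeline(
    "build a CLI tool",
    ["planner", "security", "developer", "reviewer", "coordinator"],
)
```

Because `prior` grows with each step, the coordinator at the end sees the full chain of reasoning, not just the immediately preceding output.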
Patterns That Recur
Look across all three case studies and the same design principles show up.
Break complex tasks into stages. One agent can’t research, synthesize, review, and save simultaneously without losing quality. Each stage gets a focused prompt doing one thing well. The output is more reliable and easier to debug when something goes wrong.
Use the right model for each stage. Haiku is fast and cheap. Use it for high-volume steps where the individual task is simple (read one article, classify one email, extract one field). Sonnet is your workhorse for most real work. Opus is for steps where quality matters more than cost — complex reasoning, synthesis of conflicting information, high-stakes outputs.
Build in human checkpoints. Every one of these workflows has at least one point where a human reviews before the pipeline continues. This isn’t a limitation — it’s a feature. AI working autonomously on ambiguous inputs creates compounding mistakes. A human checkpoint at the right moment catches the misunderstanding before it propagates through five more steps.
State management between steps. Each step needs to pass information forward. This can be simple (write to a file, read from a file) or structured (a database, a memory system). The important thing is that later steps can access what earlier steps learned. Don’t make Claude re-derive context it already had.
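The simple end of that spectrum is a JSON file each step appends to. A minimal sketch, with the filename chosen here for illustration:

```python
import json
from pathlib import Path

STATE = Path("workflow_state.json")  # illustrative filename

def save_step(name: str, output: dict) -> None:
    """Record one step's output without clobbering earlier steps."""
    state = json.loads(STATE.read_text()) if STATE.exists() else {}
    state[name] = output
    STATE.write_text(json.dumps(state, indent=2))

def load_state() -> dict:
    """Later steps read everything earlier steps learned."""
    return json.loads(STATE.read_text()) if STATE.exists() else {}

save_step("search", {"queries": ["topic overview", "recent criticism"]})
save_step("synthesis", {"report": "draft report text"})
state = load_state()
```

A later step can then pull `state["search"]["queries"]` into its prompt instead of asking Claude to re-derive what was already decided.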
Getting Started
You don’t need to build a six-agent pipeline on your first attempt.
Start with a two-step chain: one call that generates something, a second that evaluates or transforms it. Get that working. Then add a third step. Compound workflows are built incrementally — not designed all at once.
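A two-step chain is small enough to show whole. `call_model` below is a stub standing in for a real Claude API request (the anthropic SDK’s `messages.create`), so the shape of the chain is visible without credentials:

```python
def call_model(prompt: str) -> str:
    """Placeholder for a real Claude API call. Stubbed with canned
    responses so the chain runs offline."""
    if prompt.startswith("Write"):
        return "Draft: three bullet points about the topic."
    return "Verdict: clear, but add a concrete example."

def two_step_chain(topic: str) -> dict:
    # Step 1: generate something.
    draft = call_model(f"Write a short summary of {topic}.")
    # Step 2: feed step 1's output into an evaluation prompt.
    critique = call_model(f"Evaluate this summary for clarity:\n{draft}")
    return {"draft": draft, "critique": critique}

result = two_step_chain("compound workflows")
```

The second call’s prompt contains the first call’s output verbatim — that single line is the whole idea of chaining. Swap the stub for real API calls and add a third step when this works.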
When you’re ready to go further:
- Anthropic Cookbook — Agent patterns — working code for common agentic patterns
- Claude Code documentation — how to use Claude Code to build and run agents