---
name: "agent-workflow-designer"
description: "Agent Workflow Designer"
---

# Agent Workflow Designer

**Tier:** POWERFUL
**Category:** Engineering
**Domain:** Multi-Agent Systems / AI Orchestration

---

## Overview

Design production-grade multi-agent orchestration systems. Covers five core patterns (sequential pipeline, parallel fan-out/fan-in, hierarchical delegation, event-driven, consensus), platform-specific implementations, handoff protocols, state management, error recovery, context window budgeting, and cost optimization.

---

## Core Capabilities

- Pattern selection guide for any orchestration requirement
- Handoff protocol templates (structured context passing)
- State management patterns for multi-agent workflows
- Error recovery and retry strategies
- Context window budget management
- Cost optimization strategies per platform
- Platform-specific configs: Claude Code Agent Teams, OpenClaw, CrewAI, AutoGen

---

## When to Use

- Building a multi-step AI pipeline that exceeds one agent's context capacity
- Parallelizing research, generation, or analysis tasks for speed
- Creating specialist agents with defined roles and handoff contracts
- Designing fault-tolerant AI workflows for production

---

## Pattern Selection Guide

```
Is the task sequential (each step needs previous output)?
  YES → Sequential Pipeline
  NO  → Can tasks run in parallel?
    YES → Parallel Fan-out/Fan-in
    NO  → Is there a hierarchy of decisions?
      YES → Hierarchical Delegation
      NO  → Is it event-triggered?
        YES → Event-Driven
        NO  → Need consensus/validation?
          YES → Consensus Pattern
```
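The same decision tree can be written as a small routing helper. This is a sketch: the boolean flags are assumptions about how you would characterize a task, not part of any framework API.

```python
def select_pattern(
    sequential: bool,
    parallelizable: bool = False,
    hierarchical: bool = False,
    event_triggered: bool = False,
    needs_consensus: bool = False,
) -> str:
    """Walk the decision tree above and return the recommended pattern."""
    if sequential:
        return "Sequential Pipeline"
    if parallelizable:
        return "Parallel Fan-out/Fan-in"
    if hierarchical:
        return "Hierarchical Delegation"
    if event_triggered:
        return "Event-Driven"
    if needs_consensus:
        return "Consensus Pattern"
    return "Single agent (no orchestration needed)"
```

Encoding the tree this way makes the fall-through explicit: if none of the conditions hold, no orchestration is warranted (see the over-orchestration pitfall at the end of this document).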
---

## Pattern 1: Sequential Pipeline

**Use when:** Each step depends on the previous output. Research → Draft → Review → Polish.

```python
# sequential_pipeline.py
from dataclasses import dataclass

import anthropic

@dataclass
class PipelineStage:
    name: str
    system_prompt: str
    input_key: str   # what to take from state
    output_key: str  # what to write to state
    model: str = "claude-3-5-sonnet-20241022"
    max_tokens: int = 2048

class SequentialPipeline:
    def __init__(self, stages: list[PipelineStage]):
        self.stages = stages
        self.client = anthropic.Anthropic()

    def run(self, initial_input: str) -> dict:
        state = {"input": initial_input}

        for stage in self.stages:
            print(f"[{stage.name}] Processing...")

            stage_input = state.get(stage.input_key, "")

            response = self.client.messages.create(
                model=stage.model,
                max_tokens=stage.max_tokens,
                system=stage.system_prompt,
                messages=[{"role": "user", "content": stage_input}],
            )

            state[stage.output_key] = response.content[0].text
            state[f"{stage.name}_tokens"] = response.usage.input_tokens + response.usage.output_tokens

            print(f"[{stage.name}] Done. Tokens: {state[f'{stage.name}_tokens']}")

        return state

# Example: Blog post pipeline
pipeline = SequentialPipeline([
    PipelineStage(
        name="researcher",
        system_prompt="You are a research specialist. Given a topic, produce a structured research brief with: key facts, statistics, expert perspectives, and controversy points.",
        input_key="input",
        output_key="research",
    ),
    PipelineStage(
        name="writer",
        system_prompt="You are a senior content writer. Using the research provided, write a compelling 800-word blog post with a clear hook, 3 main sections, and a strong CTA.",
        input_key="research",
        output_key="draft",
    ),
    PipelineStage(
        name="editor",
        system_prompt="You are a copy editor. Review the draft for: clarity, flow, grammar, and SEO. Return the improved version only, no commentary.",
        input_key="draft",
        output_key="final",
    ),
])
```
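The state-dict chaining is independent of any model API. Stripped to its skeleton, with plain functions standing in for model calls, the pattern is just a fold over stages:

```python
def run_stages(stages, initial_input):
    """Each stage reads one key from state and writes another: the core of
    the sequential pipeline, with plain functions instead of model calls."""
    state = {"input": initial_input}
    for name, fn, input_key, output_key in stages:
        state[output_key] = fn(state.get(input_key, ""))
    return state

# Stub stages mirroring the blog pipeline: research -> draft -> final
stages = [
    ("researcher", lambda t: f"facts about {t}", "input", "research"),
    ("writer", lambda r: f"draft using [{r}]", "research", "draft"),
    ("editor", lambda d: d.upper(), "draft", "final"),
]
result = run_stages(stages, "pricing")
```

Keeping intermediate outputs in the state dict (rather than overwriting one variable) means any later stage can reach back to earlier outputs, and the full trace survives for debugging.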
---

## Pattern 2: Parallel Fan-out / Fan-in

**Use when:** Independent tasks that can run concurrently. Research 5 competitors simultaneously.

```python
# parallel_fanout.py
import asyncio

import anthropic

async def run_agent(
    client,
    task_name: str,
    system: str,
    user: str,
    model: str = "claude-3-5-sonnet-20241022",
) -> dict:
    """Single async agent call"""
    loop = asyncio.get_event_loop()

    def _call():
        return client.messages.create(
            model=model,
            max_tokens=2048,
            system=system,
            messages=[{"role": "user", "content": user}],
        )

    response = await loop.run_in_executor(None, _call)
    return {
        "task": task_name,
        "output": response.content[0].text,
        "tokens": response.usage.input_tokens + response.usage.output_tokens,
    }

async def parallel_research(competitors: list[str], research_type: str) -> dict:
    """Fan-out: research all competitors in parallel. Fan-in: synthesize results."""
    client = anthropic.Anthropic()

    # FAN-OUT: spawn parallel agent calls
    tasks = [
        run_agent(
            client,
            task_name=competitor,
            system=f"You are a competitive intelligence analyst. Research {competitor} and provide: pricing, key features, target market, and known weaknesses.",
            user=f"Analyze {competitor} for comparison with our product in the {research_type} market.",
        )
        for competitor in competitors
    ]

    results = await asyncio.gather(*tasks, return_exceptions=True)

    # Handle failures gracefully
    successful = [r for r in results if not isinstance(r, Exception)]
    failed = [r for r in results if isinstance(r, Exception)]

    if failed:
        print(f"Warning: {len(failed)} research tasks failed: {failed}")

    # FAN-IN: synthesize
    combined_research = "\n\n".join([
        f"## {r['task']}\n{r['output']}" for r in successful
    ])

    synthesis = await run_agent(
        client,
        task_name="synthesizer",
        system="You are a strategic analyst. Synthesize competitor research into a concise comparison matrix and strategic recommendations.",
        user=f"Synthesize these competitor analyses:\n\n{combined_research}",
        model="claude-3-5-sonnet-20241022",
    )

    return {
        "individual_analyses": successful,
        "synthesis": synthesis["output"],
        "total_tokens": sum(r["tokens"] for r in successful) + synthesis["tokens"],
    }
```
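The gather-then-filter shape can be exercised without any API by substituting stub coroutines. The failure below is simulated; the point is that `return_exceptions=True` lets the fan-in proceed on whatever succeeded:

```python
import asyncio

async def fetch(name: str) -> dict:
    # Stub agent: simulate one failure to show return_exceptions handling
    if name == "bad":
        raise RuntimeError("simulated failure")
    await asyncio.sleep(0)  # yield control, as a real API call would
    return {"task": name, "output": f"analysis of {name}"}

async def fan_out_fan_in(names: list[str]) -> dict:
    results = await asyncio.gather(*(fetch(n) for n in names), return_exceptions=True)
    successful = [r for r in results if not isinstance(r, Exception)]
    failed = [r for r in results if isinstance(r, Exception)]
    # FAN-IN: combine whatever succeeded
    combined = "\n".join(r["output"] for r in successful)
    return {"combined": combined, "failures": len(failed)}

summary = asyncio.run(fan_out_fan_in(["acme", "bad", "globex"]))
```

Without `return_exceptions=True`, a single failed competitor would raise out of `gather` and discard the other results, which is rarely what you want in a research fan-out.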
---

## Pattern 3: Hierarchical Delegation

**Use when:** Complex tasks with subtask discovery. Orchestrator breaks down work, delegates to specialists.

```python
# hierarchical_delegation.py
import json

import anthropic

ORCHESTRATOR_SYSTEM = """You are an orchestration agent. Your job is to:
1. Analyze the user's request
2. Break it into subtasks
3. Assign each to the appropriate specialist agent
4. Collect results and synthesize

Available specialists:
- researcher: finds facts, data, and information
- writer: creates content and documents
- coder: writes and reviews code
- analyst: analyzes data and produces insights

Respond with a JSON plan:
{
  "subtasks": [
    {"id": "1", "agent": "researcher", "task": "...", "depends_on": []},
    {"id": "2", "agent": "writer", "task": "...", "depends_on": ["1"]}
  ]
}"""

SPECIALIST_SYSTEMS = {
    "researcher": "You are a research specialist. Find accurate, relevant information and cite sources when possible.",
    "writer": "You are a professional writer. Create clear, engaging content in the requested format.",
    "coder": "You are a senior software engineer. Write clean, well-commented code with error handling.",
    "analyst": "You are a data analyst. Provide structured analysis with evidence-backed conclusions.",
}

class HierarchicalOrchestrator:
    def __init__(self):
        self.client = anthropic.Anthropic()

    def run(self, user_request: str) -> str:
        # 1. Orchestrator creates plan
        plan_response = self.client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=1024,
            system=ORCHESTRATOR_SYSTEM,
            messages=[{"role": "user", "content": user_request}],
        )

        plan = json.loads(plan_response.content[0].text)
        results = {}

        # 2. Execute subtasks respecting dependencies
        for subtask in self._topological_sort(plan["subtasks"]):
            context = self._build_context(subtask, results)
            specialist = SPECIALIST_SYSTEMS[subtask["agent"]]

            result = self.client.messages.create(
                model="claude-3-5-sonnet-20241022",
                max_tokens=2048,
                system=specialist,
                messages=[{"role": "user", "content": f"{context}\n\nTask: {subtask['task']}"}],
            )
            results[subtask["id"]] = result.content[0].text

        # 3. Final synthesis
        all_results = "\n\n".join([f"### {k}\n{v}" for k, v in results.items()])
        synthesis = self.client.messages.create(
            model="claude-3-5-sonnet-20241022",
            max_tokens=2048,
            system="Synthesize the specialist outputs into a coherent final response.",
            messages=[{"role": "user", "content": f"Original request: {user_request}\n\nSpecialist outputs:\n{all_results}"}],
        )
        return synthesis.content[0].text

    def _build_context(self, subtask: dict, results: dict) -> str:
        if not subtask.get("depends_on"):
            return ""
        deps = [f"Output from task {dep}:\n{results[dep]}" for dep in subtask["depends_on"] if dep in results]
        return ("Previous results:\n" + "\n\n".join(deps)) if deps else ""

    def _topological_sort(self, subtasks: list) -> list:
        # Simple ordered execution respecting depends_on
        ordered, remaining = [], list(subtasks)
        completed = set()
        while remaining:
            for task in remaining:
                if all(dep in completed for dep in task.get("depends_on", [])):
                    ordered.append(task)
                    completed.add(task["id"])
                    remaining.remove(task)
                    break
            else:
                # No runnable task left: the plan has a cycle or a missing id.
                # Without this guard the while loop would spin forever.
                raise ValueError(f"Unresolvable dependencies in plan: {remaining}")
        return ordered
```
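The dependency-ordering step is worth testing in isolation, since a model-generated plan can contain cycles or reference unknown task ids. A standalone sketch of the same ordering loop, with a guard against unresolvable plans:

```python
def order_subtasks(subtasks: list[dict]) -> list[dict]:
    """Return subtasks in an order where every depends_on id runs first.
    Raises ValueError if the plan contains a cycle or unknown dependency."""
    ordered, remaining, completed = [], list(subtasks), set()
    while remaining:
        runnable = [t for t in remaining
                    if all(d in completed for d in t.get("depends_on", []))]
        if not runnable:
            raise ValueError("cyclic or unresolvable dependencies")
        for t in runnable:
            ordered.append(t)
            completed.add(t["id"])
            remaining.remove(t)
    return ordered

plan = [
    {"id": "2", "agent": "writer", "depends_on": ["1"]},
    {"id": "1", "agent": "researcher", "depends_on": []},
]
execution = [t["id"] for t in order_subtasks(plan)]
```

Rejecting a bad plan up front is cheaper than discovering mid-run that a subtask is waiting on output that will never arrive.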
---

## Handoff Protocol Template

```python
# Standard handoff context format — use between all agents
from dataclasses import dataclass

@dataclass
class AgentHandoff:
    """Structured context passed between agents in a workflow."""
    task_id: str
    workflow_id: str
    step_number: int
    total_steps: int

    # What was done
    previous_agent: str
    previous_output: str
    artifacts: dict  # {"filename": "content"} for any files produced

    # What to do next
    current_agent: str
    current_task: str
    constraints: list[str]  # hard rules for this step

    # Metadata
    context_budget_remaining: int  # tokens left for this agent
    cost_so_far_usd: float

    def to_prompt(self) -> str:
        return f"""
# Agent Handoff — Step {self.step_number}/{self.total_steps}

## Your Task
{self.current_task}

## Constraints
{chr(10).join(f'- {c}' for c in self.constraints)}

## Context from Previous Step ({self.previous_agent})
{self.previous_output[:2000]}{"... [truncated]" if len(self.previous_output) > 2000 else ""}

## Context Budget
You have approximately {self.context_budget_remaining} tokens remaining. Be concise.
"""
```
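A downstream agent can only rely on a handoff if its shape is enforced at the boundary. A minimal validation sketch for handoffs serialized as plain dicts; the field names come from the template above, but the helper itself is hypothetical:

```python
REQUIRED_FIELDS = {
    "task_id": str, "workflow_id": str, "step_number": int, "total_steps": int,
    "previous_agent": str, "previous_output": str, "artifacts": dict,
    "current_agent": str, "current_task": str, "constraints": list,
    "context_budget_remaining": int, "cost_so_far_usd": float,
}

def validate_handoff(payload: dict) -> list[str]:
    """Return a list of problems; an empty list means the handoff is usable."""
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected):
            problems.append(f"{field}: expected {expected.__name__}, "
                            f"got {type(payload[field]).__name__}")
    return problems
```

Running this check before each step turns "silent failure two agents later" into an immediate, attributable error at the handoff that caused it.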
---

## Error Recovery Patterns

```python
import time
from functools import wraps

def with_retry(max_attempts=3, backoff_seconds=2, fallback_model=None):
    """Decorator for agent calls with exponential backoff and model fallback."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception as e:
                    last_error = e
                    if attempt < max_attempts - 1:
                        # Fall back to cheaper/faster model on rate limit.
                        # Note: this only takes effect when the caller passes
                        # model as a keyword argument.
                        if fallback_model and "rate_limit" in str(e).lower():
                            kwargs["model"] = fallback_model
                        wait = backoff_seconds * (2 ** attempt)
                        print(f"Attempt {attempt+1} failed: {e}. Retrying in {wait}s...")
                        time.sleep(wait)
            raise last_error
        return wrapper
    return decorator

@with_retry(max_attempts=3, fallback_model="claude-3-haiku-20240307")
def call_agent(*, model, system, user):
    ...
```
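The retry behavior can be verified without an API by pointing the same decorator shape at a deliberately flaky function; backoff is set to zero here so the demo runs instantly:

```python
import time
from functools import wraps

def with_retry(max_attempts=3, backoff_seconds=0):
    """Same retry skeleton, minus model fallback, for demonstration."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(max_attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception as e:
                    last_error = e
                    if attempt < max_attempts - 1:
                        time.sleep(backoff_seconds * (2 ** attempt))
            raise last_error
        return wrapper
    return decorator

calls = {"n": 0}

@with_retry(max_attempts=3)
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return "ok"

result = flaky()  # succeeds on the third attempt
```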
---

## Context Window Budgeting

```python
# Budget context across a multi-step pipeline
# Rule: never let any step consume more than 60% of remaining budget

CONTEXT_LIMITS = {
    "claude-3-5-sonnet-20241022": 200_000,
    "gpt-4o": 128_000,
}

class ContextBudget:
    def __init__(self, model: str, reserve_pct: float = 0.2):
        total = CONTEXT_LIMITS.get(model, 128_000)
        self.total = total
        self.reserve = int(total * reserve_pct)  # keep 20% as buffer
        self.used = 0

    @property
    def remaining(self):
        return self.total - self.reserve - self.used

    def allocate(self, step_name: str, requested: int) -> int:
        allocated = min(requested, int(self.remaining * 0.6))  # max 60% of remaining
        print(f"[Budget] {step_name}: allocated {allocated:,} tokens (remaining: {self.remaining:,})")
        return allocated

    def consume(self, tokens_used: int):
        self.used += tokens_used

def truncate_to_budget(text: str, token_budget: int, chars_per_token: float = 4.0) -> str:
    """Rough truncation — use tiktoken for precision."""
    char_budget = int(token_budget * chars_per_token)
    if len(text) <= char_budget:
        return text
    return text[:char_budget] + "\n\n[... truncated to fit context budget ...]"
```
---

## Cost Optimization Strategies

| Strategy | Savings | Tradeoff |
|---|---|---|
| Use Haiku for routing/classification | 85-90% | Slightly less nuanced judgment |
| Cache repeated system prompts | 50-90% | Requires prompt caching setup |
| Truncate intermediate outputs | 20-40% | May lose detail in handoffs |
| Batch similar tasks | 50% | Latency increases |
| Use Sonnet for most, Opus for final step only | 60-70% | Final quality may improve |
| Short-circuit on confidence threshold | 30-50% | Need confidence scoring |
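The routing savings in the first row follow directly from the price ratio between models, so figures like these can be sanity-checked with simple arithmetic. The 1:12 ratio below is an assumed placeholder, not current pricing:

```python
def blended_savings(cheap_fraction: float, price_ratio: float) -> float:
    """Fraction of spend saved when `cheap_fraction` of calls move to a model
    that costs 1/price_ratio as much as the default model."""
    # Normalize the expensive model's price to 1.0 per call
    blended = (1 - cheap_fraction) * 1.0 + cheap_fraction * (1.0 / price_ratio)
    return 1.0 - blended

# If 95% of routing/classification calls move to a model 12x cheaper:
savings = blended_savings(cheap_fraction=0.95, price_ratio=12)  # ~0.87
```

A 12x price gap with 95% of traffic routed cheap yields roughly 87% savings, which is consistent with the 85-90% range in the table.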
---

## Common Pitfalls

- **Circular dependencies** — agents calling each other in loops; enforce DAG structure at design time
- **Context bleed** — passing entire previous output to every step; summarize or extract only what's needed
- **No timeout** — a stuck agent blocks the whole pipeline; always set max_tokens and wall-clock timeouts
- **Silent failures** — agent returns plausible but wrong output; add validation steps for critical paths
- **Ignoring cost** — 10 parallel Opus calls can run $0.50+ per workflow; model selection is a cost decision
- **Over-orchestration** — if a single prompt can do it, it should; only add agents when genuinely needed
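The timeout pitfall has a one-function mitigation in async pipelines. A sketch using `asyncio.wait_for`; the stuck agent here is simulated:

```python
import asyncio

async def call_with_timeout(agent_coro, seconds: float):
    """Wrap any agent call so a hung step fails fast instead of blocking the pipeline."""
    try:
        return await asyncio.wait_for(agent_coro, timeout=seconds)
    except asyncio.TimeoutError:
        return {"error": f"agent exceeded {seconds}s wall-clock timeout"}

async def stuck_agent():
    await asyncio.sleep(10)  # simulates a hung call
    return {"output": "never reached"}

outcome = asyncio.run(call_with_timeout(stuck_agent(), seconds=0.05))
```

Returning an error dict rather than raising keeps the fan-in logic uniform: a timed-out step is handled the same way as any other failed result.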