How to Build a Multi-Agent Team

When I wrote the original version of this post, I used a fictional content team to illustrate what a good orchestrator prompt looks like. I called the team lead Marco. He managed a researcher, a writer, and an editor. He received briefs, broke them into tasks, assigned them to the right person, and made sure each handoff contained what the next person needed to start without guessing.

I published the post without noticing that I had described my actual setup. I have an agent called Marco. He handles my content operation. The fictional example was real. I just hadn’t looked at it directly yet.

A single AI assistant stretched across different types of work gets mediocre at all of them. You ask it to research a topic and write about it and you get something adequate. The research doesn’t go deep enough. The writing is generic. Not because the model is bad, but because it’s switching contexts and balancing competing priorities in a single thread.

Split the work into a specialist researcher and a specialist writer and the output changes. The researcher goes after counterarguments and primary sources. The writer gets a real brief and produces something that has a point of view. Same inputs, different architecture.

I run 14 agents. Each one has a defined scope and doesn’t cross into another’s territory. Sofia handles research. Marco drafts. Luca edits and gates. If Luca sends a piece back to Marco twice and it’s still not right, it escalates to me. There is no third revision round. Ren handles distribution. Giulia manages the vault. Nico handles personal coaching on work and health. Marcus runs philosophical sparring sessions. Dex, Morgan, and Miles handle fitness, personal finance, and travel planning respectively.
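The editing gate described above (two send-backs, then escalation, no third round) can be sketched as a small loop. This is only an illustration, not the actual setup: `Review`, `run_edit_gate`, and the `review` and `revise` callables are hypothetical names standing in for calls to the editor and drafting agents.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

MAX_SEND_BACKS = 2  # from the post: two send-backs, no third revision round


@dataclass
class Review:
    approved: bool
    notes: str = ""


def run_edit_gate(
    draft: str,
    review: Callable[[str], Review],    # hypothetical stand-in for the editor agent
    revise: Callable[[str, str], str],  # hypothetical stand-in for the drafting agent
) -> Tuple[str, str]:
    """Return (status, draft), where status is 'approved' or 'escalated'."""
    send_backs = 0
    while True:
        verdict = review(draft)
        if verdict.approved:
            return "approved", draft
        if send_backs == MAX_SEND_BACKS:
            # Already sent back twice and still not right: hand it to the human.
            return "escalated", draft
        draft = revise(draft, verdict.notes)
        send_backs += 1
```

The hard cap is what matters here: the loop cannot run a third revision, so a piece that keeps failing always ends up in front of a person.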

That roster doesn’t need a constant orchestrator. Most of the time I invoke agents directly. I want a LinkedIn post adapted, I go to Ren. I need a research brief, I go to Sofia. Vera, my orchestrator, comes in only when a task spans multiple agents and needs sequencing. Everything else routes directly.

The decision to make Vera opt-in was not the original design. Early on she was routing more automatically. The problem was that she added a layer of coordination on requests that didn’t need coordination. If I asked for a single-agent task and Vera intercepted it, I got a planning step before the work. That is not a multiplier. That is latency.

Her current brief reads: “You do not intercept ambiguous requests on your own.” I name her explicitly when I want her. Otherwise the system routes directly.
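A minimal sketch of the opt-in rule, under stated assumptions: requests start with the name of the agent being addressed, the roster is a plain dictionary, and an unnamed request goes back to the user rather than to the orchestrator. The post does not say how unnamed requests are handled, so that last branch is a guess, and `addressed_agent` and `dispatch` are hypothetical names.

```python
from typing import Optional

ORCHESTRATOR = "vera"

# Partial roster taken from the post; the dictionary shape is an assumption.
AGENTS = {
    "sofia": "research",
    "marco": "drafting",
    "luca": "editing",
    "ren": "distribution",
}


def addressed_agent(request: str) -> Optional[str]:
    """Return the name the request opens with, if it is a known agent."""
    tokens = request.split()
    if not tokens:
        return None
    name = tokens[0].strip(",:;.!?@").lower()
    if name == ORCHESTRATOR or name in AGENTS:
        return name
    return None


def dispatch(request: str) -> str:
    name = addressed_agent(request)
    if name == ORCHESTRATOR:
        return "orchestrate"      # explicitly invoked: plan a multi-agent sequence
    if name is not None:
        return f"direct:{name}"   # single-agent task: no planning step
    return "ask_user"             # assumption: ambiguity goes back to the human
```

With this shape, `dispatch("Ren, adapt this post for LinkedIn")` skips the orchestrator entirely, and the orchestrator only ever sees requests that open with its name.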

This matters because the failure mode of most orchestrator designs is over-orchestration. The orchestrator becomes the bottleneck. Every request passes through a coordination step regardless of complexity. The result is a slower system that adds overhead without adding judgment.

The other failure mode is under-briefing. Most orchestrator prompts focus on routing: receive a task, decide which agent handles it, send it there. That is the easy part. What most orchestrators miss is what Vera’s prompt calls out explicitly: “what each agent needs to know to do the job right.”

Routing sends a task. Briefing gives an agent what they need to start without guessing. A researcher who gets “write a brief on multi-agent systems” will produce something different from one who gets the target audience, the specific angle, the counterarguments to address, and the examples to avoid because they’ve already been covered. Same agent, different brief, very different output.
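One way to make the routing versus briefing difference concrete is to give the brief a structure, so that gaps are visible before the handoff. The sketch below is an assumption about how that could look, not the format used in the setup described here; the `Brief` class, its field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Brief:
    task: str
    audience: str = ""
    angle: str = ""
    counterarguments: List[str] = field(default_factory=list)
    avoid_examples: List[str] = field(default_factory=list)

    def missing(self) -> List[str]:
        """Names of fields left empty, i.e. what the agent would have to guess."""
        return [name for name, value in vars(self).items() if not value]

    def render(self) -> str:
        """Format the filled-in fields as the text handed to the agent."""
        lines = [f"Task: {self.task}"]
        if self.audience:
            lines.append(f"Audience: {self.audience}")
        if self.angle:
            lines.append(f"Angle: {self.angle}")
        if self.counterarguments:
            lines.append("Address: " + "; ".join(self.counterarguments))
        if self.avoid_examples:
            lines.append("Avoid: " + "; ".join(self.avoid_examples))
        return "\n".join(lines)


# Routing only: the agent gets a task and nothing else.
routed = Brief(task="write a brief on multi-agent systems")

# Briefing: same task, with illustrative (made-up) context attached.
briefed = Brief(
    task="write a brief on multi-agent systems",
    audience="solo operators",
    angle="briefing matters more than routing",
    counterarguments=["one capable model is enough"],
    avoid_examples=["the content team example"],
)
```

Calling `routed.missing()` lists every field the agent would have to invent for itself, which is a cheap check to run before any handoff.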

The same care applies to matching agent type to problem type, not just task type. I have been planning a trip to Penang for July. I could route the whole thing through Miles. Miles handles logistics, itineraries, budget breakdowns. But at some point the question stopped being about what to do in Penang and started being about whether going alone is actually what I want right now. That is not a travel problem. So I brought in Nico, my personal coaching agent, instead. The task on paper was trip planning. The real problem was something closer to how I want to spend my time and who I want to spend it with. Miles cannot help with that. Nico can.

That is a different kind of design question than most agent writeups ask. It is not “which agent is most capable.” It is “which agent type fits the problem type.”

The system also has a technical enforcement layer that I didn’t plan for upfront. My agents violated a hard voice rule repeatedly: no em-dashes. It was in every brief as a constraint. They kept producing them. At some point I stopped treating it as an instruction problem and deployed a Stop hook, a shell script that blocks the em-dash character before delivery. Instruction is one layer of control. Technical enforcement is another.
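The hook described above is a shell script. The sketch below shows the same idea in Python instead, as an illustration only: scan the outgoing text and return a non-zero status if an em-dash appears. The function names are hypothetical, and how a given hook runner interprets the return value (for example, whether non-zero blocks delivery) is an assumption to check against that tool's documentation.

```python
import sys
from typing import List

EM_DASH = "\u2014"  # written as an escape so this file itself stays clean


def em_dash_lines(text: str) -> List[int]:
    """Return the 1-based numbers of lines that contain an em-dash."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if EM_DASH in line
    ]


def gate(text: str) -> int:
    """Return 0 if the text is clean, 1 if it should be blocked."""
    hits = em_dash_lines(text)
    if hits:
        print(f"blocked: em-dash on line(s) {hits}", file=sys.stderr)
        return 1
    return 0
```

To use it as a standalone check, a script could end with `sys.exit(gate(sys.stdin.read()))`. The point is that the check runs on the output itself, so it holds whether or not the agent followed the instruction.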

None of this was designed in advance. Vera became opt-in because automatic orchestration added complexity I didn’t need. The Stop hook came after the em-dash problem refused to die as an instruction.

The insight isn’t that you need 14 agents. It’s that the agents are the easy part. Any capable model can be scoped to a domain and given a role. What takes longer to get right is the orchestration layer: when it runs, what it hands off, and what each agent actually needs to start working. Routing is table stakes. Briefing is what makes the system work.

If you’re building toward something like this, start smaller than you think you need. One specialist agent for one type of work you do regularly. See whether the output quality improves and whether the interaction adds or removes friction. Add a second agent when you hit a clear handoff point. Build the orchestration layer last, once you understand what actually needs coordinating.

The fictional Marco who managed my content team in the first draft of this post was doing the right job. I just hadn’t noticed yet that he was already real.