One server. Unlimited agents.
Most managed AI tools charge per agent. You end up paying for three or four separate subscriptions to run specialized agents — a researcher, an operator, a support triage bot.
Hermes OS takes a different approach: you pay for compute, not for agents. Spin up unlimited agent profiles on a single instance. Each profile has its own memory, its own tools, its own personality — and they can coordinate with each other.
Why multiple agents outperform one generalist
A single agent that handles everything — research, coding, customer support, content writing — has to context-switch constantly. Its system prompt grows bloated trying to cover every role. The memory store accumulates unrelated history that competes for context space on every task.
Specialized agents are more focused. A research agent configured for competitive intelligence has a system prompt, tool set, and memory structure optimized for that job. A coding agent has different tool access and different working memory. Each one is sharper at its task than a generalist would be.
This is how serious AI deployments are being built in practice in 2026 — not one big agent, but a portfolio of focused ones with defined handoff points between them.
Isolated memory, shared infrastructure
Each agent profile on Hermes OS has a completely separate memory store. The research agent does not see the coding agent's task history and vice versa. This prevents the interference that happens when a generalist agent tries to apply patterns from one domain to an unrelated task.
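Hermes OS does not publish its internals, but the isolation model can be sketched in a few lines: each profile owns its own memory store and nothing is shared between them (the `AgentProfile` class and its fields below are illustrative, not the real implementation):

```python
from dataclasses import dataclass, field

@dataclass
class AgentProfile:
    """Illustrative model of a profile: each instance owns its own memory."""
    name: str
    memory: list = field(default_factory=list)  # isolated per profile

    def remember(self, entry: str) -> None:
        self.memory.append(entry)

research = AgentProfile("research")
coding = AgentProfile("coding")

research.remember("competitor pricing notes")
coding.remember("csv importer task log")

# Each profile sees only its own history
assert research.memory == ["competitor pricing notes"]
assert coding.memory == ["csv importer task log"]
```

The point of the sketch is the default: cross-profile reads are impossible unless you build an explicit handoff, which is the opposite of a generalist agent's single shared history.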
At the infrastructure level, the profiles share compute resources — the same vCPU pool and RAM allocation. This is efficient: not every agent is active simultaneously, so the compute that would sit idle on a per-agent plan is instead available across the full profile set.
Practical configurations
The most common setup on the Pro plan (2 vCPU, 4 GB RAM): two or three agent profiles running a mix of research and coding tasks on different schedules. Morning: the research agent runs its daily brief. Afternoon: the coding agent handles a batch of file-processing tasks. Overlap is minimal and the compute is shared efficiently.
On the Power plan (4 vCPU, 8 GB RAM): four to six agents running in parallel. A support agent triages incoming messages while a research agent runs daily briefs and a coding agent handles a batch of file-processing tasks. Burst CPU on the shared pool keeps things responsive when several agents wake at once.
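The staggered-schedule idea behind both configurations can be sketched as a lookup from hour of day to active profiles. The profile names and hour windows below are hypothetical examples, not a Hermes OS API:

```python
# Hypothetical schedule: map each profile to the hours (0-23) it is active.
schedule = {
    "research": range(8, 10),   # morning daily brief
    "coding":   range(13, 17),  # afternoon file-processing batch
    "support":  range(9, 18),   # business-hours triage
}

def active_profiles(hour: int) -> list[str]:
    """Return which profiles are awake at a given hour of the day."""
    return [name for name, hours in schedule.items() if hour in hours]

print(active_profiles(9))   # -> ['research', 'support']
print(active_profiles(14))  # -> ['coding', 'support']
print(active_profiles(3))   # -> [] — idle hours cost nothing extra
```

Because at most two of the three profiles are ever active at once in this layout, the shared vCPU pool rarely sees all agents contending at the same moment.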
- Unlimited agent profiles per instance — no per-agent pricing
- Each agent has isolated memory, tools, and configuration
- Manage all agents from a single dashboard
- Individual scheduling per agent profile
- Compute shared efficiently across inactive profiles
- Burst CPU on Power when capacity allows
Is there a limit on how many agent profiles I can create?
No hard limit. The practical ceiling is your compute pool — vCPU and RAM. On the Pro plan, two or three active agents run comfortably; on the Power plan, six to eight.
Can different agents use different AI models?
Yes. Each agent profile can be configured with its own model preference. One agent can use Claude Sonnet while another uses Haiku for cheaper high-frequency tasks — all from the same API key.
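As a concept sketch, per-profile model routing amounts to a lookup table consulted when a task is dispatched. The model identifiers and the `pick_model` helper here are illustrative, not the actual configuration fields:

```python
# Hypothetical per-profile model preferences, all billed to one API key.
MODEL_PREFS = {
    "research": "claude-sonnet",  # deeper reasoning for daily briefs
    "triage":   "claude-haiku",   # cheaper model for high-frequency tasks
}
DEFAULT_MODEL = "claude-haiku"

def pick_model(profile: str) -> str:
    """Resolve which model a profile's next task should run on."""
    return MODEL_PREFS.get(profile, DEFAULT_MODEL)

assert pick_model("research") == "claude-sonnet"
assert pick_model("new-profile") == "claude-haiku"  # falls back to the default
```

The design choice worth noting: routing by profile rather than by task keeps billing predictable, since a high-frequency agent can never silently escalate to the expensive model.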
What is the difference between Pro and Power for multi-agent use?
Both support unlimited profiles. Power (4 vCPU, 8 GB RAM) adds the shared key vault, agent coordination features, and enough compute headroom to run 4-6 agents simultaneously without resource contention.
Can two agents write to the same output — like a shared document or spreadsheet?
This depends on your external tool configuration. Agents can be given access to the same Google Sheet, Notion database, or file share. Coordination on write timing is handled at the task level — you define which agent writes first and what the handoff looks like.
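One common shape for that handoff is strict write-then-read ordering: the first agent finishes its write before the second starts. A minimal sketch, assuming both agents are callable tasks sharing a file path (all names and the CSV content are hypothetical):

```python
from pathlib import Path

def research_writes(path: Path) -> None:
    """First agent in the handoff: produces the initial rows."""
    path.write_text("competitor,price\nAcme,49\n")

def coding_appends(path: Path) -> None:
    """Second agent: runs only after the first write has completed."""
    rows = path.read_text()
    path.write_text(rows + "Globex,99\n")

shared = Path("shared_report.csv")

# Handoff defined at the task level: research first, then coding.
for step in (research_writes, coding_appends):
    step(shared)

assert "Acme,49" in shared.read_text()
assert "Globex,99" in shared.read_text()
```

Running the steps sequentially, rather than letting both agents write concurrently, is what makes the shared-output pattern safe without any locking on the external tool's side.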