Claude Managed Agents: Anthropic Now Runs Your Agents For You

If you’ve been building agents with the Claude API, you know the drill. You write the agent loop. You manage the tool execution. You set up containers or sandboxes. You handle prompt caching, context windows, retries, and streaming. You wire up file systems and networking. It works, but it’s a lot of plumbing.

Anthropic just said “we’ll handle all of that.”

Claude Managed Agents is a new beta API that lets you spin up autonomous Claude agents running in Anthropic’s cloud. They get their own container, their own tools, their own environment. You just define what the agent should do, point it at a task, and stream the results back.

I’ve been playing with it since the docs went live, and I think it’s the most significant thing Anthropic has shipped for developers since Claude Code itself. Think of it this way: if the Messages API is like renting a GPU, Managed Agents is like hiring an employee who comes with their own laptop and office.

Let’s break down how it works, what you can build with it, and what I’ve learned so far.

The Four Building Blocks

Managed Agents is built on four core concepts that work together. If you’ve used Claude Code or the Agent SDK, the mental model will feel familiar.

1. Agent

An Agent is a reusable configuration. It defines which model to use, the system prompt, which tools are available, and any MCP servers or skills. You create one via the API and get back an agent ID you can reuse across sessions.

from anthropic import Anthropic
client = Anthropic()

agent = client.beta.agents.create(
    name="Code Reviewer",
    model="claude-sonnet-4-6",
    system="You are a senior code reviewer. Review code for bugs, performance issues, and security vulnerabilities. Be direct and specific.",
    tools=[{"type": "agent_toolset_20260401"}],
)

print(agent.id)  # ag_01ABC...

The agent_toolset_20260401 tool type gives your agent access to the full built-in toolset: Bash, file operations (Read, Write, Edit), search (Glob, Grep), and web access (Web Search, Web Fetch). You can also restrict which tools are available if you don’t want your agent running shell commands, for example.

Every update to an agent automatically creates a new version. You can list the version history and pin sessions to a specific version, which is useful in production when you want to test changes before rolling them out.

2. Environment

An Environment is the container template your agent runs in. This is where you configure packages, networking rules, and mounted files.

environment = client.beta.environments.create(
    name="python-dev",
    config={
        "type": "cloud",
        "packages": {
            "pip": ["pytest", "requests", "pandas"],
            "apt": ["git", "curl"],
        },
        "networking": {"type": "unrestricted"},
    },
)

You can pre-install packages via pip, npm, apt, cargo, gem, and go. The packages get cached across sessions sharing the same environment, so subsequent sessions start up faster.

Networking can be unrestricted (default) or limited with an allowlist of domains. If you’re building an agent that processes sensitive data, the limited option lets you lock down exactly where it can reach.
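To make the allowlist idea concrete, here is a minimal sketch of how domain allowlisting typically behaves. It is purely illustrative: the function name is_allowed and the matching rule (exact host or subdomain match) are my own assumptions, not Anthropic's implementation or config schema.

```python
from urllib.parse import urlparse


def is_allowed(url: str, allowlist: list[str]) -> bool:
    """Return True if the URL's host is an allowlisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    for domain in allowlist:
        domain = domain.lower().lstrip(".")
        # Require a dot boundary so "evil-github.com" does not match "github.com"
        if host == domain or host.endswith("." + domain):
            return True
    return False


allowlist = ["pypi.org", "github.com"]  # hypothetical allowlist
print(is_allowed("https://api.github.com/repos", allowlist))  # True
print(is_allowed("https://evil-github.com/", allowlist))      # False
```

The dot-boundary check is the detail worth noticing: a naive suffix match would let lookalike domains through.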

You can also mount GitHub repositories directly into the environment:

environment = client.beta.environments.create(
    name="my-repo-env",
    config={
        "type": "cloud",
        "github": {
            "repository": "myorg/myrepo",
            "branch": "main",
        },
        "packages": {
            "pip": ["pytest"],
        },
        "networking": {"type": "unrestricted"},
    },
)

This clones the repo into the container when a session starts. Combine this with a code review agent and you’ve got yourself an automated PR reviewer.

3. Session

A Session is a running instance of an agent in an environment. Each session gets its own isolated container with its own filesystem, processes, and network stack. This is where the work actually happens.

session = client.beta.sessions.create(
    agent=agent.id,
    environment_id=environment.id,
)

print(session.id)      # ses_01XYZ...
print(session.status)  # "running"

Sessions persist until you explicitly close them or they time out. While a session is running, your agent maintains state across multiple messages, just like a conversation in Claude Code.

4. Events

Events are how you communicate with a running session. You send user messages and stream back agent responses, tool use notifications, and status updates via Server-Sent Events (SSE).

with client.beta.sessions.events.stream(session.id) as stream:
    # Send a message to the agent
    client.beta.sessions.events.send(session.id, events=[{
        "type": "user.message",
        "content": [{"type": "text", "text": "Review the code in src/ for security issues"}],
    }])

    # Stream the response
    for event in stream:
        if event.type == "agent.message":
            for block in event.content:
                # Not every content block carries text, so guard the access
                if hasattr(block, "text"):
                    print(block.text, end="")
        elif event.type == "agent.tool_use":
            print(f"\n[Using tool: {event.name}]")
        elif event.type == "session.status_idle":
            print("\n[Agent finished]")
            break

The SSE stream gives you real-time visibility into what the agent is doing. You can see when it reads files, runs commands, searches the web, and everything in between. You can also interrupt the agent mid-execution if it goes off track.

Putting It All Together

Here’s a complete working example that creates a coding assistant, gives it a task, and streams the output:

from anthropic import Anthropic

client = Anthropic()

# 1. Create agent
agent = client.beta.agents.create(
    name="Python Developer",
    model="claude-sonnet-4-6",
    system="""You are an expert Python developer. When given a task:
1. Plan your approach first
2. Write clean, well-tested code
3. Run the tests to verify everything works
4. Fix any issues before reporting completion""",
    tools=[{"type": "agent_toolset_20260401"}],
)

# 2. Create environment with common Python packages
environment = client.beta.environments.create(
    name="python-env",
    config={
        "type": "cloud",
        "packages": {
            "pip": ["pytest", "requests", "pydantic", "fastapi", "uvicorn"],
        },
        "networking": {"type": "unrestricted"},
    },
)

# 3. Start a session
session = client.beta.sessions.create(
    agent=agent.id,
    environment_id=environment.id,
)

# 4. Give it a task and stream the response
with client.beta.sessions.events.stream(session.id) as stream:
    client.beta.sessions.events.send(session.id, events=[{
        "type": "user.message",
        "content": [{"type": "text", "text": """
            Build a FastAPI app with:
            - A /health endpoint
            - A /fibonacci/{n} endpoint that returns the nth fibonacci number
            - Input validation (n must be between 1 and 1000)
            - Unit tests for both endpoints
            Run the tests and make sure they all pass.
        """}],
    }])

    for event in stream:
        if event.type == "agent.message":
            for block in event.content:
                if hasattr(block, 'text'):
                    print(block.text, end="")
        elif event.type == "agent.tool_use":
            print(f"\n  → {event.name}")
        elif event.type == "session.status_idle":
            break

# 5. Send a follow-up message in the same session
with client.beta.sessions.events.stream(session.id) as stream:
    client.beta.sessions.events.send(session.id, events=[{
        "type": "user.message",
        "content": [{"type": "text", "text": "Now add a /prime/{n} endpoint that checks if n is prime. Add tests for it too."}],
    }])

    for event in stream:
        if event.type == "agent.message":
            for block in event.content:
                if hasattr(block, 'text'):
                    print(block.text, end="")
        elif event.type == "session.status_idle":
            break

The agent goes through the full cycle: it plans the approach, creates the files, writes the code, installs dependencies, runs the tests, and reports back. And because the session persists, the follow-up message has full context of what was already built.

This is the same ReAct loop (Reason, Act, Observe) that powers Claude Code. The difference is you didn’t have to build any of it.
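The two streaming loops in that example are nearly identical, so it may be worth factoring them into a helper. Here's a minimal sketch, assuming events have the shape used in this post (type, content, name). The name render_event is my own, and the fake events built with SimpleNamespace only stand in for real SDK objects.

```python
from types import SimpleNamespace


def render_event(event) -> tuple[str, bool]:
    """Turn one stream event into (text_to_print, is_finished)."""
    if event.type == "agent.message":
        # Blocks without a text attribute contribute nothing instead of crashing
        text = "".join(getattr(block, "text", "") for block in event.content)
        return text, False
    if event.type == "agent.tool_use":
        return f"\n  → {event.name}\n", False
    if event.type == "session.status_idle":
        return "", True
    return "", False  # ignore event types this sketch does not handle


# Fake events standing in for a real stream (assumed shapes)
fake_stream = [
    SimpleNamespace(type="agent.tool_use", name="bash"),
    SimpleNamespace(type="agent.message", content=[SimpleNamespace(text="All tests pass.")]),
    SimpleNamespace(type="session.status_idle"),
]

for event in fake_stream:
    text, done = render_event(event)
    print(text, end="")
    if done:
        break
```

In a real session you would iterate over the stream object instead of fake_stream. Keeping the rendering logic in a pure function also makes it easy to unit test without hitting the API.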

Custom Tools and MCP Servers

The built-in toolset covers a lot, but sometimes you need your agent to interact with external systems. Managed Agents supports two ways to do this.

Custom Tools

You can define custom tools with JSON Schema. When the agent decides to use one, it emits a structured request. Your application executes the tool and sends the result back.

agent = client.beta.agents.create(
    name="Deploy Assistant",
    model="claude-sonnet-4-6",
    system="You help with deployments. Use the deploy tool to trigger deployments.",
    tools=[
        {"type": "agent_toolset_20260401"},
        {
            "type": "custom",
            "name": "trigger_deploy",
            "description": "Trigger a deployment to a specified environment",
            "input_schema": {
                "type": "object",
                "properties": {
                    "service": {"type": "string", "description": "Service name"},
                    "environment": {"type": "string", "enum": ["staging", "production"]},
                    "version": {"type": "string", "description": "Version tag to deploy"},
                },
                "required": ["service", "environment", "version"],
            },
        },
    ],
)

When the agent calls trigger_deploy, you handle the event in your stream and send back the result. This lets you connect agents to your internal APIs, databases, deployment pipelines, or anything else.
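On your side, handling a custom tool call is mostly dispatch and validation. The sketch below is deliberately SDK-free: run_custom_tool, handle_trigger_deploy, and the result dict shape are all hypothetical. How you send the result back to the session depends on the event format in the docs, which I'm not reproducing here.

```python
from typing import Any, Callable


def handle_trigger_deploy(args: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical handler; a real one would call your deployment pipeline."""
    missing = [k for k in ("service", "environment", "version") if k not in args]
    if missing:
        return {"ok": False, "error": f"missing fields: {', '.join(missing)}"}
    if args["environment"] not in ("staging", "production"):
        return {"ok": False, "error": f"unknown environment: {args['environment']}"}
    return {
        "ok": True,
        "message": f"Deploying {args['service']}@{args['version']} to {args['environment']}",
    }


# Map tool names (as declared on the agent) to local handler functions
HANDLERS: dict[str, Callable[[dict[str, Any]], dict[str, Any]]] = {
    "trigger_deploy": handle_trigger_deploy,
}


def run_custom_tool(name: str, args: dict[str, Any]) -> dict[str, Any]:
    """Look up the handler for a tool call and run it, reporting unknown tools as errors."""
    handler = HANDLERS.get(name)
    if handler is None:
        return {"ok": False, "error": f"unknown tool: {name}"}
    return handler(args)


result = run_custom_tool(
    "trigger_deploy",
    {"service": "api", "environment": "staging", "version": "v1.4.2"},
)
print(result)
```

Returning errors as data rather than raising means the agent can see what went wrong and retry with corrected arguments.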

MCP Servers

If you’ve been following the Model Context Protocol (MCP), you’ll be happy to know Managed Agents supports connecting to MCP servers. This means you can give your agent access to tools from Slack, GitHub, Jira, or any other MCP-compatible service without writing custom integration code.

Multi-Agent Orchestration (Research Preview)

This is where things get really interesting.

Managed Agents includes a research preview of multi-agent orchestration. You can create a coordinator agent that delegates tasks to specialized worker agents. Each worker runs in its own “thread” with an isolated conversation context, but they all share the same container and filesystem.

Here’s a practical example: a coordinator that delegates code review and testing to separate specialist agents.

# Create specialist agents
reviewer = client.beta.agents.create(
    name="Code Reviewer",
    model="claude-sonnet-4-6",
    system="You are an expert code reviewer. Focus on code quality, bugs, and security.",
    tools=[{"type": "agent_toolset_20260401"}],
)

tester = client.beta.agents.create(
    name="Test Writer",
    model="claude-sonnet-4-6",
    system="You write comprehensive test suites. Focus on edge cases and error handling.",
    tools=[{"type": "agent_toolset_20260401"}],
)

# Create coordinator that can delegate to specialists
coordinator = client.beta.agents.create(
    name="Tech Lead",
    model="claude-opus-4-6",
    system="""You are a tech lead managing a code review process. You have two team members:
- Code Reviewer: Reviews code for quality, bugs, and security
- Test Writer: Writes comprehensive tests

Delegate review and testing tasks to the appropriate specialist, then synthesize their findings into a final report.""",
    tools=[{"type": "agent_toolset_20260401"}],
    agents=[
        {"agent_id": reviewer.id, "name": "Code Reviewer"},
        {"agent_id": tester.id, "name": "Test Writer"},
    ],
)

The coordinator decides when to delegate. It sends tasks to workers, receives their results, and synthesizes everything into a final output. The workers share the filesystem, so the test writer can see the code the reviewer flagged and write tests targeting those specific areas.

Currently, only one level of delegation is supported (coordinator to workers). Workers can’t delegate further. But even with this limitation, the pattern is powerful. Think about CI/CD pipelines where different agents handle linting, testing, security scanning, and deployment.

How It Compares to Other Approaches

Anthropic now has several overlapping agent products. Here’s how I think about when to use what:

Approach	You manage	Best for
Messages API	Everything (loop, tools, containers)	Maximum control, custom architectures
Agent SDK	Tool execution, containers	Claude Code’s tools as a library in your app
Managed Agents	Just the prompt and task	Backend automation, CI/CD, microservices
Claude Code CLI	Nothing, it’s interactive	Interactive development
Claude Cowork	Nothing, it’s a desktop app	Non-technical users

The way I see it:

Use the Messages API when you need full control over the agent loop, want to use a different model for different steps, or have very specific requirements that don’t fit the managed model.

Use the Agent SDK when you want Claude Code’s proven tools (file editing, search, bash) but want to run everything in your own infrastructure.

Use Managed Agents when you want autonomous agents running in the cloud without managing any infrastructure. This is the sweet spot for backend automation: PR review bots, code generation pipelines, testing agents, documentation generators.

Use Claude Code when you’re a developer working interactively on your own code.

Pricing Breakdown

Let’s talk money. Managed Agents has two main cost components: token usage and session runtime.

Token pricing is the same as the standard API:

Model	Input	Output
Opus 4.6	$5/MTok	$25/MTok
Sonnet 4.6	$3/MTok	$15/MTok
Haiku 4.5	$1/MTok	$5/MTok

Session runtime costs $0.08 per session-hour, metered to the millisecond. You pay only while the session status is “running.” Idle time is free.

A worked example from the docs: a 1-hour coding session with Opus 4.6 consuming 50K input tokens and 15K output tokens costs approximately $0.705 total ($0.25 input + $0.375 output + $0.08 runtime).
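If you want to sanity-check that arithmetic or plug in your own numbers, here's a small calculator. The prices come from the table in this section. The dictionary keys and the session_cost helper are my own naming, and the sketch ignores web search and caching.

```python
# USD per million tokens, from the pricing table in this post
PRICES = {
    "opus-4.6": {"input": 5.00, "output": 25.00},
    "sonnet-4.6": {"input": 3.00, "output": 15.00},
    "haiku-4.5": {"input": 1.00, "output": 5.00},
}
RUNTIME_PER_HOUR = 0.08  # USD per session-hour while status is "running"


def session_cost(model: str, input_tokens: int, output_tokens: int, running_hours: float) -> float:
    """Estimate session cost as token cost plus runtime cost."""
    price = PRICES[model]
    token_cost = (
        input_tokens / 1_000_000 * price["input"]
        + output_tokens / 1_000_000 * price["output"]
    )
    return token_cost + running_hours * RUNTIME_PER_HOUR


# The worked example: 1 hour, Opus 4.6, 50K input tokens, 15K output tokens
print(f"${session_cost('opus-4.6', 50_000, 15_000, 1.0):.3f}")  # $0.705
```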

That’s remarkably cheap for a fully managed container with networking, pre-installed packages, and a complete tool suite. I haven’t benchmarked it, but a comparable self-hosted setup on AWS (compute, container orchestration, and your own tool orchestration layer) would likely cost more once you count engineering time.

Web search, if you use it, is billed separately at $10 per 1,000 searches.

Prompt caching discounts apply too (cache reads at 0.1x), which matters because agent sessions tend to have long, repetitive prefixes.
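To get a feel for how much the 0.1x cache-read rate matters, here's a rough sketch. It's a simplification: it models only cache reads and ignores any surcharge for writing to the cache. The cached_fraction parameter is a hypothetical knob for "what share of input tokens were served from cache."

```python
def input_cost_with_cache(
    input_tokens: int,
    price_per_mtok: float,
    cached_fraction: float,
    cache_read_multiplier: float = 0.1,  # cache reads billed at 0.1x, per this post
) -> float:
    """Estimate input cost when part of the input is read from cache."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    billable = fresh + cached * cache_read_multiplier
    return billable / 1_000_000 * price_per_mtok


# 2M input tokens on Sonnet 4.6 ($3/MTok): no caching vs. 90% cache hits
no_cache = input_cost_with_cache(2_000_000, 3.00, 0.0)
mostly_cached = input_cost_with_cache(2_000_000, 3.00, 0.9)
print(f"{no_cache:.2f} vs {mostly_cached:.2f}")  # 6.00 vs 1.14
```

Under these assumptions, a long-running session that keeps re-sending the same prefix pays a fraction of the uncached input cost.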

What I’d Build With This

Here are the use cases I’m most excited about:

Automated PR review pipeline. Mount a GitHub repo, create a code review agent, and trigger a session on every PR. The agent reads the diff, checks for bugs and security issues, runs tests, and posts a review. You could even use multi-agent orchestration with separate reviewers for security, performance, and code quality.

Self-healing CI/CD. When a test fails in CI, spin up a managed agent session. Give it the failing test output and access to the codebase. Let it diagnose the issue, write a fix, and open a PR. For flaky tests and simple regressions, this could eliminate a huge amount of developer toil.

Documentation generation. Point an agent at a codebase and have it generate or update API documentation, README files, and changelogs. Because it has access to Bash and can actually run the code, it can verify that its examples work.

Data pipeline debugging. When a data pipeline fails, spin up an agent with access to your pipeline code, logs, and monitoring tools (via MCP). It can trace through the failure, identify the root cause, and suggest or implement a fix.

Code migration. Need to migrate from one framework to another? Create a specialized migration agent, give it the old codebase and the new framework’s docs (via web search), and let it work through the conversion file by file.

Getting Started

Installation is straightforward. The SDKs support Python, TypeScript, Java, Go, C#, Ruby, and PHP.

# Python
pip install anthropic

# TypeScript
npm install @anthropic-ai/sdk

# CLI
brew install anthropics/tap/ant

All requests need the beta header managed-agents-2026-04-01. The SDKs set this automatically when you use the client.beta.agents namespace.
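If you're calling the API without an SDK, you'd set the header yourself. Here's a minimal sketch. The header names x-api-key, anthropic-version, and anthropic-beta are the standard Anthropic API headers. build_headers is my own helper, and I'm assuming the usual 2023-06-01 version value applies to this beta too.

```python
import os


def build_headers(api_key: str) -> dict[str, str]:
    """Build request headers for a raw HTTP call to the Managed Agents beta."""
    return {
        "x-api-key": api_key,
        "anthropic-version": "2023-06-01",  # assumed to match the standard API
        "anthropic-beta": "managed-agents-2026-04-01",
        "content-type": "application/json",
    }


# Falls back to a placeholder so the snippet runs without a real key
headers = build_headers(os.environ.get("ANTHROPIC_API_KEY", "sk-placeholder"))
print(headers["anthropic-beta"])
```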

Some research preview features (multi-agent, memory, outcomes) are gated separately. You’ll need to request access to them at claude.com/form/claude-managed-agents.

The full documentation is at platform.claude.com/docs/en/managed-agents/overview.

The Bigger Picture

Managed Agents represents a significant shift in how Anthropic positions itself. They’re moving from being a model provider (you call our API and get tokens back) to being an agent platform (you define a task and we handle the execution).

The analogy I keep coming back to is AWS. On the compute side, Amazon started with EC2 (raw compute, you manage everything) and gradually moved up the stack to Lambda (managed functions, you just write the code). Anthropic started with the Messages API (raw model access, you build the loop) and is now at Managed Agents (managed execution, you define the agent).

And the multi-agent research preview hints at where this is going next. Teams of specialized agents, coordinated by a manager, all running in Anthropic’s cloud. We’re not far from being able to define an entire software team as a set of agent configurations.

If you’ve been building agents from scratch or wrestling with context engineering to keep your agents on track, Managed Agents removes a massive amount of infrastructure work. The agent loop, the tool execution, the sandboxing, the prompt caching, the context compaction. All handled.

Go try it out and let me know what you build.