The 12 Building Blocks
The platform is an abstraction layer over 12 AgentCore concepts. Each one maps from a YAML declaration in your blueprint to a fully wired AWS service. The developer declares what; the platform handles how.
Overview
+-----------------+     +--------------+     +--------------+
| Agent (microVM) |---->| Gateway      |---->| Lambda fn    |
| Long-running    |     | (MCP proxy)  |     | Short, fast  |
| Stateful        |     |              |     | Stateless    |
| Streaming       |     |              |     | < 30s        |
+-----------------+     +--------------+     +--------------+
Agents live on Runtime. Tools live on Lambda (or any backend). Gateway bridges them. Everything else — Memory, Identity, Policy, Observability, Evaluation — wraps around the agent at the Runtime layer.
Block 1: Runtime
AgentCore Runtime hosts agents in isolated microVMs per session. The contract: expose POST /invocations and GET /ping on port 8080. Runtime handles scaling, warm pools, session routing, TLS, and lifecycle management. Each user session gets its own isolated microVM; sessions auto-terminate after an idle timeout (default: 15 minutes) or a maximum lifetime of 8 hours.
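To make the contract concrete, here is a minimal standard-library sketch of a server that answers the two required routes. It is an illustration only: the platform generates the real entrypoint, and the response bodies, the `prompt` field, and the names `route`, `ContractHandler`, and `make_server` are assumptions made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def route(method, path, payload=None):
    """Pure routing logic for the two contract endpoints."""
    if method == "GET" and path == "/ping":
        # Health check; the body shape is an assumption.
        return 200, {"status": "healthy"}
    if method == "POST" and path == "/invocations":
        # A real agent would run its reasoning loop here; this stub echoes.
        prompt = (payload or {}).get("prompt", "")
        return 200, {"output": f"echo: {prompt}"}
    return 404, {"error": "not found"}


class ContractHandler(BaseHTTPRequestHandler):
    """HTTP adapter that delegates every request to route()."""

    def _reply(self, status, body):
        data = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def do_GET(self):
        self._reply(*route("GET", self.path))

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        self._reply(*route("POST", self.path, payload))

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def make_server(port=8080):
    """Bind to the port the Runtime contract expects."""
    return HTTPServer(("0.0.0.0", port), ContractHandler)
```

Calling `make_server().serve_forever()` would start the stub locally. Keeping `route` separate from the HTTP handler makes the contract easy to check without opening a socket.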
What the platform does: BlueprintLoader reads the agent YAML, resolves all dependencies, and produces a Strands Agent wired to @app.entrypoint. The developer never writes the entrypoint — the platform generates it from configuration.
runtime:
  type: agentcore            # microVM hosting (not Lambda)
  max_iterations: 10
  idle_timeout_minutes: 15
  network_mode: PRIVATE      # VPC-only; use PUBLIC for internet-accessible agents
  protocol: HTTP             # or MCP for hosting an MCP server on Runtime
Key point: Agents are not Lambda functions. Lambda is for tools — short, stateless, fast functions called by agents. Agents are stateful, long-running, and session-oriented. Runtime is the correct host.
Blueprint YAML → BlueprintLoader → AgentCoreApp → Docker → ECR → AgentCore Runtime → microVM
Block 2: Gateway
Gateway is a protocol translator that makes any backend look like an MCP server. Agents speak MCP to Gateway; Gateway speaks whatever the backend requires — Lambda invocations, REST calls, OpenAPI schemas, Smithy APIs, or other MCP servers.
What the platform does: The blueprint’s tools: section declares MCP server names. The platform registers these as Gateway targets and configures the agent to consume them through a single Gateway URL via Strands MCPClient. Domain repos never manage MCP connections directly.
tools:
  - mcp: data-tools-mcp
    tools: [get_record, search_records]
  - mcp: artifact-store-mcp
    tools: [create_artifact, get_artifact]
  - builtin: code_interpreter
  - builtin: browser
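One way to picture what happens to this declaration is as a flattening step: each entry becomes one or more (source, tool) pairs that can then be registered as Gateway targets. The sketch below works on already-parsed data and is not the platform's loader; `resolve_tools` and `BLUEPRINT_TOOLS` are hypothetical names.

```python
# The tools: section above, as it might look after YAML parsing.
BLUEPRINT_TOOLS = [
    {"mcp": "data-tools-mcp", "tools": ["get_record", "search_records"]},
    {"mcp": "artifact-store-mcp", "tools": ["create_artifact", "get_artifact"]},
    {"builtin": "code_interpreter"},
    {"builtin": "browser"},
]


def resolve_tools(tool_decls):
    """Flatten blueprint tool declarations into (source, tool_name) pairs."""
    resolved = []
    for decl in tool_decls:
        if "mcp" in decl:
            # One pair per tool exposed by the named MCP server.
            for name in decl.get("tools", []):
                resolved.append((decl["mcp"], name))
        elif "builtin" in decl:
            resolved.append(("builtin", decl["builtin"]))
        else:
            raise ValueError(f"unknown tool declaration: {decl!r}")
    return resolved
```

For the declaration above, `resolve_tools(BLUEPRINT_TOOLS)` yields six pairs: four MCP-backed tools and two built-ins.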
Data flow:
graph LR
A[Strands Agent] -->|MCP| GW[Gateway]
GW -->|IAM + invoke| L1[Lambda — data tools]
GW -->|MCP| L2[Runtime MCP Server]
GW -->|HTTP| L3[OpenAPI REST API]
GW -->|Cedar| PE[Policy Engine]
The agent sees all tools — local, Lambda-backed, REST-backed, built-in — as equivalent MCP tools. Gateway handles routing, auth, and protocol translation transparently.
Block 3: Identity
Four authentication patterns, all declared in YAML:
| Pattern | Use Case | YAML Key |
|---|---|---|
| Inbound JWT | Validate who can call your agent | identity.authorizer |
| Outbound API Key | Agent needs a third-party API key | identity.credentials[].type: api_key |
| 3-Legged OAuth | Agent needs user’s OAuth token (Google, GitHub, etc.) | identity.credentials[].type: oauth_3lo |
| M2M | Agent-to-agent authentication | identity.credentials[].type: m2m |
What the platform does: Reads the identity block, configures Runtime JWT validation (so invalid tokens are rejected before your code runs), registers credential providers with AgentCore Identity, and injects the appropriate decorators into the generated agent code.
identity:
  authorizer:
    type: cognito_jwt
    user_pool_id: ${COGNITO_POOL_ID}
    client_id: ${COGNITO_CLIENT_ID}
  credentials:
    - name: external-api-key
      type: api_key
      provider: external-apikey-provider
    - name: calendar-access
      type: oauth_3lo
      scopes: ["https://www.googleapis.com/auth/calendar.readonly"]
Credentials are stored in Secrets Manager and fetched at runtime by the Identity service. They never appear in your codebase, environment variables, or container image.
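Before any credential provider is registered, a loader would plausibly check that the identity block is well formed. The following is a sketch of such a check under stated assumptions: the three credential types come from the table above, while `validate_identity` and the rule that `oauth_3lo` needs at least one scope are illustrative choices, not documented platform behaviour.

```python
ALLOWED_CREDENTIAL_TYPES = {"api_key", "oauth_3lo", "m2m"}


def validate_identity(identity):
    """Return a list of problems found in a parsed identity block."""
    problems = []
    authorizer = identity.get("authorizer")
    if authorizer is not None and "type" not in authorizer:
        problems.append("authorizer is missing 'type'")

    seen = set()
    for cred in identity.get("credentials", []):
        name = cred.get("name", "<unnamed>")
        if name in seen:
            problems.append(f"duplicate credential name: {name}")
        seen.add(name)
        ctype = cred.get("type")
        if ctype not in ALLOWED_CREDENTIAL_TYPES:
            problems.append(f"{name}: unsupported type {ctype!r}")
        # Assumption: a 3-legged OAuth credential is useless without scopes.
        if ctype == "oauth_3lo" and not cred.get("scopes"):
            problems.append(f"{name}: oauth_3lo requires at least one scope")
    return problems
```

An empty list means the block passed every check; each string otherwise names one problem, so several can be reported in a single pass.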
Block 4: Memory
AgentCore Memory provides two tiers: short-term (raw conversation events with TTL) and long-term (strategy-extracted knowledge in pgvector with semantic retrieval).
What the platform does: The blueprint declares memory strategies. The platform generates a Strands HookProvider that loads conversation history and semantic memories when the agent initialises, and persists each turn as it is added.
memory:
  # (no mode field -- presence of strategies enables memory automatically)
  strategies:
    - type: USER_PREFERENCE
      name: PreferenceLearner
      namespace: "user/{actorId}/preferences/"
    - type: SEMANTIC
      name: FactExtractor
      namespace: "user/{actorId}/facts/"
    - type: SUMMARY
      name: Summarizer
      namespace: "user/{actorId}/{sessionId}/summaries/"
  event_expiry_days: 30
  short_term_k: 5
Memory tiers:
create_event() --> Short-Term (Raw events, TTL)
                        |
                        +-- async extraction (~30s) --> Long-Term (pgvector)
                                                             |
                                                     retrieve_memories()
                                                     semantic similarity search
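The short-term tier and the namespace templates can be sketched with an in-memory stand-in. This is not the AgentCore Memory client: `ShortTermMemory`, `resolve_namespace`, and the explicit `now` parameter (used so that expiry can be exercised deterministically) are assumptions for illustration, and the long-term extraction step is left out.

```python
import time


def resolve_namespace(template, actor_id, session_id=None):
    """Fill the {actorId} and {sessionId} placeholders of a namespace template."""
    return template.format(actorId=actor_id, sessionId=session_id)


class ShortTermMemory:
    """In-memory stand-in for the raw-event tier with TTL."""

    def __init__(self, event_expiry_days=30, short_term_k=5):
        self.ttl_seconds = event_expiry_days * 86400
        self.k = short_term_k
        self._events = []  # (timestamp, namespace, text), in arrival order

    def create_event(self, namespace, text, now=None):
        stamp = time.time() if now is None else now
        self._events.append((stamp, namespace, text))

    def recent(self, namespace, now=None):
        """Return up to k unexpired events for a namespace, oldest first."""
        now = time.time() if now is None else now
        live = [
            text
            for stamp, ns, text in self._events
            if ns == namespace and now - stamp < self.ttl_seconds
        ]
        return live[-self.k:] if self.k > 0 else []
```

With `short_term_k: 5`, `recent` would hand the agent at most the five newest unexpired turns for a namespace such as `user/u1/facts/`.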
Memory branching for multi-agent pipelines: each sub-agent writes to its own branch; the coordinator reads from all branches. Declared in the workflow blueprint, not coded by the developer.
Block 5: Tools — Code Interpreter and Browser
AgentCore provides two managed built-in tools: Code Interpreter (sandboxed Python and shell execution) and Browser (hosted Chromium with Nova Act). These are AWS-managed services — the platform registers them as Gateway targets when declared.
tools:
  - builtin: code_interpreter
  - builtin: browser
  - mcp: my-domain-mcp
    tools: [domain_tool_1]
The agent sees all tools — local, Gateway-routed, Code Interpreter, Browser — as equivalent. It does not know or care where each tool runs.
Block 6: Observability
AgentCore uses OTEL auto-instrumentation. Every LLM call, tool call, and error is traced end-to-end and sent to CloudWatch GenAI Observability.
What the platform does: The generated Dockerfile includes aws-opentelemetry-distro and wraps the entrypoint with opentelemetry-instrument. The blueprint’s trace_attributes appear on every span. Domain developers get full observability without writing tracing code.
observability:
  enabled: true
  trace_attributes:
    environment: production
    agent.version: "2.1.0"
    tags: ["customer-support", "tier-1"]
  langfuse:
    enabled: true
  audit_log:
    enabled: true
    ttl_days: 1825
A trace for a single invocation captures:
Trace: session_abc / invocation_1
+-- Agent Invocation (2.3s total)
|   +-- LLM Call #1 (0.8s) - tokens: 142 in / 67 out
|   |   +-- Tool Decision: get_record(id="123")
|   +-- Tool Call: get_record (0.1s)
|   +-- LLM Call #2 (0.6s) - tokens: 203 in / 89 out
|   +-- Final Response
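A trace like this lends itself to simple roll-ups. The sketch below totals the token counts and works out how much of the 2.3s invocation is not attributed to any listed child span. The span dictionaries are a hand-written transcription of the example trace, not an export format, and `summarize` is a hypothetical helper.

```python
# Child spans transcribed from the example trace above.
SPANS = [
    {"name": "LLM Call #1", "seconds": 0.8, "tokens_in": 142, "tokens_out": 67},
    {"name": "Tool Call: get_record", "seconds": 0.1},
    {"name": "LLM Call #2", "seconds": 0.6, "tokens_in": 203, "tokens_out": 89},
]


def summarize(total_seconds, spans):
    """Total the tokens and report time not covered by the child spans."""
    accounted = sum(span["seconds"] for span in spans)
    return {
        "tokens_in": sum(span.get("tokens_in", 0) for span in spans),
        "tokens_out": sum(span.get("tokens_out", 0) for span in spans),
        # Rounded to hide floating-point noise in the subtraction.
        "unaccounted_seconds": round(total_seconds - accounted, 3),
    }
```

For the example, the child spans account for 1.5s of the 2.3s total, so 0.8s falls outside the LLM and tool calls.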
Block 7: Evaluation
AgentCore Evaluation reads OTEL traces and scores agent behaviour with LLM-as-judge. 12 built-in evaluators cover response quality, task completion, tool usage accuracy, and safety.
What the platform does: The blueprint declares which evaluators to run and at what sampling rate. The platform configures online evaluation (continuous production monitoring) against the agent’s live OTEL traces.
evaluation:
  online:
    sampling_rate: 100
    evaluators:
      - Builtin.GoalSuccessRate
      - Builtin.Correctness
      - Builtin.ToolSelectionAccuracy
  custom_evaluators:
    - name: policy_compliance
      instructions: "Did the agent follow the domain policy? Score 1.0 if fully compliant..."
      scale: [1.0, 0.5, 0.0]
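Two small pieces of logic sit behind this block: deciding whether a given trace is sampled, and constraining a judge's score to the declared scale. The sketch below assumes `sampling_rate` is a percentage from 0 to 100 and that an off-scale score is snapped to the nearest allowed value. Both are assumptions, and `should_evaluate` and `snap_to_scale` are hypothetical names.

```python
import random


def should_evaluate(sampling_rate, rng=random):
    """Decide whether to evaluate a trace; sampling_rate is read as a percentage."""
    # rng.random() is in [0.0, 1.0), so 100 always samples and 0 never does.
    return rng.random() * 100 < sampling_rate


def snap_to_scale(raw_score, scale):
    """Map a judge's raw score onto the nearest value the scale allows."""
    return min(scale, key=lambda allowed: abs(allowed - raw_score))
```

With the scale `[1.0, 0.5, 0.0]`, a raw judge score of 0.7 would be recorded as 0.5.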
| Category | Count | Built-in Evaluators |
|---|---|---|
| Response quality | 6 | Correctness, Completeness, Faithfulness, Helpfulness, Coherence, Relevance |
| Task completion | 1 | GoalSuccessRate |
| Tool usage | 2 | ToolSelectionAccuracy, ToolParameterAccuracy |
| Safety | 3 | Harmlessness, Harmfulness, Stereotyping |
| Total | 12 | |
Block 8: Policy
Cedar policies on Gateway control who can call which tools with which parameters. The default action is DENY — you explicitly permit what is allowed.
What the platform does: The blueprint declares access rules in a simplified format. The platform generates Cedar policies and attaches them to the Gateway’s policy engine.
policy:
  engine: DomainPolicies
  mode: ENFORCE              # or LOG_ONLY for testing without blocking
  rules:
    - name: write_limit
      allow: create_record
      when: "context.input.content.length <= 50000"
    - name: admin_only_delete
      deny: delete_record
      unless: "principal.scope.contains('group:Admins')"
Policy operates at the Gateway level, not the Runtime level. The agent calls tools normally; policy silently allows or denies based on the end-user’s JWT claims.
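The default-deny behaviour can be sketched in a few lines. This is not Cedar and not the platform's policy generator: conditions are plain Python callables rather than parsed expressions, and `is_permitted` and `RULES` are hypothetical. It does follow Cedar's evaluation order, in which an explicit deny overrides any allow and a call with no matching allow is rejected. As a consequence, a deny-unless rule by itself never grants access, so the sketch adds a third rule that is not in the blueprint above to let admins delete.

```python
RULES = [
    {
        "name": "write_limit",
        "allow": "create_record",
        "when": lambda ctx: len(ctx["content"]) <= 50000,
    },
    {
        "name": "admin_only_delete",
        "deny": "delete_record",
        "unless": lambda ctx: "group:Admins" in ctx["scope"],
    },
    # Hypothetical extra rule: without an explicit allow, default-deny
    # would block delete_record even for admins.
    {"name": "delete_allowed", "allow": "delete_record"},
]


def is_permitted(tool, context, rules):
    """Default-deny check: some allow rule must match and no deny rule may."""
    allowed = False
    for rule in rules:
        if rule.get("deny") == tool:
            exempt = rule.get("unless", lambda ctx: False)(context)
            if not exempt:
                return False  # an explicit deny always wins
        if rule.get("allow") == tool:
            if rule.get("when", lambda ctx: True)(context):
                allowed = True
    return allowed
```

A tool with no rule at all, such as `get_record` here, is denied, which is why every tool an agent needs must be explicitly permitted.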
Block 9: Strands Integration
Strands is the primary agent framework because it has the deepest AgentCore integration: native model providers for Bedrock, Anthropic, LiteLLM, and Vertex AI; HookProvider for Memory; MCPClient for Gateway; A2AServer for agent-to-agent communication; and trace_attributes for OTEL.
What the platform does: BlueprintLoader produces a fully wired Strands Agent with:
- The correct model class (BedrockModel, AnthropicModel, LiteLLMModel, or GeminiModel) configured from the model: block; provider selection is driven by model.provider in the blueprint
- Gateway tools via MCPClient from the tools: block
- Memory HookProvider from the memory: block
- Identity decorators from the identity: block
- OTEL trace_attributes from the observability: block
- All wrapped in @app.entrypoint for AgentCore Runtime
The developer declares all of this in YAML. The platform assembles it. The inference provider is a one-line change in the blueprint — no code change required to switch from Bedrock to LiteLLM or Anthropic. See Inference Providers for the full provider guide.
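The provider switch can be pictured as a lookup from `model.provider` to one of the four model classes named above. The sketch returns class names as strings so that it runs without the Strands package installed. The provider keys (`bedrock`, `anthropic`, `litellm`, `vertex_ai`) and the function `select_model_class` are assumptions; the real blueprint schema may spell them differently.

```python
# Assumed provider keys mapped to the model class names listed above.
MODEL_CLASS_BY_PROVIDER = {
    "bedrock": "BedrockModel",
    "anthropic": "AnthropicModel",
    "litellm": "LiteLLMModel",
    "vertex_ai": "GeminiModel",
}


def select_model_class(model_block):
    """Pick the model class name for a parsed model: block."""
    provider = model_block.get("provider")
    try:
        return MODEL_CLASS_BY_PROVIDER[provider]
    except KeyError:
        known = ", ".join(sorted(MODEL_CLASS_BY_PROVIDER))
        raise ValueError(f"unknown provider {provider!r}; expected one of: {known}")
```

Changing `provider` from `bedrock` to `litellm` in the blueprint changes the selected class and nothing else, which is the one-line switch described above.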
Block 10: Agent-to-Agent (A2A)
A2A lets agents discover and call each other via a standardised protocol. Each agent publishes an agent card at /.well-known/agent.json. The A2A port is configured via runtime.a2a_port in the blueprint or the A2A_PORT environment variable.
What the platform does: When a blueprint declares multi_agent:, the platform generates an A2AServer for the agent, registers M2M credential providers for cross-agent auth, and wraps remote agent calls as Strands @tool functions. The coordinator agent sees specialist agents as regular tools.
multi_agent:
  type: graph
  role: coordinator
  nodes:
    - agent_ref: search-specialist
      a2a_url: ${SEARCH_AGENT_URL}
    - agent_ref: analysis-specialist
      a2a_url: ${ANALYSIS_AGENT_URL}
Multi-agent topology:
graph TD
C[Coordinator Agent<br/>Runtime A] -->|A2A| S1[Search Specialist<br/>Runtime B]
C -->|A2A| S2[Analysis Specialist<br/>Runtime C]
S1 -->|MCP| GW[Gateway]
S2 -->|MCP| GW
Each specialist can use a different model, different tools, or even be deployed in a different AWS account. A2A abstracts the transport; coordinators see specialists as opaque tools.
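The "specialists as tools" idea can be sketched without any A2A library. Below, `agent_card_url` derives the card location from a node's `a2a_url` (assuming the well-known path is resolved relative to that URL), and `make_remote_tool` wraps a node as a plain callable. The `send` argument is a hypothetical transport function standing in for the real A2A client, and both helper names are illustrative.

```python
from urllib.parse import urljoin


def agent_card_url(a2a_url):
    """Build the agent card URL for a specialist's base URL."""
    # Normalise to exactly one trailing slash so urljoin appends the path.
    return urljoin(a2a_url.rstrip("/") + "/", ".well-known/agent.json")


def make_remote_tool(node, send):
    """Wrap one multi_agent node as a callable the coordinator can use.

    `send(url, task)` is a hypothetical transport; a real implementation
    would speak A2A and attach M2M credentials.
    """
    def call(task):
        return send(node["a2a_url"], task)

    # Tool names are usually identifiers, so hyphens become underscores.
    call.__name__ = node["agent_ref"].replace("-", "_")
    return call
```

From the coordinator's side, calling `search_specialist("find recent orders")` would look no different from calling a local tool.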
Block 11: Infrastructure as Code
Terraform modules are the primary consumption unit. Domain repos use module "platform" { source = "..." } to deploy the entire stack.
# Domain repo: infra/main.tf
module "platform" {
  source        = "git::https://github.com/org/aws-agent-platform//modules/platform?ref=v1.0.0"
  environment   = "production"
  vpc_id        = module.network.vpc_id
  agents_config = "./blueprints/agents/"
}

module "domain_agents" {
  source           = "git::https://github.com/org/aws-agent-platform//modules/agents?ref=v1.0.0"
  platform_outputs = module.platform.outputs
  blueprints_dir   = "./blueprints/"
}
The platform infrastructure (Gateway, Memory, Identity, Observability, data stores) is deployed once per account. Every domain repo sharing the same account shares the same platform stack.
Block 12: Blueprints
Blueprints are the platform’s core innovation: YAML files that declare everything an agent needs, which the platform then assembles into a running agent.
Three blueprint types:
| Type | Declares | Produces |
|---|---|---|
| Agent | Model, tools, prompt, memory, identity, policy, observability | AgentCore Runtime container with full Strands agent |
| Strategy | Trigger conditions, parameter logic, required inputs | Evaluated by a strategy-evaluation agent |
| Workflow | Multi-agent DAG with parallel branches, choice routing, retry/catch | Step Functions state machine |
This is the platform’s differentiator. Everything else (Gateway, Memory, Identity, Policy) is AWS-managed infrastructure. The blueprint layer turns “12 separate AWS services” into “one YAML file per agent.”
Execution Mode Isolation
EXECUTION_MODE is a first-class concept that affects every building block simultaneously. It controls which prompts are resolved, which data sources are queried, and which execution targets receive tool calls.
| Mode | Prompts | Data Sources | Execution Targets |
|---|---|---|---|
| simulation | Simulation-mode variants | Synthetic / sandboxed data | Dry-run handlers |
| staging | Staging variants | Staging data stores | Staging backends |
| production | Production variants | Live data | Live backends |
Every prompt resolution, Gateway target call, and Memory namespace is mode-aware. Switching modes switches the entire agent’s behaviour end-to-end — from the PromptRegistry returning different prompt versions to the Gateway routing to different backend environments.
This means you can run a full agent pipeline in simulation mode against synthetic data with dry-run handlers, then promote to staging for integration testing, and finally to production — without changing any agent code or blueprint YAML. The mode is set via the EXECUTION_MODE environment variable at deployment time.
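Mode-aware resolution can be sketched as a single lookup keyed by the environment variable. The table mirrors the three rows above; `resolve_mode`, `prompt_key`, the `name@mode` key format, and the choice to fail when the variable is unset or unknown are assumptions for illustration, not documented platform behaviour.

```python
import os

# One entry per row of the mode table above.
MODES = {
    "simulation": {"prompts": "simulation", "data": "synthetic", "targets": "dry-run"},
    "staging": {"prompts": "staging", "data": "staging", "targets": "staging"},
    "production": {"prompts": "production", "data": "live", "targets": "live"},
}


def resolve_mode(env=None):
    """Read EXECUTION_MODE and return (mode, settings); fail on anything else."""
    env = os.environ if env is None else env
    mode = env.get("EXECUTION_MODE")
    if mode not in MODES:
        raise ValueError(
            f"EXECUTION_MODE must be one of {sorted(MODES)}, got {mode!r}"
        )
    return mode, MODES[mode]


def prompt_key(name, mode):
    """Hypothetical registry key that selects the mode-specific prompt variant."""
    return f"{name}@{mode}"
```

Failing on an unset or misspelled mode is a deliberate choice in this sketch: silently defaulting to production would defeat the isolation the modes are meant to provide.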
Next Steps
- Platform vs. Domain — responsibility matrix and directory structure
- How It Works — end-to-end flows with Mermaid diagrams
- First Agent — build all 12 blocks into a running agent