Workflow Blueprint

A workflow blueprint declares a multi-agent pipeline as a YAML file. The modules/workflows Terraform module reads these files and generates AWS Step Functions state machines — one state machine per blueprint. Workflow blueprints support sequential agent steps, parallel branches, choice routing, callback-pattern wait states, and EventBridge triggers.

Required top-level fields: id, name, version. All other top-level fields have defaults, listed in the Schema Reference. Within states, each entry also requires its own id.

Workflow blueprint files live under blueprints/workflows/*.yaml in a domain repo.


Top-Level Identity Fields

id: research-synthesis-pipeline    # Unique workflow identifier. Required. Kebab-case by convention.
name: Research and Synthesis       # Human-readable name for dashboards. Required.
version: "1.0.0"                   # Semantic version. Required.
description: |                     # Optional description.
  Parallel data collection followed by synthesis and
  confidence-based routing to the output handler.
timeout_minutes: 60                # Overall workflow timeout. Default: 60. Must be > 0.
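The constraints above (three required fields, a positive timeout) can be checked before Terraform runs. The following is a minimal sketch, assuming the blueprint has already been parsed into a dict. The function name validate_identity and the error messages are hypothetical and not part of the platform.

```python
REQUIRED_FIELDS = ("id", "name", "version")
DEFAULT_TIMEOUT_MINUTES = 60  # default stated in the blueprint reference


def validate_identity(blueprint: dict) -> list:
    """Return a list of problems found in the top-level identity fields."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = blueprint.get(field)
        # An unquoted version such as 1.0 parses as a float, so insist on strings.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{field} is required and must be a non-empty string")
    timeout = blueprint.get("timeout_minutes", DEFAULT_TIMEOUT_MINUTES)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(timeout, bool) or not isinstance(timeout, int) or timeout <= 0:
        problems.append("timeout_minutes must be an integer > 0")
    return problems
```

An empty list means the identity fields look valid; anything else can be printed and used to fail a pre-commit or CI check.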

trigger: Block

Configures how the workflow is started. Three trigger types are supported.

# Schedule trigger: runs on an EventBridge cron or rate expression.
trigger:
  type: schedule
  schedule: "cron(0 8 * * ? *)"   # EventBridge schedule expression.
  timezone: "UTC"                  # Optional IANA timezone.

# Event trigger: reacts to an EventBridge event pattern.
trigger:
  type: event
  event_pattern:
    source: ["my.application"]
    detail-type: ["DataReady"]

# Manual trigger: Step Functions StartExecution or API call only.
trigger:
  type: manual

For scheduled workflows, the platform creates an aws_cloudwatch_event_rule and wires an aws_cloudwatch_event_target that injects trigger, scheduled_time, workflow, and environment into the execution input.
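A trigger block can be sanity-checked along the same lines. This is a sketch only, assuming a parsed trigger dict. The rules that a schedule trigger needs an expression and an event trigger needs a pattern are assumptions (the schema lists both fields as optional with a null default), and check_trigger is a hypothetical helper.

```python
import re

TRIGGER_TYPES = {"schedule", "event", "manual"}
# EventBridge schedule expressions take the form cron(...) or rate(...).
SCHEDULE_EXPR = re.compile(r"^(cron|rate)\(.+\)$")


def check_trigger(trigger: dict) -> list:
    """Return a list of problems found in a trigger block."""
    problems = []
    trigger_type = trigger.get("type", "schedule")  # schema default is schedule
    if trigger_type not in TRIGGER_TYPES:
        problems.append(f"unknown trigger type: {trigger_type}")
    elif trigger_type == "schedule":
        expr = trigger.get("schedule")
        # Assumption: a schedule trigger is only useful with an expression.
        if not isinstance(expr, str) or not SCHEDULE_EXPR.match(expr):
            problems.append("schedule must be a cron(...) or rate(...) expression")
    elif trigger_type == "event":
        # Assumption: an event trigger needs a non-empty pattern.
        if not trigger.get("event_pattern"):
            problems.append("event_pattern is required for event triggers")
    return problems
```

The regular expression only checks the outer cron(...) or rate(...) shape; it does not validate the fields inside the parentheses.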


states: Block

The states: list defines the workflow DAG. Each entry is one of seven state types:

| Type | Description |
| --- | --- |
| task | Invokes an agent via AgentCore Runtime |
| choice | Routes to different states based on a condition |
| parallel | Runs multiple branches concurrently |
| wait | Pauses for a fixed duration |
| wait_for_token | Pauses until an external callback sends a task token (Step Functions callback pattern) |
| succeed | Terminal success state |
| fail | Terminal error state |

Task State (Sequential Agent Step)

states:
  - id: ResearchStep              # State name. PascalCase by convention.
    type: task
    agent_ref: researcher         # Agent blueprint ID. Resolved to Runtime ARN by Terraform.
    next: SynthesisStep           # Next state ID. Omit for terminal states.
    prompt: "$.input.query"       # JSONPath to the prompt field in execution input.
    retry_max: 3                  # Maximum retry attempts (>= 1). Shorthand for simple retry.
    input_mapping:                # Optional. Map state input keys to agent payload keys.
      query: "$.input.query"
      context: "$.results.previous"

Terraform resolves the agent’s Runtime ARN from var.agent_runtime_arns (passed from module.agents.runtime_arns) and generates a Step Functions SDK integration task using arn:aws:states:::aws-sdk:bedrockagentcore:invokeAgentRuntime. Results are written to $.results.<agent_id> in the execution state.
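The lookup described above can be pictured in a few lines of Python. This is an illustration of the resolution step, not the module's Terraform code: resolve_agent is a hypothetical name, and the ARN in the usage note is a placeholder.

```python
SDK_RESOURCE = "arn:aws:states:::aws-sdk:bedrockagentcore:invokeAgentRuntime"


def resolve_agent(state: dict, agent_runtime_arns: dict) -> dict:
    """Map a task state's agent_ref to its Runtime ARN and result path."""
    agent_id = state["agent_ref"]
    if agent_id not in agent_runtime_arns:
        # A typo in agent_ref is easier to diagnose with an explicit message.
        raise KeyError(f"state {state['id']}: unknown agent_ref '{agent_id}'")
    return {
        "resource": SDK_RESOURCE,
        "runtime_arn": agent_runtime_arns[agent_id],
        "result_path": f"$.results.{agent_id}",
    }
```

For example, resolve_agent({"id": "ResearchStep", "agent_ref": "researcher"}, {"researcher": "arn:placeholder"}) yields a result path of $.results.researcher, which is the path a later choice state would read from.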

Parallel State

Runs multiple agents simultaneously. All branches must complete before the workflow continues. Results are merged into $.parallel_results.

  - id: ParallelAnalysis
    type: parallel
    branches:
      - states:
          - id: AnalystA
            type: task
            agent_ref: analyst-a
            prompt: "$.input.query"
      - states:
          - id: AnalystB
            type: task
            agent_ref: analyst-b
            prompt: "$.input.query"
    next: SynthesisStep

Choice State

Routes to different next states based on the execution context.

  - id: RouteByConfidence
    type: choice
    choices:
      - condition:
          path: "$.results.researcher.confidence"
          op: ">="
          value: 0.8
        next: HighConfidenceHandler
      - condition:
          path: "$.results.researcher.confidence"
          op: ">="
          value: 0.5
        next: MediumConfidenceHandler
    default: LowConfidenceHandler  # Fallback if no condition matches.
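The routing above relies on rule order: a confidence of 0.9 satisfies both conditions, so the first matching rule has to win, as it does in Step Functions Choice states. A minimal sketch of that evaluation, assuming simple dotted paths. Only the ">=" operator appears in this document; the other operators in the table and the helper names are assumptions.

```python
import operator

# Only ">=" is confirmed by the examples; the rest are assumed for illustration.
OPS = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
    "==": operator.eq,
    "!=": operator.ne,
}


def lookup(data: dict, path: str):
    """Resolve a simple dotted path such as "$.results.researcher.confidence"."""
    node = data
    for key in path.lstrip("$").strip(".").split("."):
        node = node[key]
    return node


def route(state: dict, execution: dict) -> str:
    """Return the next state ID: first matching rule, else the default."""
    for rule in state["choices"]:
        cond = rule["condition"]
        if OPS[cond["op"]](lookup(execution, cond["path"]), cond["value"]):
            return rule["next"]
    return state["default"]
```

Because evaluation stops at the first match, thresholds should be listed from most to least restrictive, as in the example above.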

Succeed State

Terminal state that records explicit success.

  - id: PipelineComplete
    type: succeed

Fail State

Terminal state that records an error.

  - id: ValidationFailed
    type: fail
    error: "ValidationError"
    cause: "Input did not pass validation checks."

Wait for Token State (Callback Pattern)

Pauses the workflow until an external system calls SendTaskSuccess or SendTaskFailure with the task token. This implements the Step Functions callback pattern.

  - id: AwaitHumanReview
    type: wait_for_token
    heartbeat_seconds: 3600        # Timeout if no heartbeat received within this interval.
    next: PostReviewStep
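On the reviewer's side, resuming the workflow comes down to one Step Functions API call. The sketch below assumes the external system has already received the task token (how the token is delivered is not covered here) and is handed a boto3 Step Functions client. The function name, the output payload shape, and the ReviewRejected error code are hypothetical.

```python
import json


def complete_review(sfn_client, task_token: str, approved: bool, notes: str = "") -> None:
    """Resume or fail a workflow paused in a wait_for_token state."""
    if approved:
        # SendTaskSuccess expects the output as a JSON string.
        sfn_client.send_task_success(
            taskToken=task_token,
            output=json.dumps({"approved": True, "notes": notes}),
        )
    else:
        sfn_client.send_task_failure(
            taskToken=task_token,
            error="ReviewRejected",
            cause=notes or "Reviewer rejected the result.",
        )
```

With boto3 installed, a call might look like complete_review(boto3.client("stepfunctions"), token, approved=True). For reviews that may outlast heartbeat_seconds, the same client's send_task_heartbeat(taskToken=...) resets the heartbeat timer.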

retry: and catch: (per state)

Each task state supports standard Step Functions retry and catch policies alongside the retry_max shorthand:

  - id: CriticalStep
    type: task
    agent_ref: critical-agent
    prompt: "$.input"
    retry:                         # Full retry config — list of retry policies.
      - ErrorEquals: ["States.TaskFailed"]
        IntervalSeconds: 5
        MaxAttempts: 3
        BackoffRate: 2.0
    catch:                         # Catch config — list of catch policies.
      - ErrorEquals: ["States.ALL"]
        Next: ErrorHandler
    next: NextStep
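To see what the retry policy above means in practice: Step Functions waits IntervalSeconds before the first retry and multiplies the wait by BackoffRate for each later attempt. A small sketch of that arithmetic, ignoring optional settings such as MaxDelaySeconds and jitter; retry_delays is a hypothetical helper.

```python
def retry_delays(interval_seconds: float, max_attempts: int, backoff_rate: float) -> list:
    """Return the wait in seconds before each retry attempt."""
    return [interval_seconds * backoff_rate ** attempt for attempt in range(max_attempts)]


# The CriticalStep policy: IntervalSeconds 5, MaxAttempts 3, BackoffRate 2.0.
print(retry_delays(5, 3, 2.0))  # [5.0, 10.0, 20.0]
```

The CriticalStep policy therefore spends up to 35 seconds waiting across its three retries before the catch policy routes the error to ErrorHandler.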

memory_branching: Block

Configures AgentCore Memory branching for multi-agent workflows. When enabled, each agent step operates on an isolated memory branch.

memory_branching:
  enabled: true                    # Enable per-state memory branches. Default: false.
  merge_strategy: union            # union | latest | coordinator_wins | none. Default: union.
  branch_namespace: "{sessionId}/branches/{stateId}"  # Namespace template.

| Merge Strategy | Behaviour |
| --- | --- |
| union | Merge all branch memories into the main namespace |
| latest | Keep only the most recent branch’s memories |
| coordinator_wins | Coordinator’s memories take precedence on conflicts |
| none | No merge; branches remain isolated |
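The branch_namespace template uses {sessionId} and {stateId} placeholders. Assuming these are filled by plain substitution (the platform's actual rendering logic is not shown in this document), the resulting namespace for a given state looks like this; render_namespace is a hypothetical helper.

```python
def render_namespace(template: str, session_id: str, state_id: str) -> str:
    """Fill the {sessionId} and {stateId} placeholders in a namespace template."""
    return template.format(sessionId=session_id, stateId=state_id)


print(render_namespace("{sessionId}/branches/{stateId}", "sess-42", "ResearchStep"))
# sess-42/branches/ResearchStep
```

Including {stateId} in the template is what keeps branches apart: two states in the same session resolve to different namespaces.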

Complete Multi-Agent Pipeline Example

id: research-synthesis-pipeline
name: Research and Synthesis Pipeline
version: "1.1.0"
description: |
  Parallel data collection, followed by synthesis and
  confidence-based routing to a final output handler.
timeout_minutes: 45

trigger:
  type: manual

states:
  # Step 1: Validate the incoming request.
  - id: ValidateRequest
    type: task
    agent_ref: validator
    next: ParallelCollection
    retry_max: 2
    prompt: "$.input"

  # Step 2: Run three collection agents in parallel.
  - id: ParallelCollection
    type: parallel
    branches:
      - states:
          - id: WebResearch
            type: task
            agent_ref: web-researcher
            prompt: "$.input.query"
      - states:
          - id: DatabaseResearch
            type: task
            agent_ref: database-researcher
            prompt: "$.input.query"
      - states:
          - id: ArchiveResearch
            type: task
            agent_ref: archive-researcher
            prompt: "$.input.query"
    next: SynthesisStep

  # Step 3: Synthesise parallel results.
  - id: SynthesisStep
    type: task
    agent_ref: synthesizer
    next: RouteByConfidence
    retry_max: 3
    prompt: "$.parallel_results"

  # Step 4: Route based on synthesiser confidence.
  - id: RouteByConfidence
    type: choice
    choices:
      - condition:
          path: "$.results.synthesizer.confidence"
          op: ">="
          value: 0.85
        next: HighQualityOutput
      - condition:
          path: "$.results.synthesizer.confidence"
          op: ">="
          value: 0.55
        next: ReviewRequired
    default: LowQualityFallback

  # Step 5a: High-quality path — store result and finish.
  - id: HighQualityOutput
    type: task
    agent_ref: output-formatter
    prompt: "$.results.synthesizer"
    next: PipelineComplete

  # Step 5b: Route to human review via callback pattern.
  - id: ReviewRequired
    type: wait_for_token
    heartbeat_seconds: 86400      # 24-hour review window.
    next: PostReviewStep

  - id: PostReviewStep
    type: task
    agent_ref: review-finalizer
    prompt: "$.results.synthesizer"
    next: PipelineComplete

  # Step 5c: Low-quality fallback.
  - id: LowQualityFallback
    type: task
    agent_ref: fallback-handler
    prompt: "$.results.synthesizer"
    next: PipelineComplete

  - id: PipelineComplete
    type: succeed

memory_branching:
  enabled: true
  merge_strategy: coordinator_wins
  branch_namespace: "{sessionId}/pipeline/{stateId}"
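In a pipeline of this size, a mistyped next or default target is easy to miss. The following sketch checks that every transition in the top-level states list points at a defined state. It assumes the blueprint has been parsed into Python dicts and does not descend into parallel branches; the helper names are hypothetical.

```python
def iter_targets(state: dict):
    """Yield every state ID that this state can transition to."""
    if state.get("next"):
        yield state["next"]
    if state.get("default"):
        yield state["default"]
    for rule in state.get("choices") or []:
        yield rule["next"]


def find_dangling_refs(states: list) -> list:
    """Return the sorted transition targets that match no state ID."""
    defined = {s["id"] for s in states}
    return sorted({t for s in states for t in iter_targets(s) if t not in defined})
```

Run against the pipeline above, the check should return an empty list, since every next and default target (ParallelCollection, SynthesisStep, RouteByConfidence, and so on) is defined.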

How Workflows Map to Step Functions

The Terraform module translates each blueprint into an Amazon States Language (ASL) definition and provisions the supporting resources around it:

flowchart TD
    YAML["workflow.yaml"] --> TF["modules/workflows\nlocals.tf: yamldecode()"]
    TF --> SFN["aws_sfn_state_machine\n(one per blueprint)"]
    TF --> LogGroup["aws_cloudwatch_log_group\n/aws/stepfunctions/<id>"]
    TF --> IAM["aws_iam_role (sfn)\nbedrock-agentcore:InvokeAgentRuntime"]
    TF -->|"trigger.type = schedule"| EventRule["aws_cloudwatch_event_rule"]
    EventRule --> EventTarget["aws_cloudwatch_event_target"]
    EventTarget --> SFN
    SFN --> AgentRuntime["AgentCore Runtimes\n(from modules/agents outputs)"]

The module uses the AWS SDK integration pattern (arn:aws:states:::aws-sdk:bedrockagentcore:invokeAgentRuntime) because no native Step Functions optimised integration exists for AgentCore Runtime at this time.


Cross-Module Wiring

The workflows module receives agent Runtime ARNs from the agents module:

module "workflows" {
  source = "git::https://github.com/The-Cloud-Clockwork/tcc-aws-agent-platform.git//modules/workflows"

  workflow_dir       = "./blueprints/workflows"
  agent_runtime_arns = module.agents.runtime_arns  # map of agent_id -> Runtime ARN
  environment        = var.environment
  resource_prefix    = var.resource_prefix
  aws_region         = var.aws_region
  ssm_root_path      = var.ssm_root_path
}

See Deployment Patterns for the full three-module composition.


Schema Reference

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| id | str | Yes | | Unique workflow identifier (kebab-case recommended) |
| name | str | Yes | | Human-readable display name |
| version | str | Yes | | Semantic version string |
| description | str | No | "" | Workflow description |
| trigger | TriggerConfig | No | type=schedule | Trigger configuration |
| trigger.type | str | No | schedule | One of schedule, event, manual |
| trigger.schedule | str | No | null | EventBridge cron or rate expression |
| trigger.timezone | str | No | null | IANA timezone for scheduled triggers |
| trigger.event_pattern | dict | No | null | EventBridge event pattern for event triggers |
| timeout_minutes | int | No | 60 | Overall workflow timeout. Must be > 0. |
| states | list[WorkflowState] | No | [] | State machine definition |
| states[].id | str | Yes | | State name (unique within the workflow) |
| states[].type | str | No | task | One of task, choice, parallel, wait, wait_for_token, succeed, fail |
| states[].agent_ref | str | No | null | Agent blueprint ID (task states) |
| states[].prompt | str | No | null | JSONPath to prompt field in execution input |
| states[].retry_max | int | No | null | Maximum retry attempts shorthand (>= 1) |
| states[].heartbeat_seconds | int | No | null | Heartbeat timeout for wait_for_token (> 0) |
| states[].next | str | No | null | Next state ID |
| states[].choices | list[ChoiceRule] | No | null | Choice rules (choice states) |
| states[].default | str | No | null | Default state for unmatched choices |
| states[].branches | list[dict] | No | null | Sub-workflow definitions (parallel states) |
| states[].retry | list[dict] | No | null | Full Step Functions retry policy list |
| states[].catch | list[dict] | No | null | Step Functions catch policy list |
| states[].error | str | No | null | Error code (fail states) |
| states[].cause | str | No | null | Error description (fail states) |
| memory_branching | MemoryBranchConfig | No | null | Per-state memory branch isolation |