Conversation Persistence

Agenticore’s /v1/chat/completions endpoint supports sticky multi-turn sessions. The same Claude session (with KV-cache, prompt cache, and full conversation history) is reused across turns of the same logical conversation.

How It Works

Every request passes through a four-tier conversation key resolver that infers the conversation identity from the request. Tiers are tried in order, and the first one that yields a key wins:

Tier	Source	Mechanism
1 — Header	`X-Conversation-Id`, `X-LibreChat-Conversation-Id`, `X-OpenWebUI-Chat-Id`	Explicit, highest priority
2 — Body	`body.metadata.conversation_id` or `body.user` (UUID-shaped only)	SDK / A2A convention
3 — Content hash	`sha256(system_prompt + first_user_message)[:16]`	Zero-config fallback
4 — Ephemeral	`uuid4()`	Stateless, one-shot

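The composed storage key is conv:{agent_id}:{user_hint}:{key}. Scoping by agent and user prevents cross-contamination: the same conversation key sent to a different agent, or by a different user, maps to a separate session.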
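The tier order above can be sketched in a few lines of Python. This is a minimal illustration, not agenticore's actual code: the function names, the precedence among the three headers, the assumption that message content is a plain string, and the exact hash input (UTF-8 concatenation with no separator, truncated hex digest) are all assumptions.

```python
import hashlib
import re
import uuid

# Header names come from the tier table; their relative order is an assumption.
CONVERSATION_HEADERS = (
    "x-conversation-id",
    "x-librechat-conversation-id",
    "x-openwebui-chat-id",
)
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def resolve_conversation_key(headers, body, hash_fallback=True):
    """Return (key, tier). Hypothetical helper, not agenticore's API."""
    lowered = {name.lower(): value for name, value in headers.items()}

    # Tier 1: an explicit header always wins.
    for name in CONVERSATION_HEADERS:
        if lowered.get(name):
            return lowered[name], 1

    # Tier 2: body.metadata.conversation_id, or body.user if UUID-shaped.
    conv_id = (body.get("metadata") or {}).get("conversation_id")
    if conv_id:
        return conv_id, 2
    user = body.get("user")
    if isinstance(user, str) and UUID_RE.match(user):
        return user, 2

    # Tier 3: content hash of system prompt + first user message.
    if hash_fallback:
        messages = body.get("messages", [])
        system = next((m["content"] for m in messages if m["role"] == "system"), "")
        first_user = next((m["content"] for m in messages if m["role"] == "user"), None)
        if first_user is not None:
            digest = hashlib.sha256((system + first_user).encode("utf-8")).hexdigest()
            return digest[:16], 3

    # Tier 4: nothing to key on, so the request is treated as one-shot.
    return str(uuid.uuid4()), 4


def storage_key(agent_id, user_hint, key):
    """Compose the scoped key in the conv:{agent_id}:{user_hint}:{key} format."""
    return f"conv:{agent_id}:{user_hint}:{key}"
```

A request carrying only `X-Conversation-Id: abc` resolves at Tier 1, while the same request without the header falls through to the content hash, or to a fresh UUID when the hash fallback is off.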

First Turn vs Resume

First turn (no existing session): agenticore generates a Claude session UUID, passes --session-id <uuid> to create a persistent session, and stores the conversation-to-session mapping in Redis, with a file fallback.
Subsequent turns (session exists): agenticore passes --resume <uuid> and sends only the last user message, because Claude already has the prior context in the session's JSONL file.
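The branch between the two cases can be illustrated as follows. This is a sketch under stated assumptions: `plan_turn` is a hypothetical name, a plain dict stands in for the Redis and file-backed mapping, and forwarding the messages unchanged on the first turn is a guess, since only the resume behavior is described above.

```python
import uuid


def plan_turn(sessions, storage_key, messages):
    """Return (cli_args, messages_to_send) for one request.

    `sessions` is a dict standing in for the real session store (assumption).
    """
    session_id = sessions.get(storage_key)

    if session_id is None:
        # First turn: mint a session UUID and remember the mapping.
        session_id = str(uuid.uuid4())
        sessions[storage_key] = session_id
        return ["--session-id", session_id], messages

    # Later turns: resume the stored session and send only the newest
    # user message, since the earlier history is already persisted.
    last_user = next((m for m in reversed(messages) if m["role"] == "user"), None)
    return ["--resume", session_id], [last_user] if last_user else []


sessions = {}
first = [{"role": "user", "content": "my name is colt"}]
args, sent = plan_turn(sessions, "conv:agent:user:key", first)
print(args[0], len(sent))   # first turn uses --session-id

replayed = first + [
    {"role": "assistant", "content": "Nice to meet you."},
    {"role": "user", "content": "what is my name?"},
]
args, sent = plan_turn(sessions, "conv:agent:user:key", replayed)
print(args[0], sent[0]["content"])  # resume sends only the last user message
```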

Client Configuration

LibreChat

Add headers to your custom endpoint in librechat.yaml:

endpoints:
  custom:
    - name: "LiteLLM"
      headers:
        X-Conversation-Id: "{{conversationId}}"
        X-User-Id: "{{user}}"

Raw curl

CONV=$(uuidgen)
# Turn 1
curl -N -H "X-Conversation-Id: $CONV" \
  -H "Content-Type: application/json" \
  -d '{"model":"sonnet","stream":true,"messages":[{"role":"user","content":"my name is colt"}]}' \
  http://localhost:8200/v1/chat/completions
# Turn 2: same conversation ID, so Claude remembers the first turn
curl -N -H "X-Conversation-Id: $CONV" \
  -H "Content-Type: application/json" \
  -d '{"model":"sonnet","stream":true,"messages":[{"role":"user","content":"what is my name?"}]}' \
  http://localhost:8200/v1/chat/completions

Agent-to-Agent

See A2A Conventions.

No Header (Tier 3 Fallback)

If the client sends no conversation header, agenticore hashes system_prompt + first_user_message to derive a stable key. Clients that replay full messages[] history each turn (standard OpenAI-compat behavior) will hit the same session automatically.

Collision risk: two different threads with an identical system prompt and an identical first user message resolve to the same key and will merge into one session. Disable the fallback with AGENTICORE_CONV_HASH_FALLBACK=false.
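Both properties, stability under replayed history and the collision case, follow directly from hashing only the system prompt and the first user message. The sketch below demonstrates them; `content_hash_key` is a hypothetical helper, and the string-only message content and exact byte layout of the hash input are assumptions.

```python
import hashlib


def content_hash_key(messages):
    """Hash system prompt + first user message (assumed layout), keep 16 hex chars."""
    system = next((m["content"] for m in messages if m["role"] == "system"), "")
    first_user = next((m["content"] for m in messages if m["role"] == "user"), "")
    return hashlib.sha256((system + first_user).encode("utf-8")).hexdigest()[:16]


system = {"role": "system", "content": "You are a helpful assistant."}
turn_1 = [system, {"role": "user", "content": "hello"}]

# A client replaying full history appends to the list; the first user
# message is unchanged, so the key is too.
turn_2 = turn_1 + [
    {"role": "assistant", "content": "Hi! How can I help?"},
    {"role": "user", "content": "summarize this repo"},
]

# An unrelated thread that happens to open the same way collides.
other_thread = [system, {"role": "user", "content": "hello"}]

# A thread with a different opening message does not.
distinct_thread = [system, {"role": "user", "content": "hello there"}]

print(content_hash_key(turn_1) == content_hash_key(turn_2))           # True
print(content_hash_key(turn_1) == content_hash_key(other_thread))     # True
print(content_hash_key(turn_1) == content_hash_key(distinct_thread))  # False
```

Because the storage key is also scoped by agent and user, a collision of this kind only merges threads that share both.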

Configuration

Env Var	Default	Description
`AGENTICORE_CONV_HASH_FALLBACK`	`true`	Enable/disable Tier 3 content-hash fallback
`AGENT_MODE_SESSION_TTL`	`86400`	TTL in seconds for session mappings in Redis
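
For reference, reading these two variables with their documented defaults might look like the sketch below. The function name, the returned dict shape, and the set of strings treated as "false" are assumptions; only the variable names and defaults come from the table. The default TTL of 86400 seconds is 24 hours.

```python
import os


def load_conversation_settings(env=os.environ):
    """Parse the two settings from an environment mapping (hypothetical helper)."""
    raw = env.get("AGENTICORE_CONV_HASH_FALLBACK", "true")
    # Which spellings count as "off" is an assumption of this sketch.
    hash_fallback = raw.strip().lower() not in ("false", "0", "no", "off")
    ttl_seconds = int(env.get("AGENT_MODE_SESSION_TTL", "86400"))
    return {"hash_fallback": hash_fallback, "session_ttl_seconds": ttl_seconds}


print(load_conversation_settings({}))
print(load_conversation_settings({"AGENTICORE_CONV_HASH_FALLBACK": "false",
                                  "AGENT_MODE_SESSION_TTL": "3600"}))
```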

Storage

Redis: agenticore:session:conv:{agent_id}:{user}:{key}, stored as a hash with a TTL (set by AGENT_MODE_SESSION_TTL)
File fallback: ~/.agenticore/agent_sessions.json
Claude JSONL: ~/.claude/projects/<encoded-cwd>/<session-uuid>.jsonl (on the agent pod’s PVC)

Observability

Every request logs: conv_key=... tier=... agent=... stateless=...

Filter logs by tier to see which clients are using which resolution strategy:

kubectl logs anton-agent-0 -c agenticore | grep 'tier='
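
To go from raw grep output to a per-tier count, a small tally helper is enough. The sketch below assumes only the `tier=...` token from the log format above; the sample lines and their values are made up for illustration, and `count_tiers` is a hypothetical name.

```python
import re
from collections import Counter

# Matches the tier=<value> token; the value may be a number or a name.
TIER_RE = re.compile(r"\btier=(\S+)")


def count_tiers(lines):
    """Count how many log lines were resolved at each tier."""
    counts = Counter()
    for line in lines:
        match = TIER_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical sample lines in the documented key=value shape.
sample = [
    "conv_key=conv:anton:u1:abc tier=1 agent=anton stateless=False",
    "conv_key=conv:anton:u2:9f2c tier=3 agent=anton stateless=False",
    "conv_key=conv:anton:u1:abc tier=1 agent=anton stateless=False",
    "unrelated startup message",
]
for tier, total in sorted(count_tiers(sample).items()):
    print(f"tier={tier}: {total}")
```

A high share of Tier 3 or Tier 4 lines suggests clients that are not sending a conversation header.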