Conversation Persistence

Agenticore’s /v1/chat/completions endpoint supports sticky multi-turn sessions. The same Claude session (with KV-cache, prompt cache, and full conversation history) is reused across turns of the same logical conversation.

How It Works

Every request passes through a 4-tier conversation key resolver that infers conversation identity from the request. Tiers are tried in order, and the first one that yields a key wins:

Tier              Source                                                               Mechanism
1 - Header        X-Conversation-Id, X-LibreChat-Conversation-Id, X-OpenWebUI-Chat-Id  Explicit, highest priority
2 - Body          body.metadata.conversation_id or body.user (UUID-shaped only)        SDK / A2A convention
3 - Content hash  sha256(system_prompt + first_user_message)[:16]                      Zero-config fallback
4 - Ephemeral     uuid4()                                                              Stateless, one-shot

The composed storage key is conv:{agent_id}:{user_hint}:{key}, scoped by agent and user to prevent cross-contamination.
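The resolution order can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not agenticore's actual code: the function names, the return shape, the choice to apply the UUID-shape check to body.user only, and the plain string concatenation before hashing are all assumptions. Only the header names, the body fields, the 16-character sha256 prefix, and the key layout come from the text above.

```python
import hashlib
import re
import uuid

# Header names come from the tier table; everything else here is illustrative.
CONVERSATION_HEADERS = (
    "x-conversation-id",
    "x-librechat-conversation-id",
    "x-openwebui-chat-id",
)
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def resolve_conversation_key(headers, body, hash_fallback=True):
    """Return (key, tier) for a request. Hypothetical helper, not agenticore's API."""
    lowered = {name.lower(): value for name, value in headers.items()}

    # Tier 1: explicit header, checked in the order listed in the table.
    for name in CONVERSATION_HEADERS:
        if lowered.get(name):
            return lowered[name], 1

    # Tier 2: body.metadata.conversation_id, then body.user if it looks like a UUID.
    metadata = body.get("metadata") or {}
    if metadata.get("conversation_id"):
        return str(metadata["conversation_id"]), 2
    user = body.get("user")
    if isinstance(user, str) and UUID_RE.match(user):
        return user, 2

    # Tier 3: hash of system prompt + first user message, cut to 16 hex chars.
    # Assumes message content is a plain string.
    if hash_fallback:
        messages = body.get("messages") or []
        system = next((m["content"] for m in messages if m.get("role") == "system"), "")
        first_user = next((m["content"] for m in messages if m.get("role") == "user"), None)
        if first_user is not None:
            digest = hashlib.sha256((system + first_user).encode("utf-8")).hexdigest()
            return digest[:16], 3

    # Tier 4: nothing to key on, so the request is a stateless one-shot.
    return str(uuid.uuid4()), 4


def storage_key(agent_id, user_hint, key):
    """Compose the storage key in the conv:{agent_id}:{user_hint}:{key} layout."""
    return f"conv:{agent_id}:{user_hint}:{key}"
```

Because agent_id and user_hint are part of the composed key, two users who send the same explicit conversation ID to the same agent still land on different sessions.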

First Turn vs Resume

  • First turn (no existing session): agenticore generates a Claude session UUID, passes --session-id <uuid> to create a persistent session, and stores the conversation-key-to-session mapping in Redis, with a file fallback.
  • Subsequent turns (session exists): agenticore passes --resume <uuid> and sends only the last user message, because Claude already has the prior context in the session's JSONL file.
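The two branches above can be sketched as a single decision function. Only the --session-id and --resume flags come from the text; the function name, the return shape, and the choice to forward every message on the first turn are assumptions made for illustration.

```python
import uuid


def plan_turn(existing_session_id, messages):
    """Return (session_id, cli_flags, messages_to_send) for one turn.

    Hypothetical helper: the name and return shape are assumptions,
    not part of agenticore.
    """
    if existing_session_id is None:
        # First turn: mint a session UUID and create a persistent session.
        session_id = str(uuid.uuid4())
        return session_id, ["--session-id", session_id], list(messages)

    # Resume: the session already holds the prior context,
    # so only the last user message is forwarded.
    last_user = next((m for m in reversed(messages) if m.get("role") == "user"), None)
    to_send = [last_user] if last_user is not None else []
    return existing_session_id, ["--resume", existing_session_id], to_send
```

A client that replays its full messages[] history on every turn is therefore harmless on resume: everything except the final user message is dropped before the call.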

Client Configuration

LibreChat

Add headers to your custom endpoint in librechat.yaml:

endpoints:
  custom:
    - name: "LiteLLM"
      headers:
        X-Conversation-Id: "{{conversationId}}"
        X-User-Id: "{{user}}"

Raw curl

CONV=$(uuidgen)
# Turn 1
curl -N -H "Content-Type: application/json" -H "X-Conversation-Id: $CONV" \
  -d '{"model":"sonnet","stream":true,"messages":[{"role":"user","content":"my name is colt"}]}' \
  http://localhost:8200/v1/chat/completions
# Turn 2: same conversation ID, so Claude remembers the name
curl -N -H "Content-Type: application/json" -H "X-Conversation-Id: $CONV" \
  -d '{"model":"sonnet","stream":true,"messages":[{"role":"user","content":"what is my name?"}]}' \
  http://localhost:8200/v1/chat/completions

Agent-to-Agent

See A2A Conventions.

No Header (Tier 3 Fallback)

If the client sends no conversation header, agenticore hashes system_prompt + first_user_message to derive a stable key. Clients that replay full messages[] history each turn (standard OpenAI-compat behavior) will hit the same session automatically.

Collision risk: two different threads with an identical system prompt AND first user message resolve to the same key and are merged into one session. Disable the fallback with AGENTICORE_CONV_HASH_FALLBACK=false.
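A short sketch shows both properties of the content hash: it stays stable when a client replays its history, and it collides when two threads open identically. The helper name is hypothetical, and joining the two strings with no separator is an assumption about how system_prompt + first_user_message is computed.

```python
import hashlib


def content_hash_key(messages):
    """Tier 3 key: sha256 over system prompt + first user message, first 16 hex chars."""
    system = next((m["content"] for m in messages if m.get("role") == "system"), "")
    first_user = next(m["content"] for m in messages if m.get("role") == "user")
    return hashlib.sha256((system + first_user).encode("utf-8")).hexdigest()[:16]


system = {"role": "system", "content": "You are a helpful assistant."}

# Thread A, turn 1 and turn 2 (the client replays the full history).
a_turn1 = [system, {"role": "user", "content": "my name is colt"}]
a_turn2 = a_turn1 + [
    {"role": "assistant", "content": "Nice to meet you."},
    {"role": "user", "content": "what is my name?"},
]

# Thread B is a separate conversation that happens to open the same way.
b_turn1 = [system, {"role": "user", "content": "my name is colt"}]

# Thread C opens differently.
c_turn1 = [system, {"role": "user", "content": "my name is ada"}]

same_thread = content_hash_key(a_turn1) == content_hash_key(a_turn2)   # stable across turns
collision = content_hash_key(a_turn1) == content_hash_key(b_turn1)     # threads A and B merge
distinct = content_hash_key(a_turn1) != content_hash_key(c_turn1)      # different opener, different key
```

Later messages never enter the hash, which is why the key survives across turns and also why identical openers cannot be told apart.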

Configuration

Env Var                        Default  Description
AGENTICORE_CONV_HASH_FALLBACK  true     Enable/disable Tier 3 content-hash fallback
AGENT_MODE_SESSION_TTL         86400    TTL in seconds for session mappings in Redis
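For reference, here is one way these two variables could be read with their documented defaults. The function name, the returned dictionary, and the set of strings treated as "off" are assumptions; the documentation above only confirms that false disables the fallback.

```python
import os


def load_conversation_config(env=None):
    """Read the two settings from the table, applying the documented defaults.

    Hypothetical helper, not agenticore's own loader.
    """
    env = os.environ if env is None else env
    raw_fallback = env.get("AGENTICORE_CONV_HASH_FALLBACK", "true")
    return {
        # Assumption: a few common spellings of "off" all disable Tier 3.
        "hash_fallback": raw_fallback.strip().lower() not in ("false", "0", "no", "off"),
        # Default of 86400 seconds is 24 hours.
        "session_ttl": int(env.get("AGENT_MODE_SESSION_TTL", "86400")),
    }
```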

Storage

  • Redis: hash at agenticore:session:conv:{agent_id}:{user_hint}:{key}, expiring after AGENT_MODE_SESSION_TTL seconds
  • File fallback: ~/.agenticore/agent_sessions.json
  • Claude JSONL: ~/.claude/projects/<encoded-cwd>/<session-uuid>.jsonl (on the agent pod’s PVC)
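To make the file fallback concrete, the sketch below keeps the same key-to-session mapping in a JSON file. The on-disk layout, the class and method names, and the idea of applying the TTL to the file as well are all assumptions; the documentation only states the file path and that the Redis entries carry a TTL.

```python
import json
import time
from pathlib import Path


class FileSessionStore:
    """Hypothetical file fallback mapping storage keys to Claude session UUIDs."""

    def __init__(self, path, ttl_seconds=86400):
        self.path = Path(path)
        self.ttl_seconds = ttl_seconds

    def _load(self):
        # A missing file simply means no sessions have been stored yet.
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text())

    def put(self, key, session_id, now=None):
        now = time.time() if now is None else now
        data = self._load()
        data[key] = {"session_id": session_id, "expires_at": now + self.ttl_seconds}
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(data))

    def get(self, key, now=None):
        now = time.time() if now is None else now
        entry = self._load().get(key)
        # Expired or unknown keys behave like a first turn.
        if entry is None or entry["expires_at"] <= now:
            return None
        return entry["session_id"]
```

A lookup that returns None would send the request down the first-turn path, so an expired mapping results in a fresh Claude session rather than an error.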

Observability

Every request logs: conv_key=... tier=... agent=... stateless=...

Filter logs by tier to see which clients are using which resolution strategy:

kubectl logs anton-agent-0 -c agenticore | grep 'tier='
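
To go one step further than grep, the log lines can be tallied per tier. This sketch assumes each field appears as a whitespace-delimited key=value token, as in the log format shown above; the function name is hypothetical.

```python
import re
from collections import Counter

# Assumes the tier value is a single token with no spaces.
TIER_RE = re.compile(r"\btier=(\S+)")


def count_tiers(lines):
    """Count log lines per resolution tier, ignoring lines without a tier field."""
    counts = Counter()
    for line in lines:
        match = TIER_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing sys.stdin as lines lets the function consume the kubectl logs output directly from a pipe. A high share of Tier 3 or Tier 4 requests suggests clients that are not sending a conversation header.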