The Three-Body Problem of AI Agent Instructions


In physics, the three-body problem has no general solution. In agent architecture, three instruction sources create the same kind of chaotic orbit. Here's the physics and the fix.

February 25, 2026 · 10 min read

The Chaos

Here's a scenario every developer using AI agents has experienced:

  • CLAUDE.md says: "Use TypeScript strict mode for all new files"
  • AGENTS.md says: "Follow the existing codebase conventions"
  • lessons.md says: "Avoid excessive type annotations — they slow review cycles"

The agent encounters a JavaScript file that needs modification. What does it do?

It doesn't follow any single instruction consistently. It oscillates. One run: converts the file to TypeScript. Next run: keeps it JavaScript "to match conventions." Third run: converts but skips type annotations.

This isn't a bug in the model. It's physics.

The Three-Body Problem

In celestial mechanics, the three-body problem describes three objects exerting gravitational force on each other simultaneously. Unlike the two-body problem (neat elliptical solution), the three-body problem is chaotic — small changes in initial conditions lead to wildly different outcomes.

```mermaid
flowchart LR
    C["CLAUDE.md"] <--->|"pull"| A["AGENTS.md"]
    A <--->|"pull"| L["lessons.md"]
    C <--->|"pull"| L
    C -.- G["⚠️ Mutual gravitational pull = instruction interference"]
    A -.- G
    L -.- G
```

Each instruction source "pulls" the agent's behavior in a different direction. With two sources, the agent can negotiate a stable orbit. With three or more — chaos.

Why Two-Body Works but Three-Body Doesn't

Two files: CLAUDE.md + AGENTS.md. One is "identity," the other "procedure." Clean separation → stable orbit.

Three files: Add lessons.md. Now every instruction in lessons.md exerts force on the resolution of conflicts between the other two. The triangulation has no stable solution.

Three competing instruction sources with no explicit hierarchy produce non-deterministic agent behavior.

The Evidence

The AGENTS.md paper measured this: context files "encourage behavioral changes that do not consistently lead to improvement." The Pythia paper showed performance oscillating between 1.0 and 0.0 — the agent swinging between attractors in a chaotic orbit.

The Hierarchy Solution

Physics says no general solution. Engineering says: impose a hierarchy.

```mermaid
flowchart TD
    P1["🏛️ Priority 1 — Constitutional<br/>CLAUDE.md"] -->|overrides| P2["📋 Priority 2 — Procedural<br/>AGENTS.md / SKILL.md"]
    P2 -->|overrides| P3["🧠 Priority 3 — Experiential<br/>lessons.md"]
    P3 -->|overrides| P4["⏳ Priority 4 — Ephemeral<br/>todo.md / conversation"]
```

This transforms the three-body problem into a cascade:

  1. CLAUDE.md wins. Always. It's the constitution.
  2. AGENTS.md wins over lessons.md. Procedure beats experience.
  3. lessons.md provides suggestions, not mandates.
  4. Conversation context is ephemeral: it governs the immediate task, but nothing persists beyond it.
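The cascade can be expressed as a tiny resolver: given conflicting rules, the one from the highest-priority source wins. A minimal Python sketch (the priority table and the `Rule` shape are illustrative assumptions, not a real loader):

```python
from dataclasses import dataclass

# Illustrative priority table mirroring the cascade above; lower number wins.
PRIORITY = {"CLAUDE.md": 1, "AGENTS.md": 2, "SKILL.md": 2,
            "lessons.md": 3, "conversation": 4}

@dataclass
class Rule:
    source: str  # which instruction file the rule came from
    text: str    # the instruction itself

def resolve(rules):
    """Return the winning rule: the one from the highest-priority source."""
    return min(rules, key=lambda r: PRIORITY[r.source])

conflict = [
    Rule("lessons.md", "Avoid excessive type annotations"),
    Rule("AGENTS.md", "Follow existing codebase conventions"),
    Rule("CLAUDE.md", "Use TypeScript strict mode for all new files"),
]
winner = resolve(conflict)
# winner.source == "CLAUDE.md": the constitutional rule always prevails
```

The point of the sketch is determinism: the same conflict always resolves the same way, which is exactly what the unordered three-file setup cannot guarantee.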

PID Controllers for Agents

A PID controller is the most deployed feedback system in engineering — thermostats, cruise control, industrial robots. Three components:

  • P (Proportional): React to current error → Current task instructions
  • I (Integral): Accumulate past errors → lessons.md
  • D (Derivative): Predict future error from rate of change → Hooks/automation

```text
P = Current task instructions
    → "This bug exists, fix it"

I = Accumulated lessons (lessons.md)
    → "This pattern has failed 5 times"

D = Trend detection (hooks/automation)
    → "Test failures tripled this week"
```
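For readers who haven't met PID control, a textbook update step looks like the sketch below. The gains `kp`, `ki`, `kd` are arbitrary illustrative values, and the comments map each term onto the agent analogy above:

```python
def pid_step(error, state, kp=1.0, ki=0.1, kd=0.5, dt=1.0):
    """One update of a textbook PID loop.

    P reacts to the current error (the task in front of the agent),
    I accumulates past errors (lessons.md), and
    D reacts to the error's rate of change (hooks watching trends).
    """
    state["integral"] += error * dt                    # I: accumulated history
    derivative = (error - state["prev_error"]) / dt    # D: trend
    state["prev_error"] = error
    return kp * error + ki * state["integral"] + kd * derivative

state = {"integral": 0.0, "prev_error": 0.0}
for e in [1.0, 0.6, 0.3]:  # error shrinking toward zero across runs
    correction = pid_step(e, state)
```

Note that the integral term keeps growing even as the error shrinks, which is the seed of the windup failure discussed next.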

Integral Windup: The Hidden Killer

PID controllers have a well-known failure: integral windup. When the I component accumulates too much history, it overshoots. The system swings past the target, then over-corrects, oscillating wildly.

This is exactly what happens with lessons.md accumulation.

Every real PID controller has anti-windup mechanisms:

  • Clamping: Cap the integral → Max rules per scope
  • Conditional integration: Only accumulate within certain range → Only add lessons for verified, scoped issues
  • Back-calculation: When output saturates, wind back → Prune lessons when performance drops

A lessons.md without anti-windup limits is a PID controller without integral clamping. It will oscillate into instability.
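The clamping idea translates directly: cap the accumulator so history keeps informing the output without dominating it. A minimal sketch, where the numeric `cap` plays the role MAX_RULES_PER_SCOPE plays for lessons.md:

```python
def integrate_with_antiwindup(errors, cap=5.0):
    """Integration with clamping: the accumulator never exceeds ±cap."""
    integral = 0.0
    for e in errors:
        integral = max(-cap, min(cap, integral + e))
    return integral

# Without clamping, 20 repeated errors of 1.0 would wind up to 20.0;
# with the cap, the accumulator saturates at 5.0.
wound = integrate_with_antiwindup([1.0] * 20)
```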

The Engineering Solution

1. Establish Constitutional Hierarchy

```markdown
# In CLAUDE.md (Priority 1)
INSTRUCTION HIERARCHY:
1. This file (CLAUDE.md) is authoritative
2. AGENTS.md/SKILL.md provide procedures
3. lessons.md provides suggestions only
4. When in conflict, higher priority wins
```

2. Implement Anti-Windup for Lessons

```markdown
# In lessons.md header
MAX_RULES_PER_SCOPE: 10
TTL_DAYS: 60
CONFIDENCE_THRESHOLD: 0.7

## Rule format:
- Scope: [backend|frontend|api|tests|all]
- Added: YYYY-MM-DD
- Confidence: 0.0–1.0 (checked against CONFIDENCE_THRESHOLD)
- Evidence: [link or description]
- Expires: YYYY-MM-DD
```

3. Clear Boundaries

  • CLAUDE.md: Identity. Values. Non-negotiable standards. ≤15 rules.
  • AGENTS.md: Project procedures. File patterns. Tech stack conventions.
  • SKILL.md: Task-specific procedures. Loaded on demand. Never contradict CLAUDE.md.
  • lessons.md: Learned corrections. Scoped. Dated. Capped. Evicted when stale.

Implementation Guide

Step 1: Audit Current State

  1. List every instruction source your agent loads
  2. Categorize each rule: identity, procedure, experience, or ephemeral
  3. Identify contradictions between sources

Step 2: Establish Hierarchy

  1. Add an explicit priority statement to CLAUDE.md
  2. Move identity rules to CLAUDE.md, procedure to AGENTS.md, experience to lessons.md
  3. Remove duplicates mercilessly

Step 3: Implement Anti-Windup

  1. Add TTL to every lessons.md entry
  2. Set hard cap: 10 rules per scope
  3. Schedule monthly review
  4. Track: does this rule improve or degrade performance?
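Steps 1–3 can be sketched as a pruning pass over in-memory lesson entries. The dict fields and numeric confidence are illustrative (they follow the CONFIDENCE_THRESHOLD knob above); a real implementation would parse lessons.md itself:

```python
from datetime import date

# Hypothetical in-memory form of lessons.md entries.
lessons = [
    {"scope": "backend", "added": date(2026, 1, 5), "confidence": 0.9},
    {"scope": "backend", "added": date(2025, 10, 1), "confidence": 0.8},  # stale
    {"scope": "backend", "added": date(2026, 2, 1), "confidence": 0.4},   # low confidence
]

def prune(rules, today, ttl_days=60, threshold=0.7, max_per_scope=10):
    """Evict expired or low-confidence rules, then cap each scope."""
    fresh = [r for r in rules
             if (today - r["added"]).days <= ttl_days
             and r["confidence"] >= threshold]
    by_scope = {}
    for r in sorted(fresh, key=lambda r: r["added"], reverse=True):
        kept_for_scope = by_scope.setdefault(r["scope"], [])
        if len(kept_for_scope) < max_per_scope:  # anti-windup clamp
            kept_for_scope.append(r)
    return [r for rules_ in by_scope.values() for r in rules_]

kept = prune(lessons, today=date(2026, 2, 25))
# Only the recent, high-confidence rule survives.
```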

Step 4: Monitor

  1. Track task success rate before and after
  2. Watch for oscillation (inconsistent behavior across runs)
  3. Measure CER (from Part 1)
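One cheap oscillation signal is the run-to-run spread of task success, with 1.0 marking a passing run and 0.0 a failing one. This is a rough heuristic, not a formal metric:

```python
from statistics import pstdev

def oscillation_score(run_outcomes):
    """Population standard deviation of per-run success.

    A stable agent scores near 0; an agent swinging between attractors,
    like the 1.0 / 0.0 pattern reported above, scores near 0.5.
    """
    return pstdev(run_outcomes)

stable = oscillation_score([1, 1, 1, 1, 0, 1])
chaotic = oscillation_score([1, 0, 1, 0, 1, 0])  # alternating = maximal swing
```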

The three-body problem has no general solution. But a hierarchy — with anti-windup — transforms chaos into a stable control loop.

Part 3 of the Eureka Series. Previous: Why Your Agent Instructions Are Attacking Your Own Code. Next: We Need gcc for Markdown.
