The Three-Body Problem of AI Agent Instructions


In physics, the three-body problem has no general solution. In agent architecture, three instruction sources create the same kind of chaotic orbit. Here's the physics and the fix.

February 25, 2026 · 10 min read

The Chaos

Here's a scenario every developer using AI agents has experienced:

  • CLAUDE.md says: "Use TypeScript strict mode for all new files"
  • AGENTS.md says: "Follow the existing codebase conventions"
  • lessons.md says: "Avoid excessive type annotations — they slow review cycles"

The agent encounters a JavaScript file that needs modification. What does it do?

It doesn't follow any single instruction consistently. It oscillates. One run: converts the file to TypeScript. Next run: keeps it JavaScript "to match conventions." Third run: converts but skips type annotations.

This isn't a bug in the model. It's physics.

The Three-Body Problem

In celestial mechanics, the three-body problem describes three objects exerting gravitational force on each other simultaneously. Unlike the two-body problem (neat elliptical solution), the three-body problem is chaotic — small changes in initial conditions lead to wildly different outcomes.

```mermaid
flowchart LR
    C["CLAUDE.md"] <--->|"pull"| A["AGENTS.md"]
    A <--->|"pull"| L["lessons.md"]
    C <--->|"pull"| L
    C -.- G["⚠️ Mutual gravitational pull = instruction interference"]
    A -.- G
    L -.- G
```

Each instruction source "pulls" the agent's behavior in a different direction. With two sources, the agent can negotiate a stable orbit. With three or more — chaos.

Why Two-Body Works but Three-Body Doesn't

Two files: CLAUDE.md + AGENTS.md. One is "identity," the other "procedure." Clean separation → stable orbit.

Three files: Add lessons.md. Now every instruction in lessons.md exerts force on the resolution of conflicts between the other two. The triangulation has no stable solution.

Three competing instruction sources with no explicit hierarchy produce non-deterministic agent behavior.

The Evidence

The AGENTS.md paper measured this: context files "encourage behavioral changes that do not consistently lead to improvement." The Pythia paper showed performance oscillating between 1.0 and 0.0 — the agent swinging between attractors in a chaotic orbit.

The Hierarchy Solution

Physics says no general solution. Engineering says: impose a hierarchy.

```mermaid
flowchart TD
    P1["🏛️ Priority 1 — Constitutional<br/>CLAUDE.md"] -->|overrides| P2["📋 Priority 2 — Procedural<br/>AGENTS.md / SKILL.md"]
    P2 -->|overrides| P3["🧠 Priority 3 — Experiential<br/>lessons.md"]
    P3 -->|overrides| P4["⏳ Priority 4 — Ephemeral<br/>todo.md / conversation"]
```

This transforms the three-body problem into a cascade:

  1. CLAUDE.md wins. Always. It's the constitution.
  2. AGENTS.md wins over lessons.md. Procedure beats experience.
  3. lessons.md provides suggestions, not mandates.
  4. Conversation context is ephemeral: it governs the immediate task, but nothing persists beyond it.
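The cascade can be expressed as a tiny resolver: given conflicting rules, the one from the highest-priority source wins. A minimal Python sketch (the priority table and the `Rule` shape are illustrative assumptions, not a real loader):

```python
from dataclasses import dataclass

# Illustrative priority table mirroring the cascade above; lower number wins.
PRIORITY = {"CLAUDE.md": 1, "AGENTS.md": 2, "SKILL.md": 2,
            "lessons.md": 3, "conversation": 4}

@dataclass
class Rule:
    source: str  # which instruction file the rule came from
    text: str    # the instruction itself

def resolve(rules):
    """Return the winning rule: the one from the highest-priority source."""
    return min(rules, key=lambda r: PRIORITY[r.source])

conflict = [
    Rule("lessons.md", "Avoid excessive type annotations"),
    Rule("AGENTS.md", "Follow existing codebase conventions"),
    Rule("CLAUDE.md", "Use TypeScript strict mode for all new files"),
]
winner = resolve(conflict)
# winner.source == "CLAUDE.md": the constitutional rule always prevails
```

The point of the sketch is determinism: the same conflict always resolves the same way, which is exactly what the unordered three-file setup cannot guarantee.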

PID Controllers for Agents

A PID controller is the most deployed feedback system in engineering — thermostats, cruise control, industrial robots. Three components:

  • P (Proportional): React to current error → Current task instructions
  • I (Integral): Accumulate past errors → lessons.md
  • D (Derivative): Predict future error from rate of change → Hooks/automation

```text
P = Current task instructions
    → "This bug exists, fix it"

I = Accumulated lessons (lessons.md)
    → "This pattern has failed 5 times"

D = Trend detection (hooks/automation)
    → "Test failures tripled this week"
```
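For readers who haven't met PID control, a textbook update step looks like the sketch below. The gains `kp`, `ki`, `kd` are arbitrary illustrative values, and the comments map each term onto the agent analogy above:

```python
def pid_step(error, state, kp=1.0, ki=0.1, kd=0.5, dt=1.0):
    """One update of a textbook PID loop.

    P reacts to the current error (the task in front of the agent),
    I accumulates past errors (lessons.md), and
    D reacts to the error's rate of change (hooks watching trends).
    """
    state["integral"] += error * dt                    # I: accumulated history
    derivative = (error - state["prev_error"]) / dt    # D: trend
    state["prev_error"] = error
    return kp * error + ki * state["integral"] + kd * derivative

state = {"integral": 0.0, "prev_error": 0.0}
for e in [1.0, 0.6, 0.3]:  # error shrinking toward zero across runs
    correction = pid_step(e, state)
```

Note that the integral term keeps growing even as the error shrinks, which is the seed of the windup failure discussed next.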

Integral Windup: The Hidden Killer

PID controllers have a well-known failure: integral windup. When the I component accumulates too much history, it overshoots. The system swings past the target, then over-corrects, oscillating wildly.

This is exactly what happens with lessons.md accumulation.

Every real PID controller has anti-windup mechanisms:

  • Clamping: Cap the integral → Max rules per scope
  • Conditional integration: Only accumulate within certain range → Only add lessons for verified, scoped issues
  • Back-calculation: When output saturates, wind back → Prune lessons when performance drops

A lessons.md without anti-windup limits is a PID controller without integral clamping. It will oscillate into instability.
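The clamping idea translates directly: cap the accumulator so history keeps informing the output without dominating it. A minimal sketch, where the numeric `cap` plays the role MAX_RULES_PER_SCOPE plays for lessons.md:

```python
def integrate_with_antiwindup(errors, cap=5.0):
    """Integration with clamping: the accumulator never exceeds ±cap."""
    integral = 0.0
    for e in errors:
        integral = max(-cap, min(cap, integral + e))
    return integral

# Without clamping, 20 repeated errors of 1.0 would wind up to 20.0;
# with the cap, the accumulator saturates at 5.0.
wound = integrate_with_antiwindup([1.0] * 20)
```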

The Engineering Solution

1. Establish Constitutional Hierarchy

```markdown
# In CLAUDE.md (Priority 1)
INSTRUCTION HIERARCHY:
1. This file (CLAUDE.md) is authoritative
2. AGENTS.md/SKILL.md provide procedures
3. lessons.md provides suggestions only
4. When in conflict, higher priority wins
```

2. Implement Anti-Windup for Lessons

```markdown
# In lessons.md header
MAX_RULES_PER_SCOPE: 10
TTL_DAYS: 60
CONFIDENCE_THRESHOLD: 0.7

## Rule format:
- Scope: [backend|frontend|api|tests|all]
- Added: YYYY-MM-DD
- Confidence: 0.0–1.0 (checked against CONFIDENCE_THRESHOLD)
- Evidence: [link or description]
- Expires: YYYY-MM-DD
```

3. Clear Boundaries

  • CLAUDE.md: Identity. Values. Non-negotiable standards. ≤15 rules.
  • AGENTS.md: Project procedures. File patterns. Tech stack conventions.
  • SKILL.md: Task-specific procedures. Loaded on demand. Never contradict CLAUDE.md.
  • lessons.md: Learned corrections. Scoped. Dated. Capped. Evicted when stale.

Implementation Guide

Step 1: Audit Current State

  1. List every instruction source your agent loads
  2. Categorize each rule: identity, procedure, experience, or ephemeral
  3. Identify contradictions between sources

Step 2: Establish Hierarchy

  1. Add an explicit priority statement to CLAUDE.md
  2. Move identity rules to CLAUDE.md, procedure to AGENTS.md, experience to lessons.md
  3. Remove duplicates mercilessly

Step 3: Implement Anti-Windup

  1. Add TTL to every lessons.md entry
  2. Set hard cap: 10 rules per scope
  3. Schedule monthly review
  4. Track: does this rule improve or degrade performance?
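Steps 1–3 can be sketched as a pruning pass over in-memory lesson entries. The dict fields and numeric confidence are illustrative (they follow the CONFIDENCE_THRESHOLD knob above); a real implementation would parse lessons.md itself:

```python
from datetime import date

# Hypothetical in-memory form of lessons.md entries.
lessons = [
    {"scope": "backend", "added": date(2026, 1, 5), "confidence": 0.9},
    {"scope": "backend", "added": date(2025, 10, 1), "confidence": 0.8},  # stale
    {"scope": "backend", "added": date(2026, 2, 1), "confidence": 0.4},   # low confidence
]

def prune(rules, today, ttl_days=60, threshold=0.7, max_per_scope=10):
    """Evict expired or low-confidence rules, then cap each scope."""
    fresh = [r for r in rules
             if (today - r["added"]).days <= ttl_days
             and r["confidence"] >= threshold]
    by_scope = {}
    for r in sorted(fresh, key=lambda r: r["added"], reverse=True):
        kept_for_scope = by_scope.setdefault(r["scope"], [])
        if len(kept_for_scope) < max_per_scope:  # anti-windup clamp
            kept_for_scope.append(r)
    return [r for rules_ in by_scope.values() for r in rules_]

kept = prune(lessons, today=date(2026, 2, 25))
# Only the recent, high-confidence rule survives.
```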

Step 4: Monitor

  1. Track task success rate before and after
  2. Watch for oscillation (inconsistent behavior across runs)
  3. Measure CER (from Part 1)
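One cheap oscillation signal is the run-to-run spread of task success, with 1.0 marking a passing run and 0.0 a failing one. This is a rough heuristic, not a formal metric:

```python
from statistics import pstdev

def oscillation_score(run_outcomes):
    """Population standard deviation of per-run success.

    A stable agent scores near 0; an agent swinging between attractors,
    like the 1.0 / 0.0 pattern reported above, scores near 0.5.
    """
    return pstdev(run_outcomes)

stable = oscillation_score([1, 1, 1, 1, 0, 1])
chaotic = oscillation_score([1, 0, 1, 0, 1, 0])  # alternating = maximal swing
```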

The three-body problem has no general solution. But a hierarchy — with anti-windup — transforms chaos into a stable control loop.

Part 3 of the Eureka Series. Previous: Why Your Agent Instructions Are Attacking Your Own Code. Next: We Need gcc for Markdown.
