The Problem
Every software team in 2026 has rigorous quality controls:
- Compilers catch type errors before runtime
- Linters enforce style and best practices
- Tests verify behavior before deployment
- CI/CD gates prevent broken code from shipping
Now look at the files that control how your AI agent behaves:
- CLAUDE.md — no validation
- AGENTS.md — no compilation
- SKILL.md — no testing
- lessons.md — no linting
These files are configuration code. They determine agent behavior as surely as source code determines program behavior. But they get zero quality assurance.
We wouldn't ship C without gcc. We shouldn't ship agent configs without mdcc.
Shannon Meets Agents
Claude Shannon's information theory gives us the framework. Every communication channel has a capacity, and useful information competes with noise.
The context window is a communication channel:
- Channel capacity: The token limit (200K, 128K)
- Signal: Instructions that improve agent output
- Noise: Redundant rules, contradictions, vague guidance, stale lessons
- Encoding: How instructions are phrased — concise vs. verbose
Shannon's noisy-channel coding theorem says reliable communication is possible only when the information rate stays below channel capacity, and noise lowers that effective capacity. For agents: once instruction noise eats too much of the context window's effective capacity, reliable behavior is impossible regardless of model quality.
The Entropy Problem
Agent configs tend to have low entropy (many tokens, little information):
- "Always make sure to carefully validate all API inputs" = 10 tokens, 3 bits of info ("validate API inputs")
- "Write clean, maintainable code" — what does "clean" mean? Zero actionable signal.
- "Follow best practices" — which ones? Zero signal.
Every low-entropy token wastes channel capacity. A 3K-token CLAUDE.md might carry only 200 bits of real information.
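That gap is easy to approximate. Here is a minimal sketch of a density check using the fraction of unique content words as a crude stand-in for information density; the stopword list and the metric itself are illustrative assumptions, not a real entropy measure:

```python
import re

# Filler words that carry no instruction signal (illustrative list).
STOPWORDS = {"always", "make", "sure", "to", "carefully", "all", "the",
             "a", "an", "and", "or", "of", "be", "is", "are", "that"}

def density(instruction: str) -> float:
    """Crude proxy for information density: the fraction of tokens that
    are unique, non-filler content words. A rough screen for padding,
    not actual Shannon entropy."""
    tokens = re.findall(r"[a-z]+", instruction.lower())
    content = {t for t in tokens if t not in STOPWORDS}
    return len(content) / len(tokens) if tokens else 0.0

print(density("Always make sure to carefully validate all API inputs"))
print(density("Validate API inputs"))
```

The padded version scores about 0.33; the terse version scores 1.0, carrying the same three content words in a third of the tokens.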
Signal-to-Noise Ratio for Agent Configs
SNR = Unique, actionable instruction bits / Total instruction tokens
High SNR (> 0.5): Crisp, specific instructions
Medium SNR (0.2–0.5): Some verbosity
Low SNR (< 0.2): Bloated, contradictory, or vague
Most agent configs run at SNR 0.1–0.2. Verbose, redundant, full of "motherhood statements" that carry no signal.
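The banding above can be captured directly. A minimal sketch, assuming SNR is computed as actionable bits over total tokens per the formula above:

```python
def classify_snr(signal_bits: float, total_tokens: int) -> str:
    """Bucket a config by the article's SNR thresholds:
    > 0.5 high, 0.2-0.5 medium, < 0.2 low."""
    snr = signal_bits / total_tokens
    if snr > 0.5:
        return "high"
    if snr >= 0.2:
        return "medium"
    return "low"

# A 3K-token CLAUDE.md carrying ~200 bits of real information:
print(classify_snr(200, 3000))  # low (SNR ~0.07)
```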
The Compiler Analogy
A compiler does four things agent configs desperately need:
- Parsing: Verify valid structure (headings, sections)
- Semantic Analysis: Detect contradictions, redundancies, ambiguities
- Optimization: Remove dead code (stale rules), reduce verbosity
- Code Generation: Produce optimized, high-SNR output
gcc transformed C from dangerous to reliable. mdcc would do the same for Markdown agent configs.
mdcc: The Spec
Lint Pass (Static Analysis)
```
$ mdcc lint CLAUDE.md

CLAUDE.md:12  WARNING  Vague instruction: "write clean code"
  → Suggestion: Specify measurable criteria

CLAUDE.md:24  ERROR  Contradiction with line 8
  → Line 8: "Use TypeScript strict"
  → Line 24: "Follow existing convention" (codebase has JS)

CLAUDE.md:31  WARNING  Redundant with line 15

CLAUDE.md:45  INFO  Low information density (0.12 bits/token)

4 issues (1 error, 2 warnings, 1 info)
SNR: 0.18 (target: > 0.5)
```
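You don't need mdcc to exist before prototyping this pass. A minimal sketch of one lint rule that flags zero-signal phrases; the phrase list, rule format, and output shape are all illustrative assumptions:

```python
import re

# Hypothetical mdcc-style lint rule: flag instructions built from
# zero-signal phrases. The phrase list is an illustrative assumption.
VAGUE = [r"\bclean code\b", r"\bbest practices\b", r"\bgood code\b",
         r"\bbe careful\b", r"\bhigh[- ]quality\b"]

def lint(lines):
    """Return (line_number, severity, message) tuples for vague rules."""
    issues = []
    for n, line in enumerate(lines, start=1):
        for pat in VAGUE:
            if re.search(pat, line, re.IGNORECASE):
                issues.append((n, "WARNING",
                               f"Vague instruction: {line.strip()!r}"))
    return issues

config = ["# Style", "Write clean code.", "Use TypeScript strict mode."]
for n, sev, msg in lint(config):
    print(f"CLAUDE.md:{n} {sev} {msg}")
```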
Compile Pass (Optimization)
```
$ mdcc compile CLAUDE.md --target optimized

Input:  3,241 tokens (SNR: 0.18)
Output:   891 tokens (SNR: 0.67)
Compression: 72.5%

Removed: 12 redundant rules
Merged:   5 overlapping rules
Flagged:  2 contradictions (manual resolution needed)
```
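The "redundant rules" step is the easiest to prototype. A sketch of redundancy detection using token-set (Jaccard) overlap as a cheap stand-in for the semantic similarity a real compile pass would need; the 0.6 threshold is an assumption:

```python
def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two rules (0.0 = disjoint, 1.0 = identical)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)

def find_redundant(rules, threshold=0.6):
    """Flag rule pairs similar enough to merge. Threshold is an assumption."""
    return [(i, j) for i in range(len(rules))
            for j in range(i + 1, len(rules))
            if jaccard(rules[i], rules[j]) >= threshold]

rules = [
    "Validate all API inputs",
    "Always validate all API inputs",   # near-duplicate of rule 0
    "Prefer small, focused functions",
]
print(find_redundant(rules))  # [(0, 1)]
```

Real redundancy is semantic, not lexical, so a production pass would use embeddings or an LLM judge; word overlap just catches the cheap cases.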
Test Pass (Behavioral Verification)
```
$ mdcc test CLAUDE.md --scenario fixtures/

Running 15 behavioral scenarios...

✓ TypeScript strict applied to new .ts files
✓ Input validation present on API endpoints
✗ FAIL: Line 24 conflicts with strict mode
✗ FAIL: Line 31 too broad (false positives)

13/15 passed (86.7%)
```
What mdcc Would Catch
Contradictions (Three-Body Conflicts)
Instructions that conflict. The AGENTS.md paper showed these cause oscillating behavior.
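A toy version of this scan: reduce each rule to a (topic, stance) pair and flag any topic that receives opposing stances. The reduction to pairs is an assumption; a real mdcc would need semantic analysis, not lookup tables:

```python
def find_contradictions(rules):
    """rules maps line number -> (topic, stance). Return
    (first_line, second_line, topic) for each conflicting pair."""
    stances = {}
    conflicts = []
    for n, (topic, stance) in rules.items():
        if topic in stances and stances[topic][1] != stance:
            conflicts.append((stances[topic][0], n, topic))
        else:
            stances[topic] = (n, stance)
    return conflicts

rules = {
    8:  ("typescript", "strict"),
    15: ("tests", "required"),
    24: ("typescript", "follow-existing-js"),  # conflicts with line 8
}
print(find_contradictions(rules))  # [(8, 24, 'typescript')]
```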
Redundancy (Entropy Waste)
Multiple rules saying the same thing. Each wastes context tokens and creates inconsistencies.
Vagueness (Zero-Signal Tokens)
"Write good code" is not a specification — it's a wish. A compiler would flag it like an untyped variable.
Scope Bleeding
Rules meant for one domain applied everywhere — the autoimmune drift from Part 2.
Staleness
Rules referencing deprecated APIs, fixed bugs, old patterns. Dead code that confuses the system.
Building Toward mdcc
mdcc doesn't exist yet. But you can implement its principles today:
Manual Lint Checklist
- Token count: CLAUDE.md under 2K tokens? If not, audit.
- Contradiction scan: Each rule — does any other conflict?
- Vagueness check: "Could a junior dev implement this unambiguously?"
- Scope check: Every rule appropriately scoped?
- Staleness check: Last validated >60 days ago? Review or remove.
- Redundancy check: Two rules saying the same thing? Merge.
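Two of these checks, the token budget and the staleness window, automate cleanly. A sketch assuming each rule carries a last-validated date (that annotation scheme is hypothetical, as is bug #412), with whitespace word count as a crude token proxy:

```python
from datetime import date, timedelta

def audit(rules, today, token_budget=2000, max_age_days=60):
    """Automate two checklist items: token budget and staleness.
    Each rule is (text, last_validated); word count stands in for tokens."""
    findings = []
    total = sum(len(text.split()) for text, _ in rules)
    if total > token_budget:
        findings.append(f"over budget: {total} tokens")
    cutoff = today - timedelta(days=max_age_days)
    for text, validated in rules:
        if validated < cutoff:
            findings.append(f"stale: {text!r}")
    return findings

rules = [
    ("Use TypeScript strict mode", date(2026, 1, 10)),
    ("Work around bug #412 in the v1 API", date(2025, 8, 1)),  # long fixed
]
print(audit(rules, today=date(2026, 2, 1)))
```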
Manual Test Protocol
- Create 5 representative scenarios
- Run the agent on each 3 times
- Check behavior matches instructions
- Check consistency across runs
- If inconsistent → instruction conflicts
The Future
We believe mdcc will become as essential as eslint for JavaScript. Teams that treat agent configs as first-class code will outperform those that don't.
Shannon's information theory tells us the limit: your agent can only be as reliable as its signal-to-noise ratio allows. mdcc is how you raise the signal and cut the noise.
Part 4 of the Eureka Series. Previous: The Three-Body Problem. Next: Kessler Syndrome.
