Disclaimer: ChatGPT generated document.
“Entropy” has a deep meaning in physics and information theory, and the idea has been borrowed into software engineering, systems design, and even project/organizational dynamics. I’ll unpack it from first principles, connect the dots rigorously, and show how the metaphor works in codebases, architectures, and teams.
In thermodynamics, entropy originally quantified disorder in a physical system. At a macro level:
Higher entropy → more microstates consistent with the macrostate → the system is more disordered and harder to compress into a simple description.
Key properties:
✔ Entropy tends to increase naturally (Second Law of Thermodynamics).
✔ Reversing entropy requires energy/work.
✔ Perfect order is low-entropy, high structure.
✔ Perfect chaos is high-entropy, no structure.
This sets the metaphor we later apply in software.
Claude Shannon gave entropy a rigorous mathematical form:
Shannon entropy measures the uncertainty in a message or random variable.
Formula (for a discrete random variable X):
H(X) = - Σ p(x) log₂ p(x)
where p(x) is the probability that X takes the value x, and:
- High entropy = high unpredictability
- Low entropy = predictable, highly redundant structure
Examples:
- A message like AAAAAAAAAAAAAA → low entropy
- A ciphertext or a high-quality compressed file → high entropy
This leads to:
✔ Compression = minimizing entropy by exploiting patterns
✔ Encryption = maximizing entropy to hide patterns
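A minimal sketch of this in Python, treating each byte of a message as a symbol and estimating p(x) from observed frequencies:

```python
import math
import os
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """H(X) in bits per symbol for the empirical byte distribution of `data`."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

print(shannon_entropy(b"AAAAAAAAAAAAAA"))   # 0.0 bits: perfectly predictable
print(shannon_entropy(bytes(range(256))))   # 8.0 bits: every byte equally likely
print(shannon_entropy(os.urandom(4096)))    # close to 8 bits: looks like ciphertext
```

A good compressor pushes its output toward that upper limit by removing redundancy; a good cipher’s output is statistically hard to tell apart from the uniform case.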
Software engineering borrowed the concept metaphorically, especially after Lehman’s Laws of Software Evolution and the influence of complexity theory.
When programmers say:
“The codebase is getting high entropy.”
They typically mean:
- disorder increases
- structure decays
- coherence and consistency fade out
- accidental complexity accumulates
- predictability drops
- cost of reasoning rises
Characteristics of high-entropy codebases:
❌ inconsistent patterns
❌ unclear ownership
❌ ad-hoc exceptions
❌ duplicated logic
❌ poor modularity
❌ unpredictable side effects
Just like in physics:
- entropy increases unless work (refactoring) is done
- the default trajectory of a codebase is decay
This connects strongly to technical debt.
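A small, hypothetical illustration of what that decay looks like in code: two near-duplicate handlers with inconsistent rules and failure modes, versus one explicit invariant the next reader only has to learn once.

```python
# High entropy: duplicated logic, each copy with its own ad-hoc rules.
def charge_card(amount_cents):
    if amount_cents <= 0:
        raise ValueError("bad amount")
    return {"amount": amount_cents, "currency": "USD"}

def charge_wallet(amount_cents):
    if amount_cents <= 0:
        return None            # same intent, different failure mode
    return {"amount": amount_cents, "currency": "usd"}  # inconsistent casing

# Lower entropy: one invariant and one shape, stated once and reused.
def validate_amount(amount_cents: int) -> int:
    if amount_cents <= 0:
        raise ValueError("amount must be a positive number of cents")
    return amount_cents

def charge(amount_cents: int, method: str) -> dict:
    return {"amount": validate_amount(amount_cents), "currency": "USD", "method": method}
```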
In real-world projects, entropy grows due to pressures like:
- Feature velocity
- Changing requirements
- Partial understanding
- Weak abstractions
- Personnel turnover
- Temporary hacks that become permanent
- Inadequate documentation
- Accretive complexity
Entropy increases because short-term optimization trades structure for delivery.
Technical debt and entropy are related but not identical.
| Concept | Meaning |
|---|---|
| Technical Debt | past shortcuts that impose future costs |
| Entropy | general disorder in the system |
| Overlap | shortcuts often increase entropy |
But a system can have:
- Debt without entropy (deliberate shortcuts in clean modules)
- Entropy without debt (no shortcuts but poor conceptual design)
Architecture amplifies entropy because:
- interfaces ossify
- coupling increases
- invariants become implicit
- distributed ownership creates drift
High-entropy architectures exhibit:
❌ circular dependencies (detectable mechanically; see the sketch below)
❌ leaky abstractions
❌ inconsistent domain models
❌ special-case handling everywhere
Low-entropy architectures have:
✔ strong cohesion
✔ layered boundaries
✔ consistent domain language
✔ predictable invariants
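A minimal sketch of detecting one of those high-entropy symptoms, circular dependencies, with a depth-first search over a module-import graph. The module names are hypothetical:

```python
from typing import Dict, List

# Hypothetical module-dependency graph for a small service.
DEPS: Dict[str, List[str]] = {
    "billing":       ["accounts"],
    "accounts":      ["notifications"],
    "notifications": ["billing"],     # closes the cycle
    "reporting":     ["billing"],
}

def find_cycle(graph: Dict[str, List[str]]) -> List[str]:
    """Return one dependency cycle as a list of modules, or [] if none exists."""
    visiting, done = set(), set()
    path: List[str] = []

    def dfs(node: str) -> List[str]:
        if node in visiting:                        # back edge: cycle found
            return path[path.index(node):] + [node]
        if node in done:
            return []
        visiting.add(node)
        path.append(node)
        for nxt in graph.get(node, []):
            cycle = dfs(nxt)
            if cycle:
                return cycle
        visiting.discard(node)
        done.add(node)
        path.pop()
        return []

    for start in graph:
        cycle = dfs(start)
        if cycle:
            return cycle
    return []

print(find_cycle(DEPS))   # ['billing', 'accounts', 'notifications', 'billing']
```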
Some engineers use “entropy” socially/organizationally:
Over time, teams forget intent, tribal knowledge decays, and the bus factor drops.
Entropy increases in:
- product decision timelines
- communication structures
- process enforcement
- onboarding paths
An organization with high entropy:
❌ loses clarity on why things are done
❌ forks its own roadmaps
❌ accumulates contradictory priorities
In cryptography and security, entropy has a non-metaphorical, formal meaning. Secure systems require high-entropy randomness for:
✔ nonces
✔ IVs
✔ session keys
✔ random number generation
✔ salting
Low-entropy randomness leads to catastrophic failures (e.g., predictable RSA primes).
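A minimal Python sketch of where that entropy comes from in practice: the operating system’s CSPRNG via the `secrets` module, contrasted with a low-entropy anti-pattern.

```python
import random
import secrets

# High entropy: values drawn from the operating system's CSPRNG.
nonce       = secrets.token_bytes(12)    # e.g. a 96-bit nonce for AES-GCM
salt        = secrets.token_hex(16)      # 128-bit salt for password hashing
session_key = secrets.token_urlsafe(32)  # opaque session identifier

# Low entropy (anti-pattern): a general-purpose PRNG with a guessable seed.
random.seed(1234)                        # an attacker can enumerate likely seeds
weak_token = random.getrandbits(128)     # predictable; never use for secrets
```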
Distributed systems experience entropy in the form of:
- clock drift
- eventual inconsistency
- partial failures
- message loss/reordering
- divergent replicas
Keeping order requires work:
→ consensus protocols (Paxos, Raft)
→ CRDTs
→ anti-entropy protocols (yes, an actual term!)
Anti-entropy protocols reconcile replicas by periodically exchanging and merging state (often via gossip), reducing disorder.
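A minimal sketch of the idea, assuming a last-writer-wins register per key; real systems (Dynamo-style stores, Cassandra) add Merkle trees, vector clocks, and gossip scheduling on top:

```python
from typing import Dict, Tuple

# Each replica maps key -> (timestamp, value); the higher timestamp wins.
Replica = Dict[str, Tuple[int, str]]

def anti_entropy(a: Replica, b: Replica) -> None:
    """Pairwise reconciliation: afterwards both replicas agree on every key."""
    for key in set(a) | set(b):
        va = a.get(key, (0, ""))
        vb = b.get(key, (0, ""))
        winner = va if va[0] >= vb[0] else vb   # last writer wins
        a[key] = winner
        b[key] = winner

r1: Replica = {"user:1": (10, "alice"), "user:2": (7, "bob")}
r2: Replica = {"user:1": (12, "alice-renamed"), "user:3": (5, "carol")}

anti_entropy(r1, r2)
assert r1 == r2            # divergence repaired; replicas have converged
```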
Machine learning introduces further entropy vectors:
✔ data drift
✔ concept drift
✔ rising predictive entropy as model uncertainty grows
✔ calibration drift (a growing gap between confidence and accuracy)
Entropy increases unless:
✔ retraining occurs
✔ data pipelines stay clean
✔ labels stay consistent
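Two entropy-flavoured health checks, sketched in Python with NumPy (the distributions and threshold here are illustrative, not recommendations): the predictive entropy of a classifier’s softmax output, and a KL divergence between training-time and live feature histograms as a crude drift signal.

```python
import numpy as np

def predictive_entropy(probs: np.ndarray) -> float:
    """H(p) in bits for one softmax output vector; higher = more uncertain."""
    p = np.clip(probs, 1e-12, 1.0)
    return float(-(p * np.log2(p)).sum())

def kl_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """KL(p || q) in bits between two normalized histograms."""
    p = np.clip(p, 1e-12, 1.0)
    q = np.clip(q, 1e-12, 1.0)
    return float((p * np.log2(p / q)).sum())

print(predictive_entropy(np.array([0.96, 0.02, 0.02])))  # ~0.28 bits: confident
print(predictive_entropy(np.array([0.34, 0.33, 0.33])))  # ~1.58 bits: near-uniform

train_hist = np.array([0.5, 0.3, 0.2])   # feature histogram at training time
live_hist  = np.array([0.2, 0.3, 0.5])   # same feature in production
if kl_divergence(live_hist, train_hist) > 0.1:           # illustrative threshold
    print("possible data drift: consider retraining")
```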
Higher entropy makes systems:
❌ harder to reason about
❌ slower to modify
❌ fragile during extension
❌ bug-prone
❌ more expensive to maintain
Low entropy provides:
✔ predictability
✔ legibility
✔ extensibility
✔ modularity
✔ performance reasoning
✔ fewer surprises
Reducing entropy = deliberately investing work in the codebase’s structure and legibility.
Techniques include:
✔ refactoring coherently
✔ enforcing architectural invariants (see the sketch after this list)
✔ modularization + layering
✔ domain modeling (DDD)
✔ automated testing
✔ linters + formatters
✔ CI/CD + reproducibility
✔ documentation + ADRs
✔ interface contracts
✔ code reviews
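As one concrete example of enforcing an architectural invariant, a test like the following can fail the build whenever a lower layer imports a higher one. The layer names and the `src` path are hypothetical; real projects often reach for a dedicated tool such as import-linter instead.

```python
import ast
from pathlib import Path
from typing import Dict, List, Set

# Hypothetical layering rule: these layers must not import from these packages.
FORBIDDEN: Dict[str, Set[str]] = {
    "core":   {"web", "cli"},            # the domain core stays delivery-agnostic
    "domain": {"infrastructure"},
}

def layering_violations(src_root: Path) -> List[str]:
    """Scan every module under src_root and report forbidden imports."""
    found: List[str] = []
    for py_file in src_root.rglob("*.py"):
        layer = py_file.relative_to(src_root).parts[0]
        banned = FORBIDDEN.get(layer, set())
        tree = ast.parse(py_file.read_text(), filename=str(py_file))
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                targets = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom):
                targets = [node.module or ""]
            else:
                continue
            for name in targets:
                if name.split(".")[0] in banned:
                    found.append(f"{py_file}: layer '{layer}' imports '{name}'")
    return found

def test_layering():
    # Run under pytest in CI so the invariant is enforced on every change.
    assert layering_violations(Path("src")) == []
```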
Like a garden, without maintenance:
- weeds grow
- paths disappear
- intentional structure decays
Maintenance isn’t new features, but it is required work.
Entropy in software captures a deep truth:
The natural state of complex systems is disorder. Order exists only because energy is continually spent to maintain it.
Every codebase—no matter how well designed—moves toward chaos without deliberate investment.
The metaphor stuck because it elegantly captures multiple forces:
- temporal drift
- complexity growth
- human forgetfulness
- knowledge loss
- compounding messiness
It also captures the economic nature of software:
- entropy grows passively
- order must be funded actively
I can expand on:
- entropy & Lehman’s laws of software evolution
- entropy vs cyclomatic complexity
- entropy vs compressibility of code
- entropy & Conway’s law (org→structure coupling)
- entropy in long-lived legacy systems
- entropy in distributed state reconciliation (anti-entropy protocols)
- entropy in version control & branching models
- how startups vs enterprise manage entropy differently
- how code review policies affect entropy
- how AI-assisted code generation changes entropy dynamics (interesting topic!)
Just say what angle you want.
Lehman’s Laws of Software Evolution + Conway’s Law — these two ideas are fundamental lenses for understanding how software evolves in the real world (not just on paper). They’re also deeply connected to the “entropy” discussion, because they explain why decay happens and why structure mirrors the organization.
These come from Meir Lehman’s empirical studies of long-lived software systems (starting in the 1970s, observing operating systems, compilers, and other large mainframe-era systems). His key insight:
Real software systems don’t remain static — they evolve because the world around them changes.
He identified 8 laws, which I’ll restate with modern interpretation:
Law 1 (Continuing Change): A system must continuously adapt or it becomes progressively less useful.
Drivers:
- environment changes
- requirements shift
- users demand features
- new hardware/platforms
- new security constraints
Real-world takeaway: stasis = death.
Law 2 (Increasing Complexity): As software evolves, its complexity increases unless work is done to reduce it.
This is where entropy comes in:
- ad-hoc changes pile up
- abstractions distort
- invariants become implicit
- coupling increases naturally
Refactoring → “anti-entropy” effort.
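One way to watch that complexity creep is to measure it. A minimal sketch, assuming a git checkout and `git` on the PATH: the Shannon entropy of how the last N commits spread across files, an idea explored in software-evolution research. Focused evolution gives low entropy; scattered, unfocused change gives high entropy.

```python
import math
import subprocess
from collections import Counter

def change_entropy(repo: str, commits: int = 200) -> float:
    """Shannon entropy (bits) of file-touch frequencies over the last N commits."""
    log = subprocess.run(
        ["git", "-C", repo, "log", f"-{commits}", "--name-only", "--pretty=format:"],
        capture_output=True, text=True, check=True,
    ).stdout
    touches = Counter(line.strip() for line in log.splitlines() if line.strip())
    total = sum(touches.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in touches.values())

# Tracked over time, a steadily rising value suggests evolution is getting
# less focused and that some anti-entropy (refactoring) work is overdue.
print(change_entropy("."))
```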
Law 3 (Self-Regulation): Software evolution processes have statistical trends and self-regulating behavior.
Modern analogy:
- velocity stabilizes
- bug/feature ratios converge
- development rhythms form
It behaves like an ecosystem, not a machine.
Law 4 (Conservation of Organizational Stability): The amount of work done remains roughly constant over time.
Even as complexity rises, organizations rarely increase effective output proportionally.
Law 5 (Conservation of Familiarity): Teams can only absorb change at a limited rate.
If change volume exceeds cognitive bandwidth → failures, architectural collapse, rewrites.
Law 6 (Continuing Growth): A system must be continually enhanced to remain satisfactory to users.
Emphasis: growth ≠ quality.
Sometimes features accumulate faster than structure allows → debt + entropy.
Law 7 (Declining Quality): Quality declines unless active work is devoted to maintaining it.
In practice:
- maintainability degrades
- test coverage rots
- architecture drifts
- documentation ages
It’s brutal but empirically observed.
Law 8 (Feedback System): Software evolution is a multi-loop feedback process.
Inputs include:
- bugs
- user feedback
- performance metrics
- failures
- market forces
- management decisions
Lehman’s laws predate:
✔ agile
✔ continuous delivery
✔ DevOps
✔ cloud-native
✔ microservices
Yet he predicted them conceptually, because all are attempts to:
manage entropy + change + feedback loops
Now the other famous law:
“Any organization that designs a system will produce a design whose structure is a copy of the organization’s communication structure.”
Origin: Mel Conway, 1968. Validated in many domains since.
Your architecture mirrors your org chart.
Examples:
- 4 teams → 4 services/modules
- siloed teams → siloed components
- weak boundaries in teams → leaky abstractions in code
Communication constraints → coupling constraints.
Companies adopt microservices not because they are the ideal technology, but because:
- autonomy
- decoupled deployment
- team locality
all match organizational incentives.
Startups with 3–6 engineers → monoliths emerge naturally.
Global teams → distributed architectures.
Popularized later in strategic architecture as the “Inverse Conway Maneuver”:
If you want a certain architecture, design your organization’s communication structure around it.
Used by:
- Amazon (“two-pizza teams” → service boundaries)
- Netflix (service autonomy)
- Spotify (squads/tribes → modularity)
These two laws are deeply complementary:
Lehman explains:
Why software tends toward complexity and decay over time.
Conway explains:
Why software’s structure reflects social/economic structure.
Together:
- org entropy ↔ code entropy
- structural drift ↔ organizational drift
- architecture rot ↔ communication rot
- rewrites ↔ reorgs (yes, often correlated!)
A rewrite often coincides with:
- change in leadership
- change in product direction
- change in team topology
- change in culture
They’re enduring because they describe constraints rooted in humans, not technology.
Tooling changes rapidly; cognition and coordination do not.
Software is fundamentally socio-technical. Failures often originate on the social side.
In modern DevOps / cloud-native world:
✔ Conway → microservices & platform teams
✔ Lehman → continuous change + refactoring budgets
✔ Team Topologies (Skelton & Pais) formalizes org design as architecture design
There’s also a “market Conway’s Law”: external actors shape the system too, so public APIs tend to mirror the ecosystems they serve.
Combine them:
- Entropy: structure decays naturally.
- Lehman: change and growth are inevitable.
- Conway: structure is constrained by communication.
Therefore:
- decay occurs socially, not just technically
- clean architecture requires social structure
- refactoring requires organizational buy-in
- rewrites often require reorgs
- architecture cannot exceed cognitive bandwidth of teams
This explains why software engineering is harder than computer science — it involves sociology, incentives, economics, and evolution over decades.
Just ask for any angle:
- historical examples (Unix, Windows NT, Oracle DBMS, Google Borg, etc.)
- modern analogs (AWS, Kubernetes, Meta, Netflix)
- mapping microservices back to Conway
- Team Topologies theory
- how AI code generation changes Lehman dynamics
- Conway’s Law in open-source vs corporate development
- anti-entropy architecture techniques
- refactoring economics / software evolution cost curves
or I can connect this to your earlier interest in:
✔ C++ systems engineering
✔ distributed systems
✔ long-lived codebases
✔ entropy & complexity theory
✔ sustainability of large projects

https://en.wikipedia.org/wiki/Lehman%27s_laws_of_software_evolution