nick comer

programmer, tinkerer, learner

Observing the 2nd law of thermodynamics in software


There are probably innumerable ways to phrase the 2nd law of thermodynamics. For the purposes of this post, this definition serves the point well:

The second law of thermodynamics states that the total entropy (disorder) of an isolated system can never decrease over time.

Framed in a software development context, this definition gives a rough idea of how software “ages” over time: software starts in an ordered state and gradually becomes disordered as time goes on, and attempting to reverse that is extremely difficult and tends only to introduce more disorder.

Knowing the source of that “aging” (disorder) can help programmers know where to start and how to continue introducing changes to software in a way that is in keeping with the natural increase of disorder in a particular system.

If programmers strive to make software incredibly minimalist and restrictive from the beginning, and only make changes that do not attempt to fight the natural increase of disorder, that software will age gracefully and slowly over time — eventually needing replacement after a long, well-served life-span.

Whereas if attempts are made to be “clever” or to create something that is “flexible” when implementing the MVP of a piece of software, it will most likely end in the need to re-factor, which will cause an exponential increase of disorder in that software.

The following example of a bug in a piece of Java code demonstrates this theory. It shows how a lack of defensiveness in programming can lead to the rapid aging of software.

Everything should be “private” by default.

Encapsulation is an incredibly useful tool in the never-ending fight against entropy in software. By using encapsulation, it is possible to explicitly designate certain parts of a code-base as “off-limits” and subject to change without notice.

Asking for an explicit reason to make things “private” by default gets the question backwards. Rather, it is the parts of software that are “public” that should have an explicit reason for being that way.

The following snippet shows a DB connection component that exposes its own dependencies (the underlying java.net.Socket) for all the world to see/use/misuse:

import java.net.Socket;

public final class DatabaseConnection {
    // Public field: any caller can read from or write to the raw socket.
    public Socket connection;
}

After seeing numerous bug reports about transactions being left open, the engineering team finds that other parts of the code base are directly accessing the connection, doing things like this:

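public boolean saveUser(...) throws IOException {
    // ...
    // Writes BEGIN straight to the raw socket, bypassing DatabaseConnection entirely.
    String query = "BEGIN; SET CONSTRAINTS user_order_uniq DEFERRED";
    OutputStream w = dbConnection.connection.getOutputStream();
    w.write(query.getBytes());
    w.flush();
}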

Oh dear, that could probably be why those bugs are happening. Awesome, the source of the bug has been found and the engineering team is going to fix it by making the underlying socket private.

One problem: taking something that is public and attempting to make it private is a move from disorder to order, which means attempting to reverse the natural progression of all systems.

In the attempt to make this move, the first step will be to find all the places in the code-base that make use of this public property and fix them so that they no longer do. If the tooling around the project is amazing, there will be some static code analysis to use; most teams, though, will end up having to grep the entire code-base. If this were a library of some sort, there would be no choice but to issue a major, backwards-incompatible release to fix this, thus aging not only the library but all the code-bases that depend on it! And suddenly the attempt to fix this loophole has spiraled in scope and is causing a lot of entropy exhaust.

Here is where the phenomenon can be seen out in the open: it is fundamentally difficult to reduce the amount of disorder in a code-base, much like it is near impossible to reduce the amount of entropy in a physical system.

The natural way to write this would have been to treat everything as private by default. Only once there is an explicit and concrete reason for making something public would it be appropriate to do so. Going from private to public is backwards compatible and not disruptive to anything.
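As a rough illustration of what "private by default" might have looked like here, the sketch below keeps the stream private and exposes one narrow, deliberate entry point. It is only a sketch under stated assumptions: the `inTransaction` method and the `Statements` interface are hypothetical names invented for this example, the class wraps a plain `OutputStream` rather than a `java.net.Socket` so that it can run without a database, and the class is package-private only so the sketch fits in one file.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Sketch only: stands in for the DatabaseConnection from the post.
final class DatabaseConnection {
    // Private: no caller can reach the raw stream and write to it directly.
    private final OutputStream out;

    DatabaseConnection(OutputStream out) {
        this.out = out;
    }

    // Hypothetical narrow view handed to callers: they can run statements, nothing else.
    public interface Statements {
        void execute(String sql);
    }

    // The one public entry point. It exists for an explicit reason:
    // every BEGIN is paired with a COMMIT or a ROLLBACK.
    public void inTransaction(Consumer<Statements> work) {
        send("BEGIN");
        try {
            work.accept(this::send);
        } catch (RuntimeException e) {
            send("ROLLBACK");
            throw e;
        }
        send("COMMIT");
    }

    // Private helper: writes one statement followed by a semicolon.
    private void send(String sql) {
        try {
            out.write((sql + ";").getBytes(StandardCharsets.UTF_8));
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With something like this, `saveUser` would call `db.inTransaction(s -> s.execute("SET CONSTRAINTS user_order_uniq DEFERRED"))` and could not leave a transaction open by accident. If a concrete need for more access shows up later, adding another public method is a backwards-compatible change.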


The above represents only a single example, in a single language, of this “anomaly” of software. There are sure to be countless other examples across languages, technologies, and stacks of disorder increasing in software over time. If this observation is kept in mind throughout the development and maintenance of software, it will lead to design decisions that embrace this natural progression and let software age gracefully.