TLDR: Documentation should be readily available. Plans should be written down, linked to tickets, and tickets linked to code via commit messages. This applies at all scales: global infrastructure, a single application, or a single package within a codebase.
“Why”. It’s the eternal question.
Take a minute to consider the old story about the three bricklayers. Asked what they are doing, the first says he is laying bricks, the second says he is building a wall, and the third says he is building a cathedral. The third bricklayer is the focus of the story, the one with the most well-developed sense of meaning. I think this story is popular because deep down, we all feel a need to search for meaning. We obviously search for facts and information too…the “what” and “how”…and we must answer those before we can ask, let alone understand, the “why”.
Also consider the Five Whys. Again, the focus is on the “why”…there’s no such thing as “five whats” or “five whos”.
My point with all this is that the question “why” is a special one. Having the answer is incredibly valuable, but it can also be an exceptionally hard answer to get. Chesterton’s Fence feels like a corollary to this idea: don’t remove a fence until you know why it was put up.
This matters for coding, because engineers’ jobs mostly come down to “changing code”. We want the code to keep doing everything it already does, except for the one thing we want to change. Before an engineer can safely make any change, they need to understand two things:
- what is this code doing?
- why was it written this way?
Usually it’s easy to answer the first question, but it can be very difficult to answer the second. It’s not always obvious why something was done a certain way. There should be documentation, but there often isn’t any, or it’s not easy to find.
I remember working on an older codebase where someone migrated from one Git repo to another, but instead of doing it in a way that preserved the commit history, they just copied the files into a new directory, and ran:
git add *; git commit -m "first commit"
Never do this. It threw away the entire Git history, and with it most of our ability to understand why things were done a certain way. We ended up moving slowly and breaking things. We also hated the code, and you know what they say about being considerate of the mental state of the people who maintain your code.
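For contrast, here is a minimal sketch of a migration that keeps the history. It assumes the new repository already exists and is empty; `migrate_with_history` is a hypothetical helper name, and the repository locations in the usage comment are placeholders.

```shell
# Sketch: copy a repository to a new home without losing its history.
# migrate_with_history is a hypothetical helper; both arguments are
# placeholders for the real old and new repository locations.
migrate_with_history() {
    old_repo=$1
    new_repo=$2
    workdir=$(mktemp -d) || return 1
    status=0

    # --mirror copies every branch, tag, and commit, not just the current files.
    git clone --quiet --mirror "$old_repo" "$workdir/mirror.git" &&
        # Push all of those refs to the empty destination repository.
        git --git-dir="$workdir/mirror.git" push --quiet --mirror "$new_repo" ||
        status=1

    rm -rf "$workdir"
    return "$status"
}

# Usage (hypothetical URLs):
# migrate_with_history git@old.example.com:team/app.git git@new.example.com:team/app.git
```

After a migration like this, “git blame” and “git log” in the new repository still show every original commit and message.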
If you follow a good process, it will be easy for your engineers to understand the system quickly, and they’ll get more work done, with higher quality. Here’s how it works for them:
- Read a line of code. Look at surrounding comments.
- Use “git blame” to find the commit that last changed the line, then read its commit message. This gives the engineer notes from the person who wrote the code. It’s usually not a ton of information, but you can understand their thought process.
- The commit message is linked to a PR. This shows the engineer all of the other code changed at the same time, plus the notes from the review process, the PR description, and a link to the ticket.
- The ticket provides more context and is linked to other tickets with even more. The ticket also establishes timeframes, so you can search for other tickets worked on around the same time.
- One of the tickets (usually some kind of parent/feature ticket) will have a link to a planning document with even more information.
If your team follows the right process, it is possible to look at a line of code and 60 seconds later, have access to the full history of that code, all the way to the business strategy document explaining why that line of code is valuable to the company.
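The first hop of that walk, from a line of code to its commit message, can be sketched on the command line. This is only an illustration: `why_line` is a hypothetical helper name, and it assumes it is run from inside the repository that contains the file.

```shell
# Sketch: print the full commit message for the commit that last changed
# one line of a file. why_line is a hypothetical helper name.
why_line() {
    file=$1
    line=$2

    # In --porcelain output, the first field of the first line is the commit hash.
    sha=$(git blame -L "$line,$line" --porcelain -- "$file" | head -n 1 | cut -d ' ' -f 1)
    [ -n "$sha" ] || return 1

    # %B is the raw commit message: subject, body, and any ticket or PR references.
    git show --no-patch --format=%B "$sha"
}

# Usage (hypothetical file and line number):
# why_line src/billing/invoice.py 42
```

If the commit message carries a PR number or ticket ID, the remaining hops are just following links.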
In my experience, the hardest part of this process is cultivating the habit. The actual effort of linking things takes only a few seconds. Some people like to use linters as a safeguard, and that’s useful in a large-scale organization, but I don’t think there’s a substitute for understanding the reason for this process. Put another way, it’s important to understand the “why” of the process. 😉
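As an illustration of the linter safeguard, here is a minimal sketch of a check that could run from a Git “commit-msg” hook. The `TIX1234` ID format is borrowed from the naming example later in this post and is an assumption about your ticket tracker; `check_commit_message` is a hypothetical function name.

```shell
# Sketch: fail unless the commit message references a ticket.
# The TIX<number> format is an assumption; adjust it to your tracker.
check_commit_message() {
    msg_file=$1

    # Ignore Git's "#" comment lines, then look for a ticket ID anywhere in the message.
    if grep -v '^#' "$msg_file" | grep -Eq 'TIX[0-9]+'; then
        return 0
    fi

    echo "commit message must reference a ticket, e.g. TIX1234" >&2
    return 1
}

# In .git/hooks/commit-msg, Git passes the message file path as the first argument:
# check_commit_message "$1" || exit 1
```

A check like this only enforces that a link exists. Whether the linked ticket actually explains the change is still up to the habit.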
(Side note: Tim Berners-Lee realized the power of the hyperlink, and that idea underpins the modern web, Google’s PageRank, and so much more. Linking relevant information is a game-changer.)
Special consideration: infrastructure
Unfortunately, not everything is code, and not everything is committed to Git. I mostly see this with configuration-instead-of-code systems, often from third-party infrastructure vendors, but sometimes with internal tools as well.
Sometimes you will deal with a configuration file where you can’t include comments for context, or you have to deal with a file that can’t reasonably be version controlled so you don’t get a link to a ticket. Even worse, you might have a UI-only interface. It’s possible to automate these components, but that doesn’t always happen. Eventually you end up with 500 entries and no clue why they exist, or if they’re safe to edit/remove.
The workaround I’ve found is that you usually get some type of free-text field. In a firewall, for example, you can often give each rule a name. Name things “inbound-for-database-TIX1234”, where TIX1234 is the ID of the ticket for the work, so someone can find more information.
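Once names follow that convention, they can be checked mechanically. The sketch below assumes you can export resource names one per line and that ticket IDs look like `TIX1234` at the end of the name; both function names are hypothetical.

```shell
# Sketch: work with names like "inbound-for-database-TIX1234".
# The "-TIX<number>" suffix is an assumed convention, not a vendor feature.

# Print the ticket ID at the end of a resource name, or nothing if there is none.
ticket_from_name() {
    printf '%s\n' "$1" | sed -n 's/^.*-\(TIX[0-9][0-9]*\)$/\1/p'
}

# Read resource names on stdin, one per line, and print those with no ticket suffix.
names_missing_ticket() {
    grep -Ev -e '-TIX[0-9]+$'
}

# Usage:
# ticket_from_name inbound-for-database-TIX1234
# some-export-command | names_missing_ticket    # some-export-command is a placeholder
```

Running the second function against an export of your rules gives a list of entries that nobody can currently trace back to a reason.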
Special consideration: hacky solutions
A long time ago, a very good engineer told me “it’s ok if there’s a mess, just document the mess”. I’ve never forgotten this advice. It’s acceptable to do hacky things under certain circumstances. It always makes me laugh that a CPU is the ultimate hack: it is basically a rock (silicon) getting electrocuted.
If you have to do something unusual or hacky in the code, you should shine a spotlight on the hack. Put a big comment explaining why you’re doing something weird. This shows you know that you’re doing something unusual, and provides the context for why that unusual behavior might need to be preserved.
Also, if possible, keep the weird hack abstracted. If it’s a piece of code, put it in a library so it lives in a single place. It’s easier to remember and clean up one mess than 100 of them.
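Here is a sketch of what a documented, isolated hack might look like. Everything in it is hypothetical: the vendor export, the ticket ID, and the function name are made up purely to show the shape of the comment.

```shell
# HACK (TIX1234, hypothetical ticket): the vendor's export (hypothetical) ends
# every line with a Windows-style carriage return, which breaks string
# comparisons downstream.
# Why it is here: we cannot change the export format ourselves.
# When to remove: once the export emits plain newlines.
# Keep ALL handling of this quirk in this one function.
strip_vendor_carriage_returns() {
    tr -d '\r'
}

# Usage (vendor-export is a placeholder command):
# vendor-export | strip_vendor_carriage_returns
```

Because every caller goes through the one function, cleaning up later means deleting a single definition instead of hunting for 100 copies.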