How to review code written by an engineer better than you

Doing code reviews is mostly about spending thoughtful time reading code. Sometimes people focus on what things should look for during a review, and there are plenty of suggested checklists out there, or you can look at other code reviews among your team and make a checklist by paying attention to the patterns. This one habit will get you through most normal code reviews.

I think it’s easy to review code of someone who is new to a language. They may not write idiomatic code, or they may misuse certain language features as they learn a new language. If you know more than them about the language, the review is easy.

It’s also mostly easy to review the structure of the code for logical arrangement — for example, are objects and data models and abstraction used in reasonable ways? Are there reasonable interfaces? Is there reasonable defensive programming and error handling? If you are thoughtful about each of these areas, the review is easy.

Whether or not the code changes as a result of the comment is not what make a comment valuable.

It’s even easy to review code for functionality…check it out and attempt to run it.

Are some engineers better at each of these things than others? Yes. Some people are experts in languages. Some people are experts in modeling. Some people are experts at writing automated tests. Different people are good at different things. Some people are good at lots of things.


Every engineer is composed of different strengths, and for ease of discussion, let’s assume there are 100 dimensions to being a good engineer. Most good engineers are probably strong in about 70 of the traits, and part of the value of a balanced team is that everyone has a different 70, and once you get a certain number of people on a team, the team has strengths in all 100 dimensions.

But what happens when you’re a normal, even above average engineer, and you’ve got to review the code of one engineer who singlehandedly has at least 95 traits out of the 100? What can you possibly tell someone who is just objectively a better engineer? Why even bother? They’re probably right anyways…


Any really good engineer knows they’re good. So do the people around them. This isn’t about ego, there’s usually just a gentle acknowledgement. A really good engineer is obvious to everyone. The unexpected side effect is that sometimes this results in the really good engineer not getting feedback on their work, because everyone has the same impression — they read the code and go “hey this passes my checklist, but of course it would…approved!”.

This is unfortunate, because I think virtually everyone who is good at something enjoys it when someone takes time to engage with their work and provide commentary, good, bad, or neutral. It’s not as much fun to get rubber stamp approvals. No one is accidentally a good engineer — if they consistently produce quality, it’s because they’re putting thought into their work.

As a result, I try to leave some type of comment every time I read a PR. Sometimes I’m reviewing code where the author has more domain experience than I do, but I’ve found some techniques that help.

One of my best techniques is that as I read through the code, I leave comments like “So in this method, you’re basically….” or “Is this because….” Essentially, I just write down observations and questions. There’s no attempt to “give feedback” per se, it’s more like I’m validating that I was able to understand their code.

This is a crucial step.

There are lots of things you can have an opinion on, and it’s better to post something than nothing. Even a “simple” observation is something that even the most junior member of the team can post.

Whether or not the code changes as a result of the comment is not what make a comment valuable. The goal should not be to only post comments about suggested changes. The goal is to have a discussion about the code. Posting a simple observation, almost “restating” their implementation in a sentence or two, can be a valuable thing.

When someone gets comments on their work, it makes the author feel good that someone’s paying attention, and it also adds an important validation step. By explaining your interpretation of the code, you’re validating that it can be maintained in the future. You’re also contributing to documentation. Years in the future when someone is trying to understand something, and they jump back to the PR, they see the discussion and can validate their own understanding. This “verbal confirmation” can be deeply valuable to you and to other people on your team.

Put another way, code written by a senior engineer shouldn’t only make sense to other senior engineers. Part of what makes a great senior engineer is that they produce solutions that are maintainable, which means that even a more junior engineer should be able to understand it.

I think this is really the core spirit of “how to give feedback to an engineer who you know is better than you”. Even if you don’t have a “critique” of something, you can add value by doing nothing more than add comments that just explain what the code is doing, or that explain your interpretation of it.


Before you start the review, think about the functionality and come up with a 60 second guess as to how you’d write the code. This will give you a starting point. You’ll have a perspective (even if it might be incomplete) and then you’ll find it easier to add comments like “why did/didn’t you do it this other way?”. This comment pattern is a gold mine for fostering a good discussion.


It’s worth your time to do reviews. At minimum, you become a better engineer by reading code. If you know a piece of code was written by someone whose skill you admire, why not take the opportunity to study it? And if you study it, why not write down what you take away from it? This is what starts a discussion, and this is where learning happens. Plus, it helps the author. It’s just good all around.


If nothing else, add comments of things you thought the author did well. If this engineer is as good as you say, compliment what you think they did well. Maybe they’ll even respond with some more “yeah I thought this would be a good idea because X” and they’ll even have some other reason you didn’t even think about, and then you learn even more. Again, the goal is to have a discussion about the code.


My overall point is that there’s basically always something you can write on a PR, no matter what kind of skill difference there is between the author and reviewer. The point of code reviews is not just finding bugs, or fixing problems. There’s always some kind of discussion that you can have to break down silos, improve understanding, and improve your individual skills as well.

Politics and people

I’ve noticed I seem to have a different take on politics and people than most people do. I think there are a few books that I’ve read which have shaped my thinking. I wasn’t necessarily trying to learn about politics when I read them, I didn’t read them in any specific order, and they’re not really political books either. The ideas in them just seem to come up a lot.

The first book is Society of the Spectacle, which is a mind-blowing book. It was written in 1967. I don’t read French, so I read the English translation. It’s the sort of book where you make it through a sentence and you genuinely have to stop and think about what you read, not just because the ideas themselves are interesting, but you constantly have to stop and think about how it was written in 1967 and you think about the past 50+ years. There’s a digested version if you want.

Society of the Spectacle practically predicts social media, Instagram, cable news, “fake news”, and more. I think it goes beyond that in predicting things we can’t see like the phenomenon of clickbait headlines, information/filter bubbles, and addictive technology like that described in Hooked (and also predicting the followup, Indistractable). I think it’s worth really considering how these things affect politics and how we talk with each other as a nation.

On the “fake news” subject, Amusing Ourselves to Death nails it. I read this in 2003, and I know this because I liked the book so much that I looked up the author, Neil Postman, only to find out that he had died the week before. It’s never cool to find out about a new band right before they break up.

If you’ve spent any amount of time thinking about fake news and the problems with entertainment-as-news, you should read Amusing Ourselves to Death. It’s similar to Society of the Spectacle, but is much easier to read.

This next one isn’t exactly a book, but more of a subject area: Semiotics. This is some of the most mind-bending stuff of all. The official explanation is that semiotics is the study of signs and symbols, but this doesn’t do it justice. The best way I can explain it is that it’s about how Sherlock Holmes sees the world. It’s almost like a book that’s not about the meaning of things, but how things could mean anything in the first place.

I think semiotics is a foundation for Society of the Spectacle and Amusing Ourselves to Death. Knowing how we create meaning for ourselves seems useful in thinking about how the media affects us. I personally bought an introductory textbook, but there could be better books out there. For me, a big part of the point is about how people interpret the same thing differently, which happens all the time in politics. It seems to require a very careful separation of opinions and facts, and semiotics seems like a way to unwind these.

Onto a different subject with Thinking, Fast and Slow. It’s a book written by psychologists that won a Nobel Prize in Economics. That’s like winning in two sports at the same time — not easy. I think it even dips into being a sociology book in a way with discussions about what the authors call WYSIATI — our brains do a bad job at remembering, and I think this is worth keeping in mind when thinking inside our own heads, as well as when dealing with other people as individuals and as groups.

Another sort-of-a-book-but-really-a-subject recommendation is systems thinking. I personally read a book called Thinking in Systems, which covered the subject from an environmental angle, but is really useful when thinking about social programs and how we should think about trying to change things.

I’ll end this with one of my favorite things Obama ever said. This is solid, level-headed advice.

“Once you’ve highlighted an issue and brought it to people’s attention and shined a spotlight, and elected officials or people who are in a position to start bringing about change are ready to sit down with you, then you can’t just keep on yelling at them,” Mr. Obama said.

“And you can’t refuse to meet because that might compromise the purity of your position,” he continued. “The value of social movements and activism is to get you at the table, get you in the room, and then to start trying to figure out how is this problem going to be solved.”

“You then have a responsibility to prepare an agenda that is achievable, that can institutionalize the changes you seek, and to engage the other side, and occasionally to take half a loaf that will advance the gains that you seek, understanding that there’s going to be more work to do, but this is what is achievable at this moment,” he said.

https://www.nytimes.com/2016/04/24/us/obama-says-movements-like-black-lives-matter-cant-just-keep-on-yelling.html

When to split a data model

I had a discussion at work today where we were adding some fields to a model, and we were talking about whether it should be split into a separate data model. This made me wonder what type of guidance there was out there in the universe.

Turns out, there’s not much. I searched around for any posts about it, and couldn’t find any. There’s lots of info about how to model data upfront, but not a lot of advice about the ongoing maintenance of a data model. So I figured I’d write my own.

I’m using “data model” to describe a single class that gets data from a matching database table, and provides some small and common amounts of processing/filtering/transformation logic for that data. You could also refactor your usage of models to separate the concerns, but many projects don’t.

There are three things to consider when deciding if you should take one data model and split it into two, but I think the ideas can be applied to other designs.

  1. Data
  2. Logic
  3. Lifecycle

Data

You want your data model to be simple and easy to understand. One model should be equivalent to one concept.

The issue is when there’s another concept that’s similar, but not the same. For example, your Store table requires a mailing address, but what about an online store? Do you add a type column, and then validate that the address, or URL is present, depending on the type? Or, does it need to be an OnlineStore vs a PhysicalStore?

You end up with a table where you have 20 columns, only some of which are required under certain circumstances, but not others.

I think that validations with lots of conditionals are a warning sign that the table might be modeling more than one thing.

Logic

Many model classes contain some amount of presentation logic. By “presentation logic,” I mean that it filters the data so that only certain attributes are returned, or that it does some type of transformation to make the data ready to use.

If you notice that you end up with substantial amounts of this logic for presenting data, you should consider if the data model can be improved. Is there some reason that you might be devoting a lot of code to filtering data out of a single table? Would it be better if it was split into a separate table?

Lifecycle

I think the lifecycle of an object matters. An extreme example, for explanation purposes, would be a table that stores a data for a TemporaryMessage and a LongLivedMessage. These two types of data are managed differently, their access is probably controlled differently, and they are purged out of the system according to different business logic.

I think this is especially nefarious because if one data model has different lifecycles, it means that every time an engineer is working with those models in new code, they need to remember that there are different types, and they need special treatment for each type. This can be avoided if you have different classes for different things.

Dry Rubs

These are translations of various recipes into ratios. I haven’t made all of these recipes, I just wanted to get a bunch of dry rub ratios all in one place.

I like to have premade batches of dry rubs, and it’s harder to make batches when the recipe uses a small measure like a tablespoon. I also find it easier to compare different recipes and flavors when they’re in ratios like this.

In general, amazingribs.com seems to be a good resource. I especially appreciate the mention that if you taste the uncooked rub, that’s not what the final flavor is like. The only way to know is to cook something with it.

As a general rule, dry rubs are mostly paprika and sugar.

My personal preference is to make rubs with less or no salt. Pre-salting meat before cooking, brining, and different size pieces of meat all need different amounts of salt. Being able to measure out flavor and salt separately gives you more control over the flavors.

Also 1 cup = 16 tablespoons. Useful for when you make batches.

Easy Spice Rubs For Fish

  • 6 parts paprika
  • 6 parts light brown sugar
  • 4 parts dried oregano
  • 3 parts garlic powder
  • 2 parts cumin
  • 1 part cayenne pepper
  • 4 parts salt

http://amazingribs.com/recipes/rubs_pastes_marinades_and_brines/meatheads_memphis_dust.html
beef & pork

  • 12 tablespoons firmly packed dark brown sugar
  • 12 tablespoons white sugar
  • 6 tablespoons American paprika
  • 3 tablespoons garlic powder
  • 2 tablespoons ground black pepper
  • 2 tablespoons ground ginger powder
  • 2 tablespoons onion powder
  • 2/3 tablespoon rosemary powder

http://www.thekitchn.com/how-to-make-dry-rub-139632
beef & pork
(granulated garlic and onion are not the same as powder, and I’m not sure what a reasonable conversion is)

  • 16 parts brown sugar
  • 4 parts paprika
  • 3 parts granulated garlic
  • 2 parts granulated onion
  • 2 parts kosher salt
  • 2 parts black pepper
  • 2 parts cumin
  • 1 part ancho or chipotle
  • 2 parts mustard powder
  • 1 part cayenne pepper

https://cooking.nytimes.com/recipes/1017417-all-purpose-dry-rub

  • 8 parts paprika (about 20% less if you use smoked paprika)
  • 4 parts kosher salt
  • 4 parts freshly ground black pepper
  • 4 parts brown sugar
  • 4 parts chile powder
  • 3 parts ground cumin
  • 2 parts ground coriander
  • 1 parts cayenne pepper, or to taste

https://www.thespruce.com/carolina-bbq-rub-335939

  • 2 parts paprika
  • 1 part salt
  • 1 part white sugar
  • 1 part brown sugar
  • 1 part ground cumin (cumin powder)
  • 1 part chili powder
  • 1 part freshly ground black pepper

https://www.thespruce.com/best-odds-pulled-pork-rub-335885

  • 6 parts brown sugar
  • 6 parts paprika
  • 4.5 parts salt
  • 3 parts black pepper
  • 1.5 parts cayenne
  • 1 part dry mustard

https://www.thespruce.com/memphis-style-rib-rub-recipe-335912

  • 4 parts paprika
  • 2 parts salt
  • 2 parts onion powder
  • 2 parts fresh ground black pepper
  • 1 part cayenne

https://www.thespruce.com/magic-dust-spice-rub-335886

  • 4 parts paprika
  • 2 parts kosher salt, finely ground
  • 2 parts sugar
  • 1 part mustard powder
  • 2 parts chili powder
  • 2 parts ground cumin
  • 1 part ground black pepper
  • 2 parts granulated garlic
  • 1 part cayenne

Testing Wisdom

I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence (I suspect this level of confidence is high compared to industry standards, but that could just be hubris). If I don’t typically make a kind of mistake (like setting the wrong variables in a constructor), I don’t test for it. I do tend to make sense of test errors, so I’m extra careful when I have logic with complicated conditionals. When coding on a team, I modify my strategy to carefully test code that we, collectively, tend to get wrong.

Different people will have different testing strategies based on this philosophy, but that seems reasonable to me given the immature state of understanding of how tests can best fit into the inner loop of coding. Ten or twenty years from now we’ll likely have a more universal theory of which tests to write, which tests not to write, and how to tell the difference. In the meantime, experimentation seems in order.

Kent Beck on unit test coverage, via Stack Overflow

I like this. Very straightforward, no complexity. “I write unit tests for things that are complicated or might break”, which is probably a sane strategy for all levels of testing — look for the level at which you’ve got enough complexity below, where it’s easy for something to break.

Biden on judgement

When the subject of Trump came up aboard Air Force Two, Biden referred to a well-worn story about how, as a freshman senator, he saw Jesse Helms, the archconservative North Carolina Republican, ripping into a piece of disabilities legislation. Biden was furious about it and began attacking Helms to Mike Mansfield, the Democratic Senate majority leader. Puffing on his pipe, Mansfield asked Biden if he knew that Helms and his wife had adopted a disabled 9-year-old boy no one else would take. “Question a man’s judgment, not his motives,” Mansfield instructed.

I wish to hell I’d just kept saying the exact same thing – Joe Biden

Dependency Inversion Principle

Dependency Inversion is one of the five SOLID OO principles that’s become so popular in recent years (sometimes referred to as DIP). My opinion is that it’s a highly valuable concept, and is not well-named. “Dependency inversion” doesn’t mean a great deal on its own, and a lot of attempts to explain it tend to get very heady.

(Speaking of unfortunate naming, “hexagonal architecture” is another concept that I find poorly named. Ironic that the concept is about using abstraction to avoid unnecessary coupling, yet the name itself couples the idea to the number 6, which is totally unrelated to the idea. Fortunately, it’s slowly being renamed as “ports and adapters” in most discussions, which is a much better description.)

My personal preference is towards explanations that are more intuitive, and I’d like to put my two cents in for Dependency Inversion.

To me, a way of describing Dependency Inversion is to use classes to separate the features of the application (the parts that a user might use) from the technology that makes the feature work.

For example, let’s say you have a feature where a user can sign themselves up, and once the user’s information is saved to the database, you have some more steps that need to take place. You could make an interactor called

CompleteUserSetup

and it will handle the actions necessary (instead of using a callback).

Let’s say one of the things it needs to do is send an email welcoming the user. Let’s also pretend that you use SendGrid to manage emails. That would mean we could end up with something like

class CompleteUserSetup
  def self.perform
    SendGridClient.email "subjectline", "body of email"
  end
end

CompleteUserSetup.perform

Will this run? Absolutely. Is it a good example of code that meets the criteria for Dependency Inversion? No.

What’s happening here is that the feature (user setup) is directly mentioning the technology that implements it (SendGrid). It literally has the name of the tool in the code that is defining the feature.

Literally Rob Lowe

Since there are probably lots of places in the code that send email, if you ever need to switch away from SendGrid, you need to change all those places. Usually it’s not as simple as find-replacing all the instances of “SendGrid” from the code, so a better way is to abstract all of the email interactions into your own class.

class CompleteUserSetup
  def self.perform
    OurEmailClient.send_welcome_email
  end
end

class OurEmailClient
  def self.send_welcome_email
    SendGridClient.email "subjectline", "body of email"
  end
end

CompleteUserSetup.perform

Now the feature doesn’t mention the implementation technology. If you kept this pattern going, “SendGrid” would only ever appear in your email client class, and this means that the technology would be decoupled from the features. This is an Adapter pattern, which is basically the first half of Dependency Inversion. (IMHO this is the most common implementation of DI, and this makes testing really easy)

To close the loop of “Dependency Inversion” is to actually pass the client an instance of an email client so it can use more than one.

class CompleteUserSetup
  def self.perform(client)
    client.send_welcome_email
  end
end

class OurEmailClient
  def send_welcome_email
    SendGridClient.email 'subjectline', 'body of email'
  end
end

class OurOtherEmailClient
  def send_welcome_email
    MailChimpClient.email 'subject, 'body of email'
  end
end

email_client = OurEmailClient.new
CompleteUserSetup.perform(email_client)

# or

email_client = OurOtherEmailClient.new
CompleteUserSetup.perform(email_client)

Now we’re actually passing the dependency into the location where it’s needed. This means the CompleteUserSetup interactor is totally decoupled from which messaging system it will use.

The reason this matters is that now we can choose any type of email provider we currently support, and we can also add new types of email providers that we didn’t previously use.

To be fair, in Ruby, this isn’t quite the same as in Java. Ruby will allow duck typing, which means that we don’t have to write an Interface. Also, Ruby doesn’t support Interfaces at all, so that aspect of this is missing. Still — the principle behind decoupling does make it very easy to write readable code, and testable code, and I always love that.

Highs and Lows

I was talking with an engineering manager the other day who told me one key technique he uses for his teams. I thought it was one of those simple-yet-powerful techniques that I love so much.

The simplicity is this — when he does his weekly checkin with his reports, he asks them “in the past week, not just at work but in your life as a whole, what was your high and what was your low?”

I think this is a great approach to management. It’s highly touchy-feely, but I think that’s an important part of managing others. If someone is having a good time in their personal life, it can and will have an effect on their work performance. Likewise, if someone is having a bad time in their personal life, it can and will affect their work.

I think it’s worth being mindful that the effect might not be what we intuitively expect. Someone who is having bad things happen in their personal life might react by being distracted at work, or they might use work as a means to distract themselves from the unpleasantness and focus on it strongly. Or, perhaps there might be another reaction. There’s no way to know without discussing it and paying attention.

(To be clear, I’m not suggesting that people should use a dark pattern to drive their employees by offering work as a distraction from things that might be difficult at home, only saying that the connection between personal and work does exist, and is unique for each person.)

By understanding the person as a unique individual, and as a whole, a great manager can work with the natural rhythms of the lives that their team members live. I think this maximizes happiness and results, and minimizes mistakes and turnover, which are all desirable.