Principles misapplied: Is WET really that bad?
The moment you decide to become a software developer, you will encounter the acronyms DRY and WET. A teacher, a friend, or some blog post will tell you "Don't Repeat Yourself" and that it's bad to "Write Everything Twice". No wonder we developers like abstractions so much. But is it really that bad for your codebase to be a little more WET than you were taught?
What is DRY and WET again?
"Don't Repeat Yourself" and "Write Everything Twice" are two sides of the same principle. The first tells you what to do, while the second names what to avoid.
The principle states that we should not have duplicate code in our software. Duplication means that when we need to make a change, we have to make it in all the duplicated code, thus increasing the amount of work and the cognitive load.
Moreover, humans are forgetful. So it is only a matter of time before someone forgets to propagate the change to all duplicated code, resulting in unexpected errors.
This effect is further amplified when we factor in unit tests. Duplicate code means duplicate tests, thus increasing maintenance costs even more.
Should you strive for a completely DRY codebase?
After reading the explanation above you might conclude that 100% DRY code would be the best. But alas, the sweet spot is found somewhere between DRY and WET code.
Though many developers might be aghast that I would even suggest writing some code twice, let’s look at what happens when we take the DRY principle too far.
Consider the following example:
```csharp
// in StoreService
public string GetStorePrice(decimal price)
{
    return "€" + price.ToString();
}

// in PurchasingService
public string GetWholesalePrice(decimal price)
{
    return "€" + price.ToString();
}

// in FinanceService
public string GetFullPrice(decimal price)
{
    return "€" + price.ToString();
}

// in SomeOtherService
public string GetSomeOtherPrice(decimal price)
{
    return "€" + price.ToString();
}
```
You might rightfully decide that this is duplicated code and should be refactored. So you create a PriceTransformer class with a GetDisplayPrice method and call it a day:
```csharp
public static class PriceTransformer
{
    public static string GetDisplayPrice(decimal price)
    {
        return "€" + price.ToString();
    }
}
```
But now the fun part begins. The next day someone from the store department complains that the store is displaying more than two decimals. You decide to simply fix this by rounding the input to two decimal places.
However, now the purchasing department is complaining. Those decimals might not be important for the store, but when purchasing bulk items they do influence the final price.
You solve this by adding a new parameter called shouldRound and end up with:
```csharp
public static class PriceTransformer
{
    public static string GetDisplayPrice(decimal price, bool shouldRound)
    {
        if (shouldRound)
        {
            price = Math.Round(price, 2);
        }
        return "€" + price.ToString();
    }
}
```
Satisfied, you close down for the day and all seems to be well.
But new feature requests come in over time and more and more logic sneaks into your beautiful DRY code:
- The store department wants us to print ,- when the price is a whole number
- Some other department needs different currency symbols
- The finance department now wants us to include VAT
After a couple more sprints, your clean abstraction has turned into a huge amalgamation of if-statements and switch statements that nobody wants to touch anymore. Your team's velocity plummets, the code quality reaches all-time lows, and a colleague hands you a copy of Clean Code subtly opened to the chapter about the Single Responsibility Principle.
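To make that end state concrete, here is a rough sketch of what GetDisplayPrice might look like once those requests are bolted on. The parameter names useDashForWhole, currencySymbol and vatRate, the order in which VAT and rounding are applied, and the use of CultureInfo.InvariantCulture are all assumptions made for illustration, not something any department actually specified.

```csharp
using System;
using System.Globalization;

public static class PriceTransformer
{
    // Hypothetical signature: almost every parameter serves exactly one department.
    public static string GetDisplayPrice(
        decimal price,
        bool shouldRound,       // store: at most two decimals
        bool useDashForWhole,   // store: print ",-" for whole amounts
        string currencySymbol,  // some other department
        decimal? vatRate)       // finance: e.g. 0.21m, null means "no VAT"
    {
        // Assumption: VAT is added before rounding. Nobody asked for this order.
        if (vatRate.HasValue)
        {
            price *= 1 + vatRate.Value;
        }

        if (shouldRound)
        {
            price = Math.Round(price, 2);
        }

        // Whole amounts get the ",-" suffix instead of decimals.
        if (useDashForWhole && price == Math.Truncate(price))
        {
            return currencySymbol + price.ToString("0", CultureInfo.InvariantCulture) + ",-";
        }

        return currencySymbol + price.ToString(CultureInfo.InvariantCulture);
    }
}
```

A call such as PriceTransformer.GetDisplayPrice(10m, true, true, "€", null) now needs five arguments, and every caller has to know about flags that only matter to someone else's department.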
While the above example is somewhat extreme, this does happen all the time. You reference an already existing service because it does exactly what you need or you reuse a query that someone already wrote.
This is where spaghetti code happens; you're removing the wrong duplication.
Incidental duplication should stick around
The examples above are what we call incidental duplication.
At first glance, the formatting of a product price across different services may look very similar or even identical. However, upon closer inspection, you will find that the code represents different behaviors in your application.
Incidental duplication does not duplicate behavior, domain knowledge or other concepts, yet still looks (exactly) the same.
When refactoring incidental duplication, you get the opposite of what the DRY principle tries to achieve. Your code becomes difficult to understand and daunting to change. You end up with spaghetti code because suddenly different domains are coupled through shared code, even when they do not share any real-world similarities.
Identifying incidental duplication
Don't expect your new colleague to perfectly differentiate between incidental duplication and actual duplication. Identifying incidental duplication requires some domain knowledge, as you have to determine whether the code represents the same business logic.
The question I often ask myself is: "Will these lines of code change for the same reason?" The moment I can find a reason to answer "No", I know I'm dealing with incidental duplication. The currency formatter above is a perfect example of this. Different departments will have different requirements over time and thus the code will change for different reasons.
Some other questions you might ask yourself are:
- "Does this behavior belong to the same business domain?"
- "Is this part of the same feature I'm developing?"
Whenever you have to answer "No", you're dealing with incidental duplication.
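Applied to the price example, answering "No" means leaving the methods where they are and letting each one evolve with its own department. The sketch below assumes the store's two-decimal request and purchasing's need for full precision from earlier; the "0.00" format string and CultureInfo.InvariantCulture are my own choices to keep the output predictable.

```csharp
using System;
using System.Globalization;

public class StoreService
{
    // Changes only when the store department changes how shelf prices look.
    public string GetStorePrice(decimal price)
    {
        return "€" + Math.Round(price, 2).ToString("0.00", CultureInfo.InvariantCulture);
    }
}

public class PurchasingService
{
    // Changes only when purchasing changes how bulk prices look; keeps every decimal.
    public string GetWholesalePrice(decimal price)
    {
        return "€" + price.ToString(CultureInfo.InvariantCulture);
    }
}
```

The two methods no longer look identical, which is the point: a change requested by the store cannot break purchasing, and neither method needs a flag to opt out of the other's behavior.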
A mnemonic of three
I often keep one rule in mind when I encounter suspicious duplication. You might have already encountered it as the Rule of Three: wait until you write something for the third time before you refactor it. This rule is a good guideline to avoid mistaking incidental duplication for actual duplication.
By waiting for a third duplication to occur, you give yourself the chance to see whether the represented concepts really are the same. You will also have more information about the requirements, similarities and differences, allowing you to create a better abstraction instead.
Even though the rule says that you should refactor the moment you write something for a third time, I would suggest waiting until you are absolutely certain you have all the information to make the right decisions. Sometimes, you will have to wait for a fourth or fifth duplication before you know the best course of action.
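For contrast, here is a sketch of duplication that would be worth extracting once you are sure. Suppose (hypothetically) that three services each contained the same VAT calculation, and that it represents a single business rule that always changes for the same reason. The class name Vat, the method AddTo, and the 21% rate are invented for this example.

```csharp
public static class Vat
{
    // Made-up rate for illustration: one business rule, one place to change it.
    public const decimal StandardRate = 0.21m;

    // Returns the gross price for a given net price.
    public static decimal AddTo(decimal netPrice)
    {
        return netPrice * (1 + StandardRate);
    }
}
```

Unlike the price formatter, this abstraction needs no flags: if the rate changes, it changes for every caller at once, which is exactly the situation DRY is meant for.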
A note of caution
As developers, we learn plenty of rules, heuristics, guidelines, and principles. We discussed how harmful the DRY principle can be if it is over-applied. But along the way, I introduced yet another rule, the Rule of Three.
As with the DRY principle, you shouldn't take this rule too seriously. All these rules are just different ways to reason about the decisions we have to make. In the end, you, the developer, will have to pick the best possible solution.
If you take away only one thing from this article, let it be this: while best practices are well-established for a reason, don't just follow them blindly. Use them when it makes sense.
Finally
You will find yourself in situations where you aren’t sure whether something is incidental or actual duplication. In that case, my advice is: just push the duplication and refactor it later, or wait for feedback in the pull request. Personally, I would always prefer one simple duplication too many over a bad abstraction.
Also, if you find yourself (like me) calling it accidental duplication instead of incidental, rest assured that it means virtually the same thing. Merriam-Webster tells us that:
Accidental and incidental can both mean "something happening by chance," but usage suggests that "accidental" also implies an element of carelessness or inattention while "incidental" implies the occurrence would have happened with or without attention or care