
Duplication isn't evil, but a massive addiction to DRY is

Jim Aho

Aug 15, 2021 • 3 min read

In software engineering, people tend to think that duplication is always evil: if the same code appears in more than one place, you must refactor it so that only one copy of that code or functionality remains. Developers do this because they're following the DRY principle, short for "Don't repeat yourself", which was first introduced in the book The Pragmatic Programmer: From Journeyman to Master.

DRY is very good

In my experience, DRY is a very good principle. As a general rule you should try to follow it, because in plenty of cases duplication is hard to maintain. For example, if you fix a bug in one place, you might have to fix it in a few other places too, because of the duplication. That is time-consuming and adds no real value, so it's better to follow DRY.

But a massive addiction to DRY is unhealthy

But there are also cases where duplication isn't that evil. In fact, duplication can sometimes be valuable, because it makes it very easy to evolve one copy of the code without touching the other. The key is to start talking about the responsibility of the code: if we can clearly define the responsibility of a piece of code, that will guide us toward better decisions about whether a new piece of code should be "duplicated" or not.

An example - models

Let's say you're building an API. A request arrives, which gives birth to a request model. Inside the API, you translate that request model to a domain model, which gets processed. In the middle of this you also make a request to another system (be it inside your company or outside). That system responds with data, which you need to parse into something (we'll call this the integration model for the sake of this post).

Now we have three different kinds of models that might at first glance look very similar. But if we think about it, it becomes clear that each model has its own purpose and responsibility. Some people tend to reuse the domain models as request models out of laziness. I've seen domain models used for integration purposes too. I mean, if they all look the same at the start, why create another model for the integration if the domain model looks identical? It's a fair question, and we'll answer it in this post.

Take a look at the following models:
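As a minimal sketch of what three such models might look like, here they are as Python dataclasses. The class names and fields are assumptions made for illustration: `customer_id` mirrors the CustomerId property discussed later in this post, and `name` is a hypothetical second field.

```python
from dataclasses import dataclass, fields


@dataclass
class CustomerRequest:
    """Request model: the shape of what API callers send us."""
    customer_id: str
    name: str  # hypothetical field, for illustration only


@dataclass
class Customer:
    """Domain model: what the business logic works with."""
    customer_id: str
    name: str  # hypothetical field, for illustration only


@dataclass
class ExternalCustomer:
    """Integration model: the shape of the other system's response."""
    customer_id: str
    name: str  # hypothetical field, for illustration only


def field_names(model):
    """Return the declared field names of a dataclass, in order."""
    return [f.name for f in fields(model)]
```

Calling `field_names` on each class returns the same list, which is exactly the situation where reusing one class looks tempting.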

Here we have three different models that (at this point in time) are identical. One could think it's unnecessary to have three models instead of reusing a single one. But we deliberately want three different models here (we'll see why later on). I think this is a simple but useful example of when you should relax your relationship with the DRY principle. Even if your request model, domain model and integration model look identical at the start, it's worth keeping them separate.

So we can see here that duplication (of those properties) is actually a good thing, as long as the responsibility of that duplicated code is different.

So why is it important to repeat yourself in this case?

The primary reason is that the models change for very different reasons.

Imagine we had this single model instead to act as request model, domain model and integration model:
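A hedged sketch of such a shared model in Python follows. The function names, the dictionary payloads and the `name` field are all assumptions made for illustration, not code from a real system.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    """One class doing triple duty: request, domain and integration model."""
    customer_id: str
    name: str  # hypothetical field, for illustration only


def handle_request(payload: dict) -> Customer:
    # API layer: the request body is turned straight into the shared model.
    return Customer(**payload)


def parse_external_response(payload: dict) -> Customer:
    # Integration layer: the external response uses the very same model.
    return Customer(**payload)
```

With this setup, if the external system starts sending `identity_id` instead of `customer_id`, `parse_external_response` raises a `TypeError`, and fixing it by renaming the field on `Customer` would also change the request contract and the domain code.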

What happens when our external system adds a new field Address or, better yet, renames the property CustomerId to IdentityId? The external system can change its contract at any time (adding or renaming properties is very common). This shouldn't impact your domain model, because the domain model changes when business requirements change, not when an external system renames CustomerId to IdentityId. A shared model makes this messy, whereas separate models (duplicated code, if you will) absorb the change much better.

So this is a good example where sticking to one single model and reusing it for requests, integration and your domain (an addiction to DRY) will cause more harm than good.
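To illustrate how separate models could absorb the rename, here is a minimal Python sketch. The class names, the `name` field and the `to_domain` mapping function are hypothetical; only the CustomerId to IdentityId rename comes from the scenario above.

```python
from dataclasses import dataclass


@dataclass
class ExternalCustomer:
    """Integration model: follows the external contract, renames included."""
    identity_id: str  # was customer_id before the external system renamed it
    name: str  # hypothetical field, for illustration only


@dataclass
class Customer:
    """Domain model: untouched by the external rename."""
    customer_id: str
    name: str  # hypothetical field, for illustration only


def to_domain(external: ExternalCustomer) -> Customer:
    # The mapping is the only place that knows about both names.
    return Customer(customer_id=external.identity_id, name=external.name)
```

The rename touches the integration model and one line of mapping code, while everything that works with `Customer` stays as it was.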

Summary

DRY is here to help us. It is one of the core principles in software engineering, and a really good one. Follow it as much as possible to make new business requirements easier to implement and the codebase easier to maintain, but be pragmatic about your decisions. Following DRY doesn't mean you can never duplicate code; it means you should avoid duplication when you have no real reason for it. When you do have a reason, as in the example above, you should "duplicate" those properties into separate models, because they will change for different reasons.
