Anemic vs. Rich Domain Model

Yet another controversial topic in the world of software development. Should you choose one over the other? But hold your horses for a minute: what is an anemic domain model? And what is a rich domain model? If you’ve not stumbled upon those terms before, don’t worry; I will briefly explain what the two types of domain models are. I will also explain why we really shouldn’t see them as two separate types of domain models, but should rather ask to what degree a specific domain model is anemic or rich, and why a combination of both is most realistic (although leaning towards a rich domain model is preferable).

Anemic Domain Models

The basic idea of an anemic domain model is simple: data and behavior are separated from each other. This is typical for procedural languages such as C, where you’d have a struct containing data and then a bunch of functions operating on those structs. In the context of domain models, this means that the behavior of domain objects has been decoupled from their own data. In this case, most of the operations performed on the domain objects make up a domain service layer wrapped around the domain model.
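To make the separation concrete, here is a minimal sketch in Python. The Order and OrderLine classes and the two service functions are hypothetical names I made up for illustration; the point is only that the classes hold data while all behavior sits in free functions next to them.

```python
from dataclasses import dataclass, field


@dataclass
class OrderLine:
    """Plain data: no behavior of its own."""
    product: str
    unit_price_cents: int
    quantity: int


@dataclass
class Order:
    """Plain data holder, comparable to a C struct."""
    lines: list[OrderLine] = field(default_factory=list)


# The behavior lives outside the data, in a separate service layer.
def add_line(order: Order, product: str, unit_price_cents: int, quantity: int) -> None:
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    order.lines.append(OrderLine(product, unit_price_cents, quantity))


def order_total_cents(order: Order) -> int:
    return sum(line.unit_price_cents * line.quantity for line in order.lines)


order = Order()
add_line(order, "notebook", 450, 2)
add_line(order, "pen", 120, 3)
total = order_total_cents(order)
```

Note that nothing forces a caller to go through add_line; anyone can append to order.lines directly and skip the quantity check.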

Rich Domain Models

Rich domain models are the exact opposite: data and behavior are coupled to each other. This is typical for object-oriented programming, and is in fact the whole point of it. In the context of domain models, this means that domain objects contain all behavior relevant to them, including validation logic. In other words, domain objects encapsulate data and behavior. The idea of encapsulation is important here, however, since it is what separates an anemic domain model from a rich one.

What is Encapsulation?

As we have all been told, encapsulation is the idea of coupling data and behavior while hiding internal data from the outside world. But there’s another important part to encapsulation, and that is data integrity: it should be hard to misuse an object and put it in an invalid state, causing a bug to appear. So with a rich domain model, encapsulation is an integral part of the domain model, while for an anemic domain model it is not. This is the main difference that sets rich and anemic domain models apart.

Another concept that is crucial to understanding encapsulation is the idea of invariants. A class often has invariants, a set of rules which every instance of that class must fulfil throughout its entire lifetime. If an instance doesn’t, well, you’ve got a bug on your hands. The goal of encapsulation is to make sure those invariants are never broken, and you do this by placing the validation logic inside the domain object itself. As a quick example, imagine you’re developing a game with a Character class with two fields: current health and max health. An invariant in this case could be that the current health must never exceed the character’s maximum health. Another invariant could be that the current and maximum health must never be negative.
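Here is one way the Character example could look in Python. Treat it as a sketch under my own assumptions: the method names take_damage and heal are invented, and I chose to clamp the values rather than reject them, which is only one of several reasonable ways to protect the invariants.

```python
class Character:
    """Hypothetical game character that guards its own invariants."""

    def __init__(self, max_health: int) -> None:
        if max_health < 0:
            raise ValueError("max health must not be negative")
        self._max_health = max_health
        self._current_health = max_health  # start at full health

    @property
    def current_health(self) -> int:
        return self._current_health

    @property
    def max_health(self) -> int:
        return self._max_health

    def take_damage(self, amount: int) -> None:
        if amount < 0:
            raise ValueError("damage must not be negative")
        # Clamp at zero so current health never becomes negative.
        self._current_health = max(0, self._current_health - amount)

    def heal(self, amount: int) -> None:
        if amount < 0:
            raise ValueError("healing must not be negative")
        # Clamp at max health so current health never exceeds it.
        self._current_health = min(self._max_health, self._current_health + amount)


hero = Character(max_health=100)
hero.take_damage(130)
health_after_damage = hero.current_health  # 0, not -30
hero.heal(500)
health_after_heal = hero.current_health  # 100, not 500
```

Because current_health and max_health are read-only properties, the only way to change the health from outside is through methods that keep both invariants intact.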

The Problem with Anemic Domain Models

If you’ve paid attention, you may remember me saying in the beginning that we should prefer to lean towards rich domain models, and I’m about to tell you why. Imagine you have an anemic domain model where every domain object contains no logic at all; simple getters and setters, that is all. Now also imagine that there’s a thick domain service layer operating on those domain objects, which includes validation logic and other common operations performed on them. When you, as a developer, want to use a specific domain object for any purpose, you take a look at it in the code and the following questions come to mind:

What are the invariants of this class? How can I make sure I don’t break them?
What can I do with this domain object? How are the various fields related to each other?

Since you don’t know the invariants, you’d have to do a bit of detective work and look through the codebase. Perhaps there’s validation logic scattered here and there, and so you duplicate this validation logic while mutating the state of the domain object. Then there’s the risk of making mistakes. Whether we like it or not, even though we may be aware of some invariant, we can still make a mistake and forget to validate the object’s state. And trust me, mistakes will happen.
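A small Python sketch of how such a mistake slips in, reusing the Character example from earlier in its anemic form. The heal service function is a hypothetical name of mine; what matters is that the rule lives in the service and not in the object.

```python
from dataclasses import dataclass


@dataclass
class Character:
    """Anemic version: public fields, no rules of its own."""
    current_health: int
    max_health: int


def heal(character: Character, amount: int) -> None:
    """Service function that knows the rule: never exceed max health."""
    character.current_health = min(
        character.max_health, character.current_health + amount
    )


hero = Character(current_health=40, max_health=100)
heal(hero, 30)  # goes through the service, so the invariant holds
via_service = hero.current_health

# Elsewhere, a caller unaware of heal() mutates the field directly.
hero.current_health += 80  # nothing stops this
invariant_broken = hero.current_health > hero.max_health
```

The second mutation is perfectly legal code, and the object now sits in a state the rest of the codebase assumes can never happen.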

You also don’t really know how the fields are related to each other. What is the intention of those data members? What are they supposed to hold? Is there any combination of values that is forbidden (which is governed by invariants)? What can I do with this object? And so you may look through the codebase to get an understanding of what you can do with it. Maybe you implement your own algorithm for performing some operation on the domain object, unaware that the exact same operation has already been implemented elsewhere. And so duplicate code is introduced. But how could you know? It’s nigh impossible to keep track of duplicate code scattered all over the codebase.

In short, anemic models increase the risk of introducing duplicate code and bugs while also lowering the transparency of what you can do with a domain object. Sure, you could create a convention for your team and tell everyone to place domain objects and the services operating on those in the same folder, and to never mutate the domain object’s state except through those services. Or, you know, just place the behavior in the domain object so there’s no need for any conventions that could potentially introduce additional overhead in the team.

In even shorter terms: it’s easy to misuse an anemic domain model.

The Problem with Rich Domain Models

Taken to its extreme, a rich domain model has its disadvantages as well. When you place ill-fitting behavior in a domain object, unnecessary coupling and complexity tend to follow. Sometimes a particular operation just doesn’t fit in a domain object. For example, should a domain object be able to answer the question of whether it exists on a remote server? Or whether it exists in persistence at all? Or what about an operation that involves several domain objects? Would you want a Book object to be coupled to a LibraryService?

The main reason to avoid placing behavior that involves several domain objects in any one object is that it will muddle the model (i.e., the model will lose its connection to the domain) and increase the domain object’s complexity, making it harder to maintain, understand and test. Also, all the domain objects involved in the operation will probably bring their own dependencies along with them, dependencies that aren’t interesting in every context where this particular domain object is used. A domain object should only contain behavior that is inherent to the single domain concept it represents (this is basically a re-wording of the single responsibility principle, SRP).
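As a sketch of where such cross-object behavior could live instead, consider lending a book to a library member. Everything here is hypothetical and only for illustration: the Member class, the LendingService name, and the loan limit of two are my own assumptions. Each object keeps the rules that concern only itself, while the service coordinates the two.

```python
class Book:
    """Knows only about itself: whether it is currently lent out."""

    def __init__(self, title: str) -> None:
        self.title = title
        self._on_loan = False

    @property
    def on_loan(self) -> bool:
        return self._on_loan

    def check_out(self) -> None:
        if self._on_loan:
            raise ValueError(f"{self.title!r} is already on loan")
        self._on_loan = True


class Member:
    """Knows only about itself: how many books it currently holds."""

    MAX_LOANS = 2  # hypothetical limit, chosen for the example

    def __init__(self, name: str) -> None:
        self.name = name
        self._loans = 0

    @property
    def loans(self) -> int:
        return self._loans

    def can_borrow(self) -> bool:
        return self._loans < self.MAX_LOANS

    def register_loan(self) -> None:
        if not self.can_borrow():
            raise ValueError(f"{self.name} has reached the loan limit")
        self._loans += 1


class LendingService:
    """Domain service: coordinates an operation that spans two objects."""

    def lend(self, book: Book, member: Member) -> None:
        # Check both preconditions before mutating either object,
        # so a failure never leaves one of them half-updated.
        if book.on_loan:
            raise ValueError(f"{book.title!r} is already on loan")
        if not member.can_borrow():
            raise ValueError(f"{member.name} has reached the loan limit")
        book.check_out()
        member.register_loan()


service = LendingService()
book = Book("Dune")
member = Member("Ada")
service.lend(book, member)
```

Book never learns about Member, and neither of them depends on the service, so each can still be used and tested on its own.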

So – which one is the best?

It’s important to note that a domain model isn’t either anemic or rich; rather, any domain model is more or less anemic or rich. It’s not a choice of going with one over the other. If you take either of them to its extreme, you may find that the domain model limits you rather than helps you (as is typical when rules are applied dogmatically). You should put behavior in a domain object if that behavior is inherent to that domain object. If you have behavior that doesn’t fit in any domain object, however, then you should use a domain service. And that is fine. The real trick is deciding whether to put something in a domain object or not; making the wrong choice can be costly, after all. But in general, if you have some service code that mutates only a single domain object, then that code could potentially be moved to the domain object itself.
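To illustrate that last rule of thumb, here is a hedged before-and-after sketch in Python. The subscription classes and the cancel operation are hypothetical names of my own; the "before" service function reads and writes exactly one object, which is the hint that it may belong on the object.

```python
from datetime import date
from typing import Optional


# Before: a service function that touches only a single object.
class AnemicSubscription:
    def __init__(self) -> None:
        self.active = True
        self.cancelled_on: Optional[date] = None


def cancel_subscription(sub: AnemicSubscription, today: date) -> None:
    if not sub.active:
        raise ValueError("subscription is already cancelled")
    sub.active = False
    sub.cancelled_on = today


# After: the same logic moved into the domain object itself.
class Subscription:
    def __init__(self) -> None:
        self._active = True
        self._cancelled_on: Optional[date] = None

    @property
    def active(self) -> bool:
        return self._active

    @property
    def cancelled_on(self) -> Optional[date]:
        return self._cancelled_on

    def cancel(self, today: date) -> None:
        if not self._active:
            raise ValueError("subscription is already cancelled")
        self._active = False
        self._cancelled_on = today


sub = Subscription()
sub.cancel(date(2024, 1, 15))
```

In the second version the two fields can only change together, through cancel, so an inactive subscription without a cancellation date can no longer be produced by accident.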