Big vs. Small Methods

A common claim is that whether methods should be small or big is a matter of taste. I’d say that’s not entirely true. In fact, depending on how you measure maintainability, it’s not very true at all. This is a question about modularity. Whether a method should be big or small depends on context, but it also depends on the costs and benefits of choosing one over the other, and those should not be judged subjectively.

How big is enough?

The size of a method shouldn’t be chosen based on its line count in the first place. It should be chosen based on the number of things the method does, or rather the number of concepts it represents. Unfortunately, that still has to be decided by the programmer’s own judgment, since every situation is different. Some questions programmers could ask themselves when deciding whether to make a method smaller are:

  • Reusability: Could this be reused? Would extracting it make the reusable segment easier to discover?
  • Readability: Does this make the intent of the code clearer?
  • Testability: Could I benefit from testing this code in isolation?

Reusability

If anything is incredibly valuable in software, it’s reusability. Monolithic structures, such as a big method that does many things, are usually made up of intertwined activities that could very well be done in isolation from one another. However, monolithic structures hinder us from doing this in two ways. For one, it’s hard to discover the reusable logic. For another, once it’s discovered, it’s usually not easy to reuse anyway, since it more often than not brings along many other dependencies: both external dependencies and the surrounding code. In those cases, programmers will usually just copy the logic and paste it elsewhere, potentially introducing duplicate code. I say potentially, because the copied segment won’t necessarily represent the same knowledge as the original one.

Break down monolithic structures into their conceptual components and make them easy to find, so the risk of introducing duplicate code is reduced. Of course, this isn’t going to remove duplicate code completely; communication among team members is still vital.
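To make this a bit more concrete, here is a minimal sketch in Python. Everything in it (the function names, the registration scenario, the very rough email check) is made up for illustration. The point is only that once the normalization step has a name of its own, a second caller can find and reuse it instead of copying it out of a bigger method.

```python
def normalize_email(raw: str) -> str:
    """Trim surrounding whitespace and lowercase an email address."""
    return raw.strip().lower()


def is_valid_email(email: str) -> bool:
    """Very rough check (illustration only): exactly one '@' with text on both sides."""
    local, sep, domain = email.partition("@")
    return bool(local) and bool(sep) and bool(domain) and "@" not in domain


def register_user(raw_email: str, users: set[str]) -> bool:
    """Add the user and return True; return False if the email is invalid or taken."""
    email = normalize_email(raw_email)
    if not is_valid_email(email) or email in users:
        return False
    users.add(email)
    return True


def subscribe_to_newsletter(raw_email: str, subscribers: list[str]) -> None:
    """A second, unrelated caller that reuses the same normalization step."""
    subscribers.append(normalize_email(raw_email))
```

Had the stripping and lowercasing stayed buried in the middle of a long registration method, the newsletter code would most likely have ended up with its own, slightly different copy.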

Testability

Big methods can be hard to test. It can also be hard to locate a bug in the method should a test (or worse, several tests) for it fail. Think about it: if you had a method that was 100 lines long and did many different things, and a test for it suddenly failed, would you be able to quickly locate where in the method it failed? It’d probably be easier if all of the concepts in this big method were abstracted away into smaller methods and then tested in isolation from one another. That way you could decrease the time spent locating the recently introduced bug.

Think of the big method as an assembly line where you put something at the start of it, and at the end of the assembly line you notice the result is completely wrong. Where did it go wrong, exactly? It’d be better if you could be notified as soon as something goes wrong along the assembly line. If you change how one part of the assembly line works and the corresponding test for that particular part fails, you will most likely know where to look.
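As a rough sketch of the assembly line idea, assume a made-up order summary that parses quantities, computes a total in cents and formats it. All names here are hypothetical, and the amounts are assumed to be non-negative. Each station is its own function with its own test, so a failing test points at one station rather than at the whole line.

```python
def parse_quantities(line: str) -> list[int]:
    """Station 1: turn a string like '2, 3, 5' into [2, 3, 5]."""
    return [int(part) for part in line.split(",") if part.strip()]


def total_price(quantities: list[int], unit_price_cents: int) -> int:
    """Station 2: total price in cents for all quantities."""
    return sum(quantities) * unit_price_cents


def format_cents(cents: int) -> str:
    """Station 3: render a non-negative amount of cents as a decimal string."""
    return f"{cents // 100}.{cents % 100:02d}"


def order_summary(line: str, unit_price_cents: int) -> str:
    """The whole assembly line: parse, total, format."""
    return format_cents(total_price(parse_quantities(line), unit_price_cents))


# One small test per station. If only test_format_cents fails,
# there is no need to study the parsing or the arithmetic.
def test_parse_quantities():
    assert parse_quantities("2, 3, 5") == [2, 3, 5]


def test_total_price():
    assert total_price([2, 3, 5], 150) == 1500


def test_format_cents():
    assert format_cents(1500) == "15.00"
    assert format_cents(5) == "0.05"
```

A test of order_summary alone would only tell you that the end of the line produced the wrong result, not which station caused it.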

Readability

This is probably the most subjective part to discuss. A lot of people I’ve met in real life and on the internet say that it’s easier for them to navigate a big method than several small ones, since otherwise you need to jump around a lot in the code. I think that depends on how you’ve chosen your abstractions. If you find yourself frequently alternating between 3 files when reading a particular part of the code, for example, maybe those parts should’ve been together somehow. Sure, sometimes it means a lot more jumps through the code, but you should appreciate that it lets you choose which parts of the code you want to read at all, something big methods don’t really do. With a big method, you have to study the whole thing to get an idea of what is happening. Maybe you realize after reading it that it wasn’t relevant to you at all, and so you continue your search for whatever it might be: a bug, or the place where a new feature needs to go.
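Here is a small sketch of what “choosing which parts to read” can look like. The shipping rules, the country list and all the names are invented for the example. Someone hunting for a gift wrap bug can read shipping_cost_cents, see that free shipping has nothing to do with it, and jump straight to gift_wrap_fee_cents.

```python
from dataclasses import dataclass

# Hypothetical business rules, invented for this example.
FREE_SHIPPING_COUNTRIES = {"SE", "NO", "DK"}
FREE_SHIPPING_THRESHOLD_CENTS = 50_000
BASE_SHIPPING_CENTS = 4_900
GIFT_WRAP_CENTS = 1_500


@dataclass
class Order:
    total_cents: int
    country: str
    is_gift: bool


def qualifies_for_free_shipping(order: Order) -> bool:
    """The eligibility rule gets a name, so callers don't have to decode the condition."""
    return (
        order.total_cents >= FREE_SHIPPING_THRESHOLD_CENTS
        and order.country in FREE_SHIPPING_COUNTRIES
    )


def gift_wrap_fee_cents(order: Order) -> int:
    """Extra fee charged only for gift orders."""
    return GIFT_WRAP_CENTS if order.is_gift else 0


def shipping_cost_cents(order: Order) -> int:
    """Reads as a summary; the details live one jump away, and only if you need them."""
    if qualifies_for_free_shipping(order):
        return 0
    return BASE_SHIPPING_CENTS + gift_wrap_fee_cents(order)
```

Yes, that’s two extra jumps compared to one inlined block, but each jump is optional.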

Another common argument for big methods is that it’s easier to read the code because it’s all in one place. Maybe that works for big methods you’ve written yourself, because you’re familiar with them, but what about those you haven’t written? How easy is it for you to understand the intent of the method? How easy is it for you to understand the consequences of the changes you make to it? How easy is it for you to find the place you need to change? I think this argument often comes from people who’ve worked with spaghetti code for years and gotten used to it: “It’s fine enough, I don’t have any problem working with it.” The problem is, just because you don’t see the problem, it doesn’t mean it’s not there. Once the problem becomes visible enough, it’s clear how much time can be wasted on code that isn’t modular (or readable, for that matter).

Personally, I am willing to sacrifice having all code in one place for more readable, testable and reusable code.

Modularity

Another problem with big methods is that they’re usually hard to change. It’s also hard to reuse parts of them. A common argument I’ve heard against making things modular is: “Well, it doesn’t need to be reused right now, so why make it more modular now? Let’s just do it later.”

My response to this is twofold. For one, if you have a big method and keep making changes to it, those changes will probably follow the current monolithic structure and reinforce it, making it even harder to change later. Attempting to extract parts of the method at that point could result in bringing half of the codebase along with it. No can do.

Second, it’s better to make the code modular from the get-go and continuously maintain the modular structure. After all, a modular structure is easier to reshape than a monolithic one. Of course, you should consider the cost of making something modular; whether it’s worth it depends on the complexity added and the benefit you get from it. Chopping a big method that does many different things into smaller methods is rarely a bad idea, for the reasons I’ve discussed in this post. It’s better to structure your code so that it’s made up of smaller building blocks rather than a giant monolithic stone, because then you can easily change the shape of your code and write valuable tests for it.

If your code is modular, there is a chance that your modules can be combined in ways you never intended, in order to fulfill new requirements in the future. And that alone should be reason enough to make code modular, since we all know that requirements are constantly changing and are hard to predict.
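A small sketch of what that can look like, with invented names and a made-up stopword list: three little text helpers were written for one feature (top_keywords), and a later requirement (shared_vocabulary) turns out to need only one of them, combined with something new.

```python
from collections import Counter

# Made-up stopword list for the example.
STOPWORDS = {"the", "and", "a"}


def words(text: str) -> list[str]:
    """Split text into lowercase words on whitespace."""
    return text.lower().split()


def without_stopwords(ws: list[str]) -> list[str]:
    """Drop words that carry little meaning on their own."""
    return [w for w in ws if w not in STOPWORDS]


def most_common(ws: list[str], n: int) -> list[str]:
    """Return the n most frequent words, most frequent first."""
    return [word for word, _count in Counter(ws).most_common(n)]


def top_keywords(text: str, n: int) -> list[str]:
    """The original requirement: combines all three building blocks."""
    return most_common(without_stopwords(words(text)), n)


def shared_vocabulary(a: str, b: str) -> set[str]:
    """A later requirement nobody planned for: reuses only words()."""
    return set(words(a)) & set(words(b))
```

If top_keywords had been written as one block that split, filtered and counted inline, the second feature would have had to start by digging the splitting logic back out.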

Modular code can also help make it easier for us to see the forest for the trees when we study a particular part of the code base, or the code base in its entirety. However, modular code alone cannot do this; the modules have to work together to reflect the problem domain as well.

Summary

Not all methods that are considered “big” are bad. They are, however, a code smell. A code smell isn’t necessarily something bad; it’s just something in the code that smells weird and causes you to raise an eyebrow, but it might just as well be a high-quality fish soup hiding in there. An example of an acceptable big method is one that just contains a switch statement mapping one value to another. You shouldn’t decide whether a method is too big based on its size; decide based on the number of things it does.
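For instance, a method like the following is long, but it still only does one thing. This is just an illustrative sketch: the function name is made up, and a plain if/elif chain stands in for the switch statement. The status codes and phrases are the standard HTTP ones.

```python
def reason_phrase(status: int) -> str:
    """Map an HTTP status code to its standard reason phrase."""
    if status == 200:
        return "OK"
    elif status == 201:
        return "Created"
    elif status == 204:
        return "No Content"
    elif status == 301:
        return "Moved Permanently"
    elif status == 304:
        return "Not Modified"
    elif status == 400:
        return "Bad Request"
    elif status == 401:
        return "Unauthorized"
    elif status == 403:
        return "Forbidden"
    elif status == 404:
        return "Not Found"
    elif status == 500:
        return "Internal Server Error"
    elif status == 503:
        return "Service Unavailable"
    else:
        # Fallback chosen for this sketch; codes not listed above end up here.
        return "Unknown"
```

Adding twenty more status codes would make it longer, but not harder to read, test or change, because it would still represent a single concept.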

Remember, even if you prefer big methods because they “have all of the code in one place”, you should consider their reusability and testability as well. Is the cost of not having everything in one place all that bad when you weigh it against the benefits of readability, testability and reusability? Every situation is different and it’s important to consider the quality of the code from various perspectives.

An important thing to note: the concepts I’ve been talking about in this post (reusability, testability, readability) do not only apply at a low level (methods), but also at higher levels (classes, libraries and systems). The idea remains the same.