How bad is smelly code?
—Measuring the effect of code smells
By Aiko Yamashita
and Thomas Flemming
Running scientific experiments to figure out which code smells are worst for maintenance.
What to do with code that smells?
We have all been taught that our main responsibility as developers is not to write bad code. Unless you are a lone programmer writing a few lines of Perl that you will throw away later, the most important thing is that the code you write is easy to read and understand. Maintainable code will, through the life span of the software product, often save you and your colleagues many hours of pain and frustration in front of the computer.
However, it is not always so obvious what makes code unreadable to others, which is one reason why things like coding standards, guidelines, code patterns and programming language idioms exist. One well-known guideline is the book «Refactoring» by Martin Fowler, where he describes a list of «smells» in the code and refactoring strategies to remove them. Yet, even with this invaluable resource, we are left with the challenge not only of learning the many refactoring strategies, but also of determining which ones to prioritize. Surely, they cannot all be equally important! How do we know that the code refactoring we do now will pay off in the long run?
The problem is that Martin Fowler’s refactoring book does not say which code smells are critical and which are not. Fowler himself says there is no metric or indicator that can compete with human intuition. That leaves us only with our intuition and experience as developers when deciding whether to refactor something or not. This can be a nightmare. Where do you start in a system with thousands of code smells? Besides, any kind of change in the code can lead to undesired side effects. Even with good, automated tests, changes often involve high risks and can become very expensive. It would be nice to know for a fact which code smells are the most harmful and get rid of them first. Also, we would like to show management that we are not wasting our time on writing «beautiful code» for the sake of it, but that the effort we put in now will pay off later in the project.
Collecting facts about code smells
In 2009, the Norwegian Simula Research Laboratory needed to adapt an internal system to a new content management system. The project was outsourced to six Java developers, and it was seen as an opportunity to study closely what effect code smells have on maintainability.
During the three months the project lasted, tools were used to measure code smells, developers were interviewed every day, and each developer was instructed to install a logger in the Eclipse IDE. The logger made it possible to measure not only how much time was spent modifying each file, but also how much time was spent searching, browsing and scrolling through code.
During the project, the researchers used an issue tracker to register the problems developers faced and traced them back to the source files causing those problems. Additionally, all developers were interviewed. This yielded far more data than is normally available from just analyzing code repositories. Here is what they found…
Finding 1: God Classes are in fact bad!
«Everyone knows» that God Classes are bad, but to what extent? Look at this diagram:
The Y-axis shows the amount of time spent reading and/or modifying a file during the maintenance project, and the X-axis shows the size of the files.
Note how the largest file (1844 lines of code – LOC) took 10 times longer to read and modify than the great majority of files with fewer than 600 LOC. This file constituted a God Class. Also, note that 20,000 seconds is roughly 5 hours of work (that is many hours for one file!). We can also see a large file (roughly 1400 LOC) that did not take much time to edit. This latter file contained a large number of getters and setters, and no logic at all (as opposed to God Classes), which explains why developers did not spend much time on it. Large files with complex logic (i.e., God Classes) may strongly impact maintainability.
Recommendation: Split large files with complex logic into smaller ones. A recommended threshold is to keep files under 1000 lines of code.
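The typical way to shrink such a file is an Extract Class refactoring: move one responsibility out of the big class into a focused collaborator. Here is a minimal, hypothetical Java sketch (the `Order` and `PricingPolicy` names are invented for illustration, not taken from the study):

```java
// Hypothetical sketch of an Extract Class refactoring. Before the split,
// discount rules lived inside one large class alongside validation,
// persistence and formatting code; now they have a class of their own.
class PricingPolicy {
    // Focused class with a single reason to change.
    // Prices are in integer cents to avoid floating-point rounding.
    long discountedTotalCents(long unitPriceCents, int quantity) {
        long total = unitPriceCents * quantity;
        return quantity >= 10 ? total * 90 / 100 : total; // 10% bulk discount
    }
}

class Order {
    private final long unitPriceCents;
    private final int quantity;
    private final PricingPolicy pricing = new PricingPolicy();

    Order(long unitPriceCents, int quantity) {
        this.unitPriceCents = unitPriceCents;
        this.quantity = quantity;
    }

    long totalCents() {
        // The former «god class» now delegates instead of doing everything.
        return pricing.discountedTotalCents(unitPriceCents, quantity);
    }

    public static void main(String[] args) {
        System.out.println(new Order(500, 10).totalCents()); // prints 4500
    }
}
```

Each extracted class is smaller, easier to read, and can be tested in isolation — exactly the properties the large files in the study lacked.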
Finding 2: Data Clumps are not as bad as you may think…
«Data Clumps» are collections of semantically unrelated methods and variables. Normally, files that contain data clumps contain different types of variables, often followed by a series of setters and getters. For example, the class Person in the following diagram
contains information that does not directly concern a person, and can thus be split into two classes. The researchers at Simula created a statistical model to explain whether the appearance of code smells increases the probability of developers facing problems during maintenance (remember that they recorded all the problems faced during maintenance and which files were causing them). They found that files containing data clumps in fact had lower chances of being involved in a problem during maintenance!
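Since the diagram is not reproduced here, a rough Java sketch of the idea (the field names are invented for illustration): the address fields travel together and could be extracted into their own class — even though, per the finding above, doing so is rarely urgent.

```java
// Before: Person mixes personal data with address data (invented fields).
class PersonBefore {
    String name;
    String email;
    String street;   // these three fields always travel together…
    String city;
    String zipCode;
}

// After: the clump is extracted into its own class.
class Address {
    final String street;
    final String city;
    final String zipCode;

    Address(String street, String city, String zipCode) {
        this.street = street;
        this.city = city;
        this.zipCode = zipCode;
    }

    String oneLine() {
        return street + ", " + zipCode + " " + city;
    }
}

class Person {
    final String name;
    final Address address;

    Person(String name, Address address) {
        this.name = name;
        this.address = address;
    }

    public static void main(String[] args) {
        Person p = new Person("Kari", new Address("Storgata 1", "Oslo", "0155"));
        System.out.println(p.address.oneLine());
    }
}
```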
Recommendation: Leave Data Clumps alone, unless they contain other code smells.
Finding 3: Stick to the Interface Segregation Principle
The ISP states that no client should be forced to depend on methods it does not use. Applying the ISP means splitting very large interfaces into smaller, more specific ones, so that each client only has to know about the methods that are of interest to it.
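A minimal Java illustration of the principle (the `Machine`/`Printer`/`Scanner` names are the classic textbook example, not from the study):

```java
// A wide interface forces every client to depend on methods it never calls.
interface Machine {
    String print(String doc);
    String scan(String doc);
    String fax(String doc);
}

// After applying the ISP: one small interface per client need.
interface Printer {
    String print(String doc);
}

interface Scanner {
    String scan(String doc);
}

// A print-only client now implements (and depends on) print() alone.
class SimplePrinter implements Printer {
    public String print(String doc) {
        return "printed: " + doc;
    }

    public static void main(String[] args) {
        System.out.println(new SimplePrinter().print("report"));
    }
}
```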
Researchers in software engineering have come up with suggestions on how to use code metrics to identify some code smells. Some commercial tools implement them, but you could also come up with your own detection strategies and implement them in tools such as SonarQube for Java or NDepend for .NET. The detection strategy used by Simula is illustrated in the following diagram, and is based on the work by the researcher Radu Marinescu.
This class has a «Class Interface Width» (i.e. the number of public methods/attributes) of 10, «Clients of Class» is 8, and an «Average use of Interface» (i.e. the number of methods/attributes actually used by its clients divided by the number of clients) of 0.375. According to this detection strategy, this class probably violates the Interface Segregation Principle.
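As a sketch, the detection strategy can be expressed as a simple predicate over the three metrics. The threshold values below are illustrative guesses, not the exact ones published by Marinescu:

```java
// Sketch of the detection strategy: flag a class as a likely ISP
// violation when its interface is wide, it has many clients, and each
// client uses only a small fraction of the interface.
// The thresholds here are illustrative, not the published ones.
class IspDetector {
    static boolean violatesIsp(int classInterfaceWidth,
                               int clientsOfClass,
                               double averageUseOfInterface) {
        return classInterfaceWidth >= 10
                && clientsOfClass >= 5
                && averageUseOfInterface < 0.5;
    }

    public static void main(String[] args) {
        // The class from the example: CIW = 10, CC = 8, AUF = 0.375.
        System.out.println(violatesIsp(10, 8, 0.375)); // prints true
    }
}
```

A custom rule along these lines could be plugged into tools such as SonarQube or NDepend, as mentioned above.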
The same analysis performed for data clumps was repeated for files violating the ISP. This time, the odds of a file being problematic increased (with statistical significance) when the file violated the principle.
This can also be seen in the following diagram, which shows that the proportion of problematic files (approximately 30%) is higher in files violating the ISP than in files that do not violate this principle.
Recommendation: Splitting wide, multi-purpose interfaces can reduce the risk of maintenance problems.
Finding 4: Code smells «team up»
In the same experiment, certain code smells showed a tendency to appear together in the same file. The diagram below shows the code smells identified that often come together in the same file. The «hoarders» are essentially those files that want to hoard all the functionality of the system. The «confounders» are smells that appear in files that are often confusing. The other two groups, «wide interfaces» and «data containers», are hopefully self-explanatory.
Recommendation: If you find a certain type of smell in a file, you may also want to check whether it contains any of its «companions». A combination of code smells in a file increases risk and reduces maintainability!
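In practice, such a companion check can be as simple as a lookup table. The smell names and groupings in this sketch are invented placeholders — the actual co-occurrence groups are the ones shown in the diagram above:

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: when one smell is found in a file, look up which
// «companion» smells to check for next. The smell names and groupings
// here are illustrative placeholders, not the exact sets from the study.
class SmellCompanions {
    static final Map<String, List<String>> COMPANIONS = Map.of(
            "GodClass", List.of("FeatureEnvy", "DuplicatedCode"),
            "WideInterface", List.of("ShotgunSurgery"),
            "DataClass", List.of("DataClump"));

    static List<String> companionsOf(String smell) {
        // Unknown smells simply have no known companions.
        return COMPANIONS.getOrDefault(smell, List.of());
    }

    public static void main(String[] args) {
        System.out.println(companionsOf("GodClass"));
    }
}
```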
As Fowler says, we should follow our instinct, experience and judgment when deciding what to refactor, but here are some guidelines based on scientific studies that can help you prioritize:
- Split large files (>1000 lines of code) containing extensive logic
- It is not critical to refactor data clumps
- Split wide, multipurpose interfaces into separate interfaces for each purpose
- Files containing certain smells may also come with more «unwanted buddies», so keep an eye on them!