Saturday 8 March 2014

Book: Working Effectively with Legacy Code

Every professional developer have to deal with legacy code in the course of his career.

The book Working Effectively with Legacy Code written by Michael Feather is considered a must read and I really recommend it. This is a summary of the book.
Michael Feather definition of Legacy Code:
Legacy code is simply code without tests.
The goal of every competent software developer is to create designs that tolerate change! This is why it is critical to learn how to confidently make changes in any code base.

The main problem is to make changes while preserving existing behaviour.
Preserving existing behaviour is one of the largest challenges in software development.
The three main questions to ask are:
  1. What changes do we have to make?
  2. How will we know that we've done them correctly?
  3. How will we know that we haven't broken anything?
The main approach is to use tests as a way to detect changes.

But, when there are no tests, we end up in the:
Legacy Code Dilemma
When we change code, we should have tests in place. To put tests in place, we often have to change code.
Sometimes, it is possible to make changes without getting existing classes under tests:
  • Sprout Method: develop a new method and call it from the existing code.
  • Sprout Class: develop a new class and use it from the existing code.
  • Wrap Method: develop a new method that wrap an existing method.
  • Wrap Class: create a new class that wrap an existing class (the Decorator Pattern).
A technique like TDD is perfect in this context and only the new method or class will be tested.

However, when you need to touch existing code, it is necessary to do very conservative refactoring to get tests in place first.

The goal is to change as little code as possible to get tests in place.

The steps to follow are:
  1. Identify change points
  2. Break dependencies
  3. Write tests
  4. Make changes
  5. Refactor
Identify Change Points

Reading through code and playing with the system helps you to identify the change points. It often pays to draw pictures and make notes to improve your understanding.

Scratch Refactoring is a useful technique for learning. Grab the code and start applying refactoring without tests and at the end throw away all your changes. As a result, your knowledge about the code base will be increased.

Object-Oriented Reengineering Patterns is a recommended book for learning how to read and understand large code bases and build a big picture of a project. Once you identified the areas of the code that you need to change, it is important to answer the following question:

If I make some changes here, where I can see the effects?

Answering this questions is very important because it helps to identify which code you should be cover with tests in order to create a good safety net that gives you the confidence you need for subsequent changes.
It is important to know what can be affected by the changes we are making!
Effect Sketching is a useful technique that can be used to answer this question. The idea is to create a little graph that shows the effect of your changes. Navigation tools like ReSharper can help you identify all usages of a particular piece of code. However, if your class has a superclass or subclasses pay attention because there might be other clients that you haven't considered.



A Pinch Point is a narrowing in an effect sketch, a place where it is possible to write tests to cover a wide set of changes. If you can find a pinch point in a design, it can make your work a lot easier. It is a place where tests against a couple of methods can detect changes in many methods. Writing tests at pinch points is an ideal way to start some invasive work in part of a program.

The tests that we are going to write are called Characterization Tests. What the system does is more important than what it is supposed to do. You are not fixing bugs at this stage! The goal is to capture what the system does to increase your confidence of making changes.

How to break dependencies?

Adding tests is not always easy. Often, it is necessary to break dependencies.

The book contains a catalogue of dependency breaking techniques:
  • Extract Interface
  • Adapt Parameter: wrap a parameter behind an interface
  • Parametrized Constructor: create a new constructor that takes the dependency and update the old constructor to use it
  • Break Out Method Object: move a long method to a new class
  • Extract and Override Call: extract a call to a new method and override it in a testing subclass (ideal to break dependencies on global variables and static methods)
  • Extract and Override Factory Method: extract hard-coded initialization work from a constructor to a factory method and override it in a testing subclass.
  • Parametrized Method: if a method creates an object internally, pass the object from the outside
  • Pull Up Feature: you can pull up a cluster of methods into an abstract superclass and you can subclass it to create instances in your tests.
  • Push Down Dependency: make a class abstract and create a subclass that will be the new production class and push down all problematic dependencies into that class.
  • ...
For languages like C and C++ it is possible to use some specific techniques like:
  • Preprocessing seams: use ad-hoc macros to replace behaviour in tests
  • Link seams: override behaviour linking to a different implementation in a test module
  • Replace Function with Function Pointers: use pointers in tests to swap real implementation with fakes
  • Template Redefinition (C++ only): make a class a template and instantiate the template with a different type in the test file.
In using these techniques it is very important to try to preserve signatures whenever possible! It is always better to do this work using pair programming.
Working in legacy code is surgery, and doctors never operate alone.
Consider reading the book Pair Programming Illuminated. Refactoring

One of the most important principle to keep in mind is the Single Responsibility Principle. It is important to find responsibilities and extract classes when required. You can start applying it at the implementation level. It makes it easier to introduce it at interface level later.

Try to describe the responsibility of the class in a single sentence.
The best way to get better at finding responsibilities is to read more books about design patterns and in particular to read more code.
You can identify responsibilities using the following techniques:
  • Method Grouping
  • Hidden Methods: many private or protected methods indicates that there is another class
  • Coupling between variables and methods
  • Sketch (dependency graph between methods)
When you do refactoring, remember to do just one thing at a time.
Programming is the art of doing one thing at a time!
Ask your partner to challenge you constantly asking: What are you doing?
Remove duplication as much as possible! 
You end up with very small focused methods. The goal is to have orthogonality in the system. Your system is orthogonal when there is only one place you have to go to make a change. Removing duplication often reveals design.
Remember, code is your house, and you have to live in it.
Rename Class is the most powerful refactoring. It changes the way people see code and lets them notice possibilities that they might not have considered before.

Consider the Command/Query Separation Design Principle
A method should be a command or a query, but not both. A command is a method that can modify the state of the object but that doesn't return a value. A query is a method that returns a value but that does not modify the object.
Extract Method is a core technique for working with legacy code. You can use it to extract duplication, separate responsibilities, and break down long methods. It is also recommended to extract methods in a bottom up approach. Extract small pieces first!

Conclusion

It is true that working with legacy code can be frustrating at times. However, it is only a matter of perspective and approach. Working with legacy code can be fun and doing it using pair programming is a way to get a better result.

Surely, working with legacy code is a challenge and offers the opportunity to significantly improve your software developer skills. In any case, I totally agree with what Michael Feather say at the end of the book.

There isn't much that can replace working in a good environment with people you respect who know how to have fun at work

No comments:

Post a Comment

What you think about this post? I really appreciate your constructive feedback (positive and negative) and I am looking forward to start a discussion with you on this topic.