SOLID Part Two. The Open Closed Principle and Liskov Substitution Principle.
Open Closed Principle
The open-closed principle means that a software artifact should be open to extension and closed to modification. The meaning of this principle is that software artifacts should be easily extended by adding new code and without modifying existing code. By software artifact I mean anything from a function, a class to the entire architecture of a software application. The idea behind the OCP is to guard software against changes in requirements. If a small change in requirements forces large changes on our codebase then we have a problem. The first is that changing a lot of code requires a lot of work. It’s just harder to change existing code than to write new code. As well as being harder to change existing code, it is much harder to review changed code than new code. This is partly because a reviewer needs to make sure the previous intended behaviours still hold true with the changes as well as making sure the new behaviour work as intended.
So how do we make our architecture open to modifiction but closed to change? Firstly we organise our software into modules according to their responsibilities, using the single responsibility principle which was discussed in my previous post. We do this in order to make sure that functions and data that need to change for different reasons are grouped together and therefore code is protected from changes that is not relevant to it. Secondly, we organise our dependencies so that the lowest level modules depend on the higher-level modules. This means the higher-level rules don’t need to change at all when the requirements for the lower level details change. I’ll go through an example using this methodology.
Imagine we have a simple system which receives some sort of event, processes it, enriches it with some other data and then stores it. Applying the open-closed principle we may come up with an architecture like so:
type EnrichmentRequest struct { ... }
type EnrichmentResponse struct { ... }
type Enrichment interface {
RequestEnrichment(req EnrichmentRequest) (EnrichmentResponse, error)
}
type Store interface {
StoreEvent(e Event) error
}
type Event struct { ... }
type Processor interface {
ProcessEvent(e Event) error
}
type EventProcessor struct {
enricher Enrichment
store Store
}
// Implements the Processor interface using the enricher and store
func (eventProcessor EventProcessor) ProcessEvent(e Event) error { ... }
type HttpServer struct {
processor Processor
}
func (s *HttpServer) handleNewEvent() {
//Receive and extract event from request
...
err := s.processor.ProcessEvent(event)
//Return correct status code to client
...
}
In this code snippet we have two modules, the first is the EventProcessor
and the second is the HttpServer
. In a real application there would be separate modules for the implementation of the Store
and the EnrichmentProcessor
but their implementation is irrelevant here so we don’t need to show them. And the fact that we don’t need to show them in order for this piece of code to make sense demonstrates the value of the open-closed principle. The highest level logic of the application is contained within the EventProcessor
and it doesn’t care about how data is stored or how events come into the application. The application could have a gRPC server added alongside or to replace the HTTP server and the EventProcessor
shouldn’t need to have a single line of code changed. The application should be able to have a simple memory-based Store
for prototyping which can be easily replaced with a Store
which uses a database when it’s needed without the EventProcessor
needing to have a single line changed.
The key to achieving this is to make sure the direction of source code dependencies points from the lowest level to the highest level modules. If you imagine the architecture of the application as an onion with the highest level business rules as the centre and the surrounding layer being the application-specific business rules. Then the outter layer is the lower-level components that deal with protocols and implementation details. In order to achieve the desired properties of the open-closed principle the source code dependencies must point inwards through the onion. The source code dependencies shouldn’t point outwards and ideally one lower-level module should depend on only one higher-level module.
This is how the open-closed principle works are the architectural level. We separate the code into modules based on their responsibilities. We have made sure to protect the higher-level module from changes in the lower level modules by making sure that our source code dependencies point only towards the higher-level module. Thus we have made this system easy to extend while requiring minimal changes.
Liskov Substitution Principle
The name for this principle comes from Professor Barbara Liskov of MIT who first introduced the principle in 1987. The Liskov Substitution Principle means that if a class U
is a subtype of T
then objects of type T
may be replaced by objects of type U
without changing the desirable properties of the program. The most desirable property of a program is correctness, usually. Liskov is about guiding the use of inheritance. A simple rule of thumb to remember is that if type U
inherits from type T
then U
should have an is a relationship with T
. If U
is not a T
then it shouldn’t be inheriting from it. This is because U
could have a different behaviour that results in the program becoming incorrect. Here’s a quick example that demonstrates this.
// Abstract class Shape with one method Area()
type Rectangle interface {
Area() float32;
SetHeight(height float32);
SetWidth(width float32);
Width() float32;
Height() float32;
}
type Square struct {
side float32
}
func(s *Square) SetSide(side float32) {
s.side = side;
}
func(s *Square) Area() float32 {
return s.side * s.side;
}
func(s *Square) SetHeight(h float32) {
s.side = h;
}
func(s *Square) void setWidth(float w) override {
d_side = w;
}
func(s *Square) Width() {
return s.side;
}
func(s *Square) Height() {
return s.side;
}
Now imagine somewhere in your program you had some code like this:
func MultiplyArea(r Rectangle, multiple float32) {
r.SetWidth(r.Width() * multiple);
}
If the rectangle passed into this function is actually a square then the program will be incorrect. If it’s a square with sides of length 2, it has an area of 4. If we want to increase the area by 10% then we can make a call like MultiplyArea(s, 1.1)
. If the given rectangle is a square then each of it’s sides will be set to 2.2 and the area will become 4.84 instead of 4.4. This means the program is incorrect.
In order to correct the program a developer may start to introduce checks for what type of rectangle it is. Meaning that every time an operation is carried out on a rectangle there will be two different paths. This will start to infect the codebase, adding extra bloat. Eventually, someone may write a function which forgets about the subtle difference between a square and other rectangles and introduce a bug.
The rectangle/square problem is the example that is normally brought up when discussing the Liskov Substitution Principle. The reason is because a square in mathematics is a rectangle. However when deciding whether a class can inherit from another the is a relationship depends on the use of the class. The usage in our example for a rectangle is as follows:
- an object with a height and width
- the height and width can be independently set
- has an area Clearly a square can’t have it’s height and width independently set so a square in this instance is not a rectangle.
So how do we fix this situation? It depends what we want. If we only need to use four-sided shapes then we can simply have something like this.
type interface FourSidedShape {
Area() float32
GetHeight() float32
GetWidth() float32
}
//Then implement both rectangle and square
As well as applying to class design, Liskov also applies at the architectural level. For example if we have a set of services that are meant to conform to a RESTful API but one differs slightly from the others then we will get the same problems as before. The key idea is that we have any sort of interface and multiple software artifacts implementing this interface then they should be able to be substituted for each other.