Modularization and semantic versioning

Carlos Morales • 2020-11-11

Practically every software company and public software library uses modularization and semantic versioning. In this post, I briefly describe why these concepts are so important and share some lessons learned while using them.

Modularization

According to Wikipedia: “Modular programming is a software design technique that emphasizes separating the functionality of a program into independent, interchangeable modules”. Instead of “modules”, it could have said “libraries”, “cloud services”, “frameworks”, etc.

This technique dates back to early software systems. That said, the concept has exploded in recent years: microservice architectures and cloud architectures rely heavily on independently developed modules that abstract the underlying complexity.

Modularization is one of the most important pillars of modern architecture.

Benefits of modularization

Modularization has many advantages. For me, these are the most important:

Code reuse: existing components can be composed into solutions for new problems, so the effort invested in one module pays off many times.
Single-responsibility principle: each module is responsible for solving one well-defined problem.
Maintainability: because modules tend to be smaller than monoliths, they are easier to understand and maintain.
Abstraction: a module can focus on a narrow, well-defined piece of functionality and hide its complexity from the applications that use it.
Less complex problems: a large problem is decomposed into smaller, simpler ones.
Split ownership: multiple teams can work together, each team focusing on its own module.

Disadvantages of modularization

Although modularization has proven to be very valuable, it has some disadvantages we need to pay attention to. While each module is less complex on its own, a system composed of independent modules immediately becomes more complex as a whole:

State: keeping a global state across multiple modules is extremely difficult.
Coordination: coordinating modules that are loosely coupled is intrinsically more complex than one big program that requires no coordination with other systems.
Testing: verifying that multiple modules work together requires more effort in integration tests.
Release cycles: independent modules have independent release versioning. Defining which versions are compatible requires an extra level of management.

I believe these trade-offs are acceptable. The question is not whether modularization is useful, but how far the splitting should go, as over-engineered systems also become unmanageable. If there is no reason to split, do not modularize until it becomes necessary. And when there are too many modules to manage, focus on orchestrating them.

Application Programming Interfaces

There is one common trait in the previous list of disadvantages: integration. Modular integration is only possible if each module has well-defined perimeters. These perimeters are commonly known as Application Programming Interfaces (APIs). Our goal is to build API-focused systems.

Good public APIs enable modular integrations.

The Bezos API mandate

Jeff Bezos wrote an email to all Amazon employees in 2002. It became known as the “Bezos API Mandate” or “Amazon API Mandate”. In this six-line memo, he required all teams to provide service interfaces, stating that “All service interfaces, without exception, must be designed from the ground up to be externalizable.” and that “Anyone who doesn’t do this will be fired.”

I believe that memo was the key factor behind the great success of Amazon Web Services (AWS). It set the right mentality: all the software built in-house was meant to be used by other teams, and at some point Amazon decided those other teams could be other companies. The rest is history: it gave a real boost to cloud computing, and since then almost every IT company has been moving its infrastructure to the cloud. Currently, AWS is a much more profitable division than the actual Amazon store.

Building software with a focus on the public interface was a great decision for Amazon. You may say: “OK, it worked for Amazon, but could it work for any other software company?” I believe it can. In fact, in my opinion, there is no other way.

APIs allow modularization

APIs provide a reliable contract, allowing teams to produce isolated modules that can be reused by other teams. Modules are a responsibility assignment, not a subprogram.

APIs allow an application to be broken down into smaller pieces, each focused on providing a single piece of functionality.
Well-defined APIs allow decoupling teams from one another, improving the focus and agility of the teams.
Decoupling modules enables better scaling, as they can be reused multiple times by multiple teams.
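
As a rough illustration of an API acting as a contract between teams, here is a minimal Python sketch. The names PriceService, InMemoryPriceService, DiscountedPriceService and cart_total are hypothetical and invented for this example. The consumer code depends only on the interface, so either implementation can be swapped in without touching it.

```python
from typing import Protocol


class PriceService(Protocol):
    """Hypothetical public API: the only thing consumers depend on."""

    def price_in_cents(self, product_id: str) -> int:
        ...


class InMemoryPriceService:
    """One implementation; its internals are free to change."""

    def __init__(self, prices: dict[str, int]) -> None:
        self._prices = prices

    def price_in_cents(self, product_id: str) -> int:
        return self._prices[product_id]


class DiscountedPriceService:
    """Another implementation, built by wrapping any PriceService."""

    def __init__(self, inner: PriceService, percent_off: int) -> None:
        self._inner = inner
        self._percent_off = percent_off

    def price_in_cents(self, product_id: str) -> int:
        full_price = self._inner.price_in_cents(product_id)
        return full_price * (100 - self._percent_off) // 100


def cart_total(service: PriceService, product_ids: list[str]) -> int:
    """Consumer code: written against the contract, not an implementation."""
    return sum(service.price_in_cents(pid) for pid in product_ids)
```

Because cart_total only knows about the contract, the team owning the price module can replace or extend its implementation without coordinating with every consumer.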

Semantic versioning

In systems with many dependencies, releasing new package versions can quickly become a nightmare. Semantic versioning addresses this problem: it helps consumers understand the potential impact of updating to a new version. It allows teams to update dependencies regularly, a key requirement for modern distributed systems with many dependencies.

As a very brief definition, semantic versioning divides a release version into three numbers, MAJOR.MINOR.PATCH, which are incremented according to how the public API changes:

MAJOR version when you make incompatible API changes
MINOR version when you add functionality in a backwards compatible manner
PATCH version when you make backwards compatible bug fixes
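
The three rules above can be sketched in a few lines of Python. This is a simplified illustration, not a full implementation of the specification: it assumes plain MAJOR.MINOR.PATCH strings and ignores pre-release and build metadata. The Version class and the change labels "breaking", "feature" and "fix" are names made up for this example.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int

    @classmethod
    def parse(cls, text: str) -> "Version":
        # Assumes exactly three dot-separated integers, e.g. "1.4.2".
        major, minor, patch = (int(part) for part in text.split("."))
        return cls(major, minor, patch)

    def bump(self, change: str) -> "Version":
        if change == "breaking":
            # Incompatible API change: MAJOR goes up, the rest resets.
            return Version(self.major + 1, 0, 0)
        if change == "feature":
            # Backwards compatible addition: MINOR goes up, PATCH resets.
            return Version(self.major, self.minor + 1, 0)
        if change == "fix":
            # Backwards compatible bug fix: only PATCH goes up.
            return Version(self.major, self.minor, self.patch + 1)
        raise ValueError(f"unknown change type: {change!r}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


current = Version.parse("1.4.2")
print(current.bump("fix"))       # 1.4.3
print(current.bump("feature"))   # 1.5.0
print(current.bump("breaking"))  # 2.0.0
```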

It establishes a communication channel between the creator of the module/library/… and the consumer. As a consumer of a module, the version tells you the impact of updating the dependency. As a module producer, it lets you express the changes made to the API. As a consequence, the producer tells consumers how much effort to expect when updating the dependency.
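
From the consumer's side, reading that signal can be as simple as comparing two version strings. The sketch below is an illustration under simplifying assumptions: versions are plain MAJOR.MINOR.PATCH strings with MAJOR of at least 1 (0.x releases make no stability promise), and the helper names parse and upgrade_impact are hypothetical.

```python
def parse(version: str) -> tuple[int, int, int]:
    # Assumes exactly three dot-separated integers, e.g. "2.3.1".
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def upgrade_impact(current: str, candidate: str) -> str:
    """Classify how much attention an upgrade deserves."""
    cur, new = parse(current), parse(candidate)
    if new <= cur:
        return "none"   # same version or a downgrade: nothing to upgrade
    if new[0] != cur[0]:
        return "major"  # breaking changes: read the migration guide
    if new[1] != cur[1]:
        return "minor"  # new features, existing usage keeps working
    return "patch"      # bug fixes only


print(upgrade_impact("2.3.1", "2.3.2"))  # patch
print(upgrade_impact("2.3.1", "2.4.0"))  # minor
print(upgrade_impact("2.3.1", "3.0.0"))  # major
```

A team could use a check like this to auto-apply patch and minor updates while routing major updates to a human.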

Semantic versioning became the de facto versioning method in open-source projects for a good reason.

Lessons learned when using semantic versioning

My team is responsible for building a set of UI components that many other teams use. Around five years ago, we decided to stabilize our APIs and introduce semantic versioning. It proved to be one of the best decisions we made.

These are some of the lessons I have learned over these years:

Always respect the contract: semantic versioning acts as a contract for the public API. If the contract is not reliable, it becomes useless.
Minimize breaking changes: semantic versioning revolves around breaking changes. As reliability is a key aspect, avoiding frequent breaking changes is the cornerstone of building good APIs.
Plan ahead: if you still need to modify the public API, give consumers enough time. Group all breaking changes and announce the date when they will be introduced. We tend to plan major releases every 6 months.
Deprecate first: mark the APIs affected by future breaking changes as deprecated, and give consumers enough time to adapt and potentially give feedback.
Good documentation: if consumers do not have a change log and/or migration guidelines, semantic versioning is incomplete. They must know what to do when doing a major upgrade.
Automate tests on public interfaces: introducing unnoticed breaking changes is quite catastrophic, as consumers can lose trust in the versioning. Verify with automated regression tests that the public API does not include unnoticed breaking changes.
More frequent and more regular upgrades: encourage your consumers to update regularly. This reduces the risk enormously, because each upgrade is smaller and problems are much easier to spot.
Help with migrating breaking changes: if you must introduce a breaking change, help your consumers as much as possible: propose alternatives, build tools to automate the process, share others' experience, etc.
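
To make the “deprecate first” lesson concrete, here is a minimal Python sketch using the standard warnings module. It is not our actual tooling: the deprecated decorator, the functions draw_label and render_label, and the version “3.0.0” are all hypothetical names chosen for illustration. The old function keeps working, but every call tells the consumer what to use instead and when the old API will disappear.

```python
import functools
import warnings


def deprecated(replacement: str, removed_in: str):
    """Mark a public function as deprecated without breaking its callers."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated and will be removed in "
                f"{removed_in}; use {replacement} instead.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller's line
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


def render_label(text: str, *, bold: bool = False) -> str:
    """Hypothetical new API."""
    return f"<b>{text}</b>" if bold else text


@deprecated(replacement="render_label", removed_in="3.0.0")
def draw_label(text: str) -> str:
    """Hypothetical old API, kept alive until the next major release."""
    return render_label(text)
```

Consumers can turn these warnings into errors in their own test suites (for example with warnings.simplefilter("error", DeprecationWarning)) to find every deprecated call before the major release arrives.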

Stability ensures that reusable components do not become obsolete unexpectedly.

Stability was essential for us and the ecosystem around our framework to thrive.
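
One possible way to automate the regression check on public interfaces is to snapshot the public API of a release and compare it with the next one. The sketch below uses Python's inspect module; the helpers public_api and breaking_changes are hypothetical, and the check is deliberately conservative: any signature change is flagged for review, even if it later turns out to be backwards compatible.

```python
import inspect


def public_api(namespace: dict) -> dict[str, str]:
    """Map each public function name to its signature, as text."""
    return {
        name: str(inspect.signature(obj))
        for name, obj in namespace.items()
        if inspect.isfunction(obj) and not name.startswith("_")
    }


def breaking_changes(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List removed or changed functions; additions are not breaking."""
    problems = []
    for name, signature in sorted(old.items()):
        if name not in new:
            problems.append(f"removed: {name}{signature}")
        elif new[name] != signature:
            problems.append(
                f"changed: {name}{signature} -> {name}{new[name]}"
            )
    return problems
```

In practice, the snapshot of the last release could be stored as a file in the repository, with public_api(vars(some_module)) computed in the test suite; a non-empty result from breaking_changes then fails the build unless the MAJOR version is being incremented.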

Conclusions

As distributed architectures become more complex, modularization is a key technique to manage and abstract that complexity.

If we manage the dependencies correctly, the modularization benefits will justify the extra complexity.

When creating reusable components, focus on healthy public APIs. My own experience confirms it.

If there is one takeaway from this article: focus on well-defined APIs and avoid introducing breaking changes as much as possible.