Domain Pollution

Apr 2018 by Neil McKinnon

Understanding how software pollution inhibits progress

Domain Pollution is often the result of poor software development practices, where one or more domains assume another’s responsibility. It’s relatively common in monolithic applications, and exposes significant system failings. The figure below shows an example.

Example of Domain Pollution

This system has five distinct domains, D1 to D5. Note that all domains use D5’s data (I have only represented D5’s dataset, but it’s likely the other domains manage their own datasets too).

Note — Single Responsibility Principle

Single Responsibility states that:
"... every module or class should have responsibility over a single part of the functionality provided by the software, and that responsibility should be entirely encapsulated by the class. All its services should be narrowly aligned with that responsibility."

This is not the case here. Based upon Single Responsibility, D5 should only interact with D5’s data. However, D1, D2, D3, and D4 have also assumed a responsibility of D5.

Note — Hoodwinked
The term hoodwinked (“to trick, deceive, or mislead”) is a good definition for what’s happening here. It describes the confusion that D5 (and eventually an entire system) suffers when change is required, but is prevented or hampered.

This is one form of domain pollution. It hampers D5’s evolution, causing innovation, security, and scaling challenges. Let’s look at a real-world example.

Real-World Example

The figure below represents a real-world example. It shows a monolithic system, interacting with a (relational) database schema for data storage.

A real-world example of domain pollution

The system encapsulates Customer, Carts, Reporting, Entitlements, and Payments domains. In this case, the database is an RDBMS (relational). Due to a constrictive licensing model (scale vs. cost), this database may only (feasibly) reside on a single instance; i.e. no horizontal scaling.

Note how the domains all access the CARTS data directly. Typically, this occurs when one domain embeds an SQL statement that joins across domain boundaries, thus tying the two domains together. The listing below shows an example of this practice (embedded in the Customers domain), linking the CUSTOMERS and CARTS tables.

select * from CUSTOMERS c join CARTS ca on (c.id = ca.customer_id)
where ca.id = :cartId; -- cartId is bound at runtime

Now, let’s assume (as is often the case) that this is a popular approach and it is repeated system-wide (e.g. Payments joins to Carts, Entitlements joins to Customers, etc.).

Seems harmless, doesn’t it?

But what’s happened is that a usage assumption has been made (i.e. coupling). One domain has assumed that (a) the other domain is always accessible and available, (b) the other domain’s data will always reside in the same partition as its own, and (c) the same technology/vendor will be used for both domains.

Now consider this scenario. A new client signs up. They have significant scaling needs, mainly around cart management. To further exacerbate matters, the client also needs to store additional (proprietary) properties (e.g. location-specific tax information so they can run reports on it) when an item is added to the shopping cart.

We now face two problems:

We must make changes to scale the carts solution (scalability), and,
We must allow proprietary data to be added to the cart (evolvability).

Scalability

"Just join domainA and domainB’s tables, and you’ve saved yourself a database trip."

Beware! This argument suggests that you’ll improve scalability, since only a single database interaction is required (database interactions are generally expensive).

This is both true and false. You probably will improve scale… BUT ONLY TO A POINT. This advice advocates a vertical scalability strategy over a horizontal one (the stronger of the two), and so opts for the weaker alternative.

Note — Cachable Joins

There’s a second argument to using table joins; this time across similar domains.

Consider a payments application. It may consist of several domains, including:

Payment Methods — what the Customer uses to purchase something (e.g. VISA Card 1234 5678 9012 3456, or PayPal account abcdef123)

Payment Providers — the type of payment providers selectable by a Customer (e.g. PAYPAL, VISA).

The following SQL might seem acceptable (since they’re in similar domains):

select * from PAYMENT_METHODS pm
join PAYMENT_PROVIDERS pp on (pm.payment_provider_id = pp.id)
where pm.id = :paymentMethodId;

But, it’s unnecessary.

There are two different types of data at play here:

The customer-specific, and changeable, payment methods data, used to provide the customer with a service, and,

The reusable, relatively static payment provider data, used to configure how the system functions.

There’s no need to fetch, or join to PAYMENT_PROVIDERS. It’s static data that can be stored and accessed in-memory, saving a database join or retrieval.

Evolvability

Evolvability is a key, oft-forgotten, architectural quality. It indicates the ease with which a system (or part of a system) may evolve.

Typically, evolvability indicates the ability to modernize (e.g. replace a costly relational database with an open-source alternative) part of a system.

Returning to our real-world example, the need to support proprietary data (in the Cart domain) strongly suggests the replacement of the relational model with a more fluid (e.g. NoSQL) technology. Yet the domain pollution severely hampers us. The SQL joins have:

Tightly coupled the domains to a database type (e.g. relational).
Tightly coupled the domains to a specific vendor (e.g. Oracle, Postgres).
Made an assumption about where the data resides (i.e. on the same instance/machine).

If we are to make the change, not only must we identify every area of change, we must also:

Collaborate with all the domain experts to understand (and contain) the blast radius.
Plan the capacity for the change.
Fix/replace broken unit tests.
Restructure (potentially) large swathes of code in the affected areas (e.g. passing an id through a set of layers to the persistence mechanism).
Recompile and redeploy the domain.
Regression test a large area of the system, and any of those areas’ dependents.
Load test it.

Note — Change Friction
I call this problem “Change Friction”. Change becomes increasingly difficult, until the effort becomes greater than the benefit. This leads to dissatisfaction, poor morale, an unwillingness to change, and poor stakeholder confidence and engagement (i.e. the death of a product).

Business Agility

Much of what’s come before can impact Business Agility and Brand Reputation. In the example described earlier, domain pollution caused Change Friction.

I’ve seen cases where this friction is only resolvable by an individual (a “Brent” in The Phoenix Project), and that individual is unavailable for the next two months. This is a terrible situation. It affects your Agility and (potentially) Brand Reputation, and can leave your clients in a precarious position.

The example I used earlier could occur, and the business may be unable to support its desired level of agility. Thus, the business faces a dilemma: does it let the client down and lose their custom (possibly suffering reputational damage), or does it choose to hack further (proprietary) changes into the solution and exacerbate evolutionary issues?

Forms of Domain Pollution

Domain Pollution tends to be caused by the introduction of incorrect assumptions in software. Several forms of domain pollution exist.

Assuming Another’s Responsibility

One domain embeds a responsibility belonging elsewhere. For instance, suppose we need to email a customer a notification after a successful registration. Should these notifications be constructed in the Customers domain (e.g. get the email address, the email template, then inject dynamic content into that template), or the Notifications domain? Notifications, of course!

Yet, I often see this anti-pattern applied in unexpected places; e.g. another developer copies and pastes the notification logic from the Customers domain into the Carts domain, and now there are two polluted domains and a duplication of logic (a code smell).
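To make the alternative concrete, the sketch below keeps template handling and delivery inside a single Notifications domain, with the Customers and Carts domains simply asking it to notify someone. All class names, template names, and the in-memory outbox (a stand-in for a real mail gateway) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from string import Template


@dataclass
class NotificationService:
    """Notifications domain: the only place that knows about templates and delivery."""
    templates: dict[str, Template]
    outbox: list[tuple[str, str]] = field(default_factory=list)  # stand-in for a mail gateway

    def notify(self, email: str, template_name: str, **values: str) -> None:
        body = self.templates[template_name].substitute(**values)
        self.outbox.append((email, body))


@dataclass
class CustomerService:
    """Customers domain: registers customers and delegates the notification."""
    notifications: NotificationService

    def register(self, name: str, email: str) -> None:
        # ... persisting the customer would happen here ...
        self.notifications.notify(email, "welcome", name=name)


@dataclass
class CartService:
    """Carts domain: reuses the same interface rather than copying the logic."""
    notifications: NotificationService

    def checkout(self, email: str, cart_id: str) -> None:
        self.notifications.notify(email, "order_confirmed", cart_id=cart_id)


if __name__ == "__main__":
    service = NotificationService({
        "welcome": Template("Welcome, $name!"),
        "order_confirmed": Template("Order $cart_id confirmed."),
    })
    CustomerService(service).register("Ada", "ada@example.com")
    CartService(service).checkout("ada@example.com", "c-1")
    print(service.outbox)
```

Neither calling domain knows how a template is found or filled in, so a change to templating or delivery touches the Notifications domain only.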

Assuming Another’s Data

I presented this form earlier (the cross-domain SQL joins), so I won’t repeat it here.

Exposing Consumers to Internal Details

In this article on API design, I described the importance of the “Don’t Expose your Privates” practice. In this case, an API exposes unnecessary details to external consumers, who become tightly coupled to those details. Another anti-pattern is to build API flows directly from internal table flows; e.g. the APIs mirror the navigation routes through internal tables. This adds (avoidable) complexity to an API integration, and also reduces evolvability.

If you think it’s difficult to make internal evolutionary changes, wait till you attempt it with external integrators! There may be hundreds of integrators. All evolving at different rates and wanting different things (indicating that we have different levels of control over their evolution). Some may be corporate heavyweights, and (unless you’re careful) can dictate evolutionary terms to YOU.
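One common way to avoid leaking internals is to map the internal record onto a separate, deliberately smaller public model. The following sketch assumes invented column names (cart_pk, status_cd, shard_key and so on) and an invented public shape; it is an illustration of the mapping idea, not a prescribed API.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CartRow:
    """Internal persistence shape (hypothetical column names)."""
    cart_pk: int
    customer_fk: int
    status_cd: str           # e.g. "O" = open, "C" = checked out
    shard_key: int           # storage detail that consumers should never see
    total_minor_units: int   # e.g. pence or cents


@dataclass(frozen=True)
class CartView:
    """Public API shape: only what a consumer needs, in consumer terms."""
    cart_id: str
    status: str
    total: str


_STATUS_NAMES = {"O": "open", "C": "checked_out"}


def to_view(row: CartRow) -> CartView:
    """Translate the internal row into the public model."""
    major, minor = divmod(row.total_minor_units, 100)
    return CartView(
        cart_id=f"cart-{row.cart_pk}",
        status=_STATUS_NAMES[row.status_cd],
        total=f"{major}.{minor:02d}",
    )


if __name__ == "__main__":
    print(asdict(to_view(CartRow(42, 7, "O", 3, 1999))))
```

Because consumers only ever see CartView, the internal keys, status codes, and sharding scheme can change without breaking an integration.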

Problem Round-Up

Many developers tend to focus on solving immediate problems, and don’t necessarily appreciate that a single, seemingly innocuous, assumption can significantly impact a system, and thus, a business.

“Clean” domains can protect Business Agility and Brand Reputation, with flexible, evolvable, scalable, and secure software. Pollution can do the opposite.

I witnessed the ultimate form of domain pollution whilst analysing potential replacements for a large, monolithic application. The common approach to solving this problem is to break the monolith into smaller units (typically microservices), by identifying the seams and strangling (the Strangler pattern) each, one at a time.

However, I found this approach ineffectual due to significant domain pollution. Domain Pollution hindered evolution, and (to me) was the key TECHNICAL reason for the product’s demise. Change Friction dictated our direction, and resulted in the construction of a new (costly) product.

The Solution

This article (I hope) clarifies why we should (where practical) avoid domain pollution. So, what can we do about it?

The answer is to (a) encapsulate the data store, so others can’t access it, and (b) always use an interface (e.g. REST API), and never undermine it through circumvention; i.e. follow standard encapsulation practices!

Note — Encapsulation
Consumers should treat other domains as a black box; you can put things in it, take them out, but you never know (or care) how it does it.
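As a sketch of what this might look like in code, the example below replaces the earlier CUSTOMERS/CARTS join with two single-domain lookups, where the Customers domain reaches cart data only through an interface. The names CartsApi, InMemoryCarts, and customer_for_cart are hypothetical, and the in-memory class stands in for what would more likely be a REST client.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Cart:
    id: str
    customer_id: str
    items: tuple[str, ...]


@dataclass(frozen=True)
class Customer:
    id: str
    name: str


class CartsApi(Protocol):
    """The only way other domains may reach cart data."""
    def get_cart(self, cart_id: str) -> Cart: ...


class InMemoryCarts:
    """Stand-in implementation; the storage behind it is invisible to callers."""
    def __init__(self, carts: dict[str, Cart]) -> None:
        self._carts = carts

    def get_cart(self, cart_id: str) -> Cart:
        return self._carts[cart_id]


class CustomerService:
    def __init__(self, customers: dict[str, Customer], carts: CartsApi) -> None:
        self._customers = customers  # data this domain owns
        self._carts = carts          # another domain, reached via its interface

    def customer_for_cart(self, cart_id: str) -> Customer:
        """Two single-domain lookups instead of one cross-domain join."""
        cart = self._carts.get_cart(cart_id)
        return self._customers[cart.customer_id]


if __name__ == "__main__":
    carts = InMemoryCarts({"c1": Cart("c1", "u1", ("book",))})
    service = CustomerService({"u1": Customer("u1", "Ada")}, carts)
    print(service.customer_for_cart("c1"))
```

This costs an extra call compared with the join, but the Carts domain is now free to change its storage technology, vendor, or partitioning without the Customers domain noticing.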

One approach to the database pollution problem is to (where practical) force the issue. For instance, by intentionally choosing a different data storage technology per domain (e.g. Customer data in a relational Postgres database, Cart data in a DynamoDB NoSQL database), we can prevent a direct data-level “JOIN”. Of course, this option isn’t always available (or sensible).

Another option is to employ user privileges; e.g. the Customer domain uses CUSTOMER privileges and can only access customer tables; the Cart domain uses CARTS privileges, and can only access carts tables. Note that this approach has a positive side effect: it also protects each domain’s data from “polluted domain injection attacks” (i.e. hackers can’t access both CUSTOMERS and CARTS data through a flaw in one domain, as the data privileges protect it).
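In a server database, this would normally be done with a separate database user per domain and table-level grants. The sketch below only approximates the idea: SQLite has no user accounts, so it uses Python's sqlite3 authorizer callback to restrict each connection to its own domain's tables. The DOMAIN_TABLES mapping, the connect_as helper, and the table layout are assumptions made for the example.

```python
import os
import sqlite3
import tempfile

# Tables each domain may touch (hypothetical mapping, mirroring per-domain privileges).
DOMAIN_TABLES = {"customers": {"CUSTOMERS"}, "carts": {"CARTS"}}
_TABLE_ACTIONS = {sqlite3.SQLITE_READ, sqlite3.SQLITE_INSERT,
                  sqlite3.SQLITE_UPDATE, sqlite3.SQLITE_DELETE}


def connect_as(domain: str, path: str) -> sqlite3.Connection:
    """Open a connection that may only read or write the given domain's tables."""
    allowed = DOMAIN_TABLES[domain]

    def authorizer(action, arg1, arg2, db_name, source):
        # For these actions SQLite passes the table name as the first argument.
        if action in _TABLE_ACTIONS and (arg1 or "").upper() not in allowed:
            return sqlite3.SQLITE_DENY
        return sqlite3.SQLITE_OK

    conn = sqlite3.connect(path)
    conn.set_authorizer(authorizer)
    return conn


def demo() -> tuple[list, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "shop.db")
        admin = sqlite3.connect(path)  # unrestricted set-up connection
        admin.executescript("""
            create table CUSTOMERS (id integer primary key, name text);
            create table CARTS (id integer primary key, customer_id integer);
            insert into CUSTOMERS values (1, 'Ada');
            insert into CARTS values (100, 1);
        """)
        admin.close()

        conn = connect_as("customers", path)
        own_rows = conn.execute("select name from CUSTOMERS").fetchall()
        try:
            conn.execute("select * from CUSTOMERS c join CARTS ca on (c.id = ca.customer_id)")
            join_blocked = False
        except sqlite3.DatabaseError:
            join_blocked = True  # the cross-domain join is refused
        conn.close()
        return own_rows, join_blocked


if __name__ == "__main__":
    print(demo())
```

The Customers connection can still read its own table, while the cross-domain join from the earlier listing is rejected before it runs.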

Hint — Finding Evolutionary Issues
Look across your entire business for anything that unexpectedly couples itself to an underlying data model (it need not be software; possibly an individual manually eyeballs the data, and you must find ways to alleviate their concerns over change). Consider what assumptions they make. Are they fair? Will they remain reasonable? Will they hamper evolution in the next three to five years?

Finally, don’t drive API design from the bottom-up, but from the top-down (i.e. consumer-driven). Let consumers drive flows, and models (if possible). Consider the need, name, and purpose of every data field before exposing it.