One of the challenges of a microservices architecture is that the data is managed by different services, which makes it difficult to run queries across multiple domains. Let’s look at a few approaches and the advantages and drawbacks of each.
Let’s start with an example. Imagine that we have a finance system that displays lists of invoices, say twenty per page. The domains may be clear, for example customers, users and invoices, so we could have a service for each with isolated data. The challenge comes when we have to filter or sort. For example, we may want to sort or filter by invoice due date, customer name or employee name. In this case we would have to run queries across different domains, which none of the individual services should do. How can we solve this?
Building a macroservice
This is the approach that some teams take when they migrate their first microservices from a monolith, and it is also known as “lift and shift”. They want to replicate the functionality that they had before without making breaking changes, so they just copy the queries that involve multiple domains. As the name suggests, this is not a microservice: each service should handle a separate domain, and what they are actually building is a micro-monolith.
It has a few drawbacks:
- It doesn’t allow the data to be split into separate schemas
- It doesn’t allow the changes to one domain to be deployed separately
- Other parts of the application may want to use one of the domains, and they shouldn’t have to use a big service or be affected by its changes (for example a version change)
- It goes against the single responsibility principle, because a team that changes one of the domains shouldn’t also have to change the code used by other domains.
There is one scenario in which this approach can work: when the team knows that a few domains usually work together (invoices, payments, …) and are quite separate from the rest of the application. They may first move all the code to a big service and then extract each domain in separate steps, until the original service becomes an orchestrator that handles the communication between them. Even so, it is usually better to extract each service first, call them from the monolith, and then create the orchestrator.
Including an entity of another domain
Some teams take this approach when they face the pagination issue for the first time or when they have to deliver a feature fast to meet a deadline. Their domain is separated from the others, but they add a reference to a table that belongs to another domain.
It may work in the short term, but it is like shooting themselves in the foot because they are losing the advantages that they had until now. Some of the drawbacks are:
- They are creating a distributed monolith, because a change in one domain may affect others and they may no longer be able to deploy the services separately
- They won’t be able to move the tables to separate schemas
- They are adding an endpoint that does things the service shouldn’t do. They will want to remove it in the future, which will force them to increase the version and update all its consumers.
Duplicating data of other domains
Let’s say that the domains are quite separate except for one column. In the initial invoice example, suppose the screen only filters and sorts by invoice fields and by customer name. In this case we could add a column with the customer name to the invoice table so the queries can use it. This would work in the short term but has a few drawbacks:
- Any change to a customer name would have to be applied not only in the customer service but also in the invoice service, and the team making the change may not know that it has to be made in different schemas
- People may get used to this hack, so the team may keep adding more columns to the table even though they know that they shouldn’t be there
- As in the previous case, the interface would offer endpoints that it shouldn’t have and that would have to be removed in the future, affecting its consumers.
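The first drawback can be illustrated with a minimal sketch. It assumes two in-memory SQLite databases standing in for the customer and invoice schemas; the table names, column names and the `rename_customer` helper are hypothetical and only serve the illustration.

```python
import sqlite3

# Hypothetical customer schema, owned by the customer service.
customer_db = sqlite3.connect(":memory:")
customer_db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
customer_db.execute("INSERT INTO customers VALUES (1, 'Acme')")

# Hypothetical invoice schema with a duplicated customer_name column.
invoice_db = sqlite3.connect(":memory:")
invoice_db.execute(
    "CREATE TABLE invoices ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, customer_name TEXT, due_date TEXT)"
)
invoice_db.execute("INSERT INTO invoices VALUES (100, 1, 'Acme', '2024-01-31')")


def rename_customer(customer_id, new_name, also_update_invoices):
    """Rename a customer; the copy in the invoice schema only changes if asked."""
    customer_db.execute(
        "UPDATE customers SET name = ? WHERE id = ?", (new_name, customer_id)
    )
    if also_update_invoices:
        invoice_db.execute(
            "UPDATE invoices SET customer_name = ? WHERE customer_id = ?",
            (new_name, customer_id),
        )


# A team that does not know about the duplicated column updates only one schema.
rename_customer(1, "Acme Ltd", also_update_invoices=False)

current_name = customer_db.execute(
    "SELECT name FROM customers WHERE id = 1"
).fetchone()[0]
copied_name = invoice_db.execute(
    "SELECT customer_name FROM invoices WHERE id = 100"
).fetchone()[0]
print(current_name, "vs", copied_name)
```

After the rename, the two schemas disagree about the customer name, and any sort or filter on the invoice table uses the old value.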
Doing separate requests to each service
This can be done in the orchestrator. If we know that we have to filter by customer and by invoice date, we could first get the users of that customer by calling the user service and then get their invoices by calling the invoice service, passing the user list as a parameter.
The good thing about this approach is that the domains stay separated and there are no side effects. The drawback is that we don’t know how many of the elements returned by the first service have references in the other domain, so we don’t know how many objects we have to fetch initially, and there may be thousands in the database. Because of this we may have to run many queries to populate each page.
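The following is a minimal sketch of why filling one page can take several round trips. The two services are replaced by hypothetical in-memory functions (`fetch_user_ids`, `fetch_invoices`), and the data is invented: one customer with 100 users, of which only every tenth has an invoice.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Invoice:
    id: int
    user_id: int
    due_date: date


# Hypothetical data: 100 users for one customer, invoices for users 10, 20, ..., 100.
USERS_BY_CUSTOMER = {"acme": list(range(1, 101))}
INVOICES = [Invoice(id=n, user_id=n * 10, due_date=date(2024, 1, n)) for n in range(1, 11)]

# Counts the simulated remote calls made by the orchestrator.
calls = {"user_service": 0, "invoice_service": 0}


def fetch_user_ids(customer, offset, limit):
    """Stand-in for a paginated call to the user service."""
    calls["user_service"] += 1
    return USERS_BY_CUSTOMER.get(customer, [])[offset:offset + limit]


def fetch_invoices(user_ids, due_before):
    """Stand-in for a call to the invoice service with a user list as a parameter."""
    calls["invoice_service"] += 1
    wanted = set(user_ids)
    return [i for i in INVOICES if i.user_id in wanted and i.due_date < due_before]


def first_page(customer, due_before, page_size=5, batch_size=20):
    """Keep requesting batches of users until enough invoices have been found."""
    page = []
    offset = 0
    while len(page) < page_size:
        user_ids = fetch_user_ids(customer, offset, batch_size)
        if not user_ids:
            break  # no more users to try
        page.extend(fetch_invoices(user_ids, due_before))
        offset += batch_size
    # Note: the result follows user batch order. A global sort by due date
    # would require fetching every batch before returning anything.
    return page[:page_size]


page = first_page("acme", date(2024, 2, 1))
print(len(page), calls)
```

With this data, a page of five invoices needs three calls to each service, because each batch of twenty users yields only two invoices. The sketch also shows a second limitation: the page cannot be sorted by due date across all invoices without reading every batch first.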
Using a view
We could have a view that exposes the columns used for filtering and sorting together with the ids of the invoices. The orchestrator would query this view to get the ids and then call the invoice service to get their data. It is a simple solution that keeps the data of each domain in separate schemas.
The main drawback is that changes to a domain schema may break the view, and the team making the changes may not be aware of it.
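As a rough sketch of the idea, the example below uses SQLite through Python’s `sqlite3` module, with attached in-memory databases standing in for the separate schemas. The schema, table and view names (`customer_schema`, `invoice_schema`, `invoice_search`) are assumptions made for the illustration. The view is created as a `TEMP` view because SQLite only lets temporary views reference tables in other attached databases; other database engines have their own rules for cross-schema views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Each attached database plays the role of one domain's schema.
conn.execute("ATTACH DATABASE ':memory:' AS customer_schema")
conn.execute("ATTACH DATABASE ':memory:' AS invoice_schema")
conn.execute(
    "CREATE TABLE customer_schema.customers (id INTEGER PRIMARY KEY, name TEXT)"
)
conn.execute(
    "CREATE TABLE invoice_schema.invoices ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, due_date TEXT)"
)

# The view exposes only the invoice id and the columns used to filter and sort.
conn.execute(
    """
    CREATE TEMP VIEW invoice_search AS
    SELECT i.id AS invoice_id, i.due_date, c.name AS customer_name
    FROM invoice_schema.invoices AS i
    JOIN customer_schema.customers AS c ON c.id = i.customer_id
    """
)

conn.executemany(
    "INSERT INTO customer_schema.customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
conn.executemany(
    "INSERT INTO invoice_schema.invoices VALUES (?, ?, ?)",
    [(1, 1, "2024-03-01"), (2, 2, "2024-01-15"), (3, 1, "2024-02-10"), (4, 1, "2024-01-05")],
)


def find_invoice_ids(customer_name, page, page_size):
    """Return one page of invoice ids for a customer, ordered by due date."""
    rows = conn.execute(
        "SELECT invoice_id FROM invoice_search "
        "WHERE customer_name = ? "
        "ORDER BY due_date, invoice_id LIMIT ? OFFSET ?",
        (customer_name, page_size, page * page_size),
    ).fetchall()
    return [row[0] for row in rows]


ids = find_invoice_ids("Acme", page=0, page_size=2)
print(ids)  # the orchestrator would now ask the invoice service for these ids
```

The orchestrator only ever reads ids from the view, so the invoice service remains the single source of the invoice data itself.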
Using a table populated from events
In some architectures each service publishes its data changes to a topic. We could have a service that subscribes to the topics, listens to the events and populates a table in a separate schema with the columns used to sort and filter. The orchestrator would first use this table to get the ids of the invoices to display and then call the invoice service to get the invoice data.
This is the approach that decouples the services the most. However, it is more complex to implement and can only be done if the services publish events about data changes, which may only be the case in teams that have been building microservices for some time. It may also serve outdated data, as some time may pass between the moment the data changes in the domain tables and the moment it changes in this new table.
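A minimal sketch of such a service could look like the following. The event shapes (`customer_changed`, `invoice_changed`), the table names and the functions are all hypothetical, and a plain Python list stands in for the topics; a real system would consume from a message broker.

```python
import sqlite3

# Separate schema owned by the service that listens to the events.
read_db = sqlite3.connect(":memory:")
read_db.execute(
    "CREATE TABLE customer_names (customer_id INTEGER PRIMARY KEY, name TEXT)"
)
read_db.execute(
    "CREATE TABLE invoice_search ("
    "invoice_id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "customer_name TEXT, due_date TEXT)"
)


def apply_event(event):
    """Update the search table from one (hypothetical) data-change event."""
    if event["type"] == "customer_changed":
        read_db.execute(
            "INSERT OR REPLACE INTO customer_names VALUES (?, ?)",
            (event["customer_id"], event["name"]),
        )
        # Keep the copies in the search table in line with the new name.
        read_db.execute(
            "UPDATE invoice_search SET customer_name = ? WHERE customer_id = ?",
            (event["name"], event["customer_id"]),
        )
    elif event["type"] == "invoice_changed":
        row = read_db.execute(
            "SELECT name FROM customer_names WHERE customer_id = ?",
            (event["customer_id"],),
        ).fetchone()
        name = row[0] if row else None  # the customer event may not have arrived yet
        read_db.execute(
            "INSERT OR REPLACE INTO invoice_search VALUES (?, ?, ?, ?)",
            (event["invoice_id"], event["customer_id"], name, event["due_date"]),
        )


def page_of_invoice_ids(page, page_size):
    """Ids for one page, sorted by customer name and then due date."""
    rows = read_db.execute(
        "SELECT invoice_id FROM invoice_search "
        "ORDER BY customer_name, due_date, invoice_id LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    ).fetchall()
    return [row[0] for row in rows]


events = [
    {"type": "customer_changed", "customer_id": 1, "name": "Globex"},
    {"type": "customer_changed", "customer_id": 2, "name": "Acme"},
    {"type": "invoice_changed", "invoice_id": 10, "customer_id": 1, "due_date": "2024-03-01"},
    {"type": "invoice_changed", "invoice_id": 11, "customer_id": 2, "due_date": "2024-02-01"},
    {"type": "invoice_changed", "invoice_id": 12, "customer_id": 2, "due_date": "2024-01-01"},
]
for event in events:
    apply_event(event)

# A rename has already happened in the customer domain, but its event is still in transit.
pending = {"type": "customer_changed", "customer_id": 1, "name": "Aardvark"}
before = page_of_invoice_ids(0, 20)  # still sorted using the old name
apply_event(pending)
after = page_of_invoice_ids(0, 20)   # sorted using the new name
print(before, after)
```

The difference between `before` and `after` is the outdated-data window described above: until the pending event is processed, the orchestrator sorts by a customer name that is no longer current.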
We have seen different approaches to querying across domains, from the initial ones that have side effects to the last one, which is cleaner but more difficult to implement. It is up to the team to discuss the tradeoffs and choose the one that best suits their system.
Do you have experience with these cases? Have you taken a different approach? If so, I would love to read about it in the comments below.