Caching strategies to improve performance in microservices architectures

Performance is key: companies may lose customers, and even have to close, if their applications are slower than their competitors’. We will look at some strategies to improve it using caching in microservices architectures.


We all know the experience. We want to buy something online and the website takes forever to load, especially on a big sales day like Black Friday. We may be patient and try again a few times, but many people will just move to another site that sells something similar. The same happens with mobile apps: if one takes too long to load, we will probably look for another.

Now imagine you are on the other side, as the company. It is your biggest sales day of the year; you have spent months preparing offers and may have made huge investments in marketing campaigns. Then the day comes and you don’t sell anything because the website doesn’t work, and customers go to your competitors instead. Imagine the conversations with stakeholders in the following months, assuming you still have a job: the company may have to close if its margins were already tight and depended on the sales from that day.

Can we do something to improve it? Sure. Many things can make your application slow (bad queries, legacy systems, slow infrastructure…) and each can be addressed separately, but caching can be a quick win that mitigates several of them.

What is caching?

We often have to retrieve and aggregate the same pieces of data or repeat the same calculations, sometimes hundreds of times per second, even though the results don’t change. Instead, we can store the results in memory or in a centralised place, so that we prepare them once and simply read the stored values afterwards. This saves a lot of time: reading cached data is fast, whereas calling multiple services to fetch and process data is slow, and even more so in microservices because of the many requests between services. It also avoids overloading the system with unnecessary queries and incurring extra costs.
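As a minimal sketch of the idea, assuming a slow lookup that can be expressed as a function, the class below computes each result once and serves it from memory afterwards. `CachedLookup` and its `slowCalls` counter are hypothetical names used only for illustration, and the sketch uses only the Java standard library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical helper: wraps a slow lookup and remembers its results.
final class CachedLookup<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> slowLookup;
    // Counts how often the slow path really runs (for illustration only).
    private final AtomicInteger slowCalls = new AtomicInteger();

    CachedLookup(Function<K, V> slowLookup) {
        this.slowLookup = slowLookup;
    }

    V get(K key) {
        // Runs the slow lookup only if the key is not cached yet.
        return cache.computeIfAbsent(key, k -> {
            slowCalls.incrementAndGet();
            return slowLookup.apply(k);
        });
    }

    int slowCalls() {
        return slowCalls.get();
    }
}
```

Calling `get` twice with the same key triggers the slow lookup only once. Note that this sketch never removes entries, so on its own it only suits data that does not change and fits in memory.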

We cache results for as long as they remain relevant, and how long that is depends on the scenario. E.g. online newspapers may reload their news every hour, shops may reload offers once per day and their stock every few minutes, and trading companies may have to get fresh currency conversion rates every second. We will look at four approaches.
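One common way to keep results only while they are relevant is to give each entry a time to live (TTL). The following is a rough sketch under that assumption; `TtlCache` is a hypothetical class, and the clock is passed in as a parameter only so the behaviour is easy to try out (in a real service it could be `System::currentTimeMillis`).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical cache whose entries are reloaded once they are older than a TTL.
final class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long expiresAtMillis;

        Entry(V value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // returns the current time in milliseconds

    TtlCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // The loader must not call back into this cache.
    V get(K key, Function<K, V> loader) {
        long now = clock.getAsLong();
        // Keep the current entry while it is fresh; otherwise load a new one.
        Entry<V> entry = entries.compute(key, (k, current) ->
                current != null && now < current.expiresAtMillis
                        ? current
                        : new Entry<>(loader.apply(k), now + ttlMillis));
        return entry.value;
    }
}
```

With this shape, a shop could use a TTL of a few minutes for stock and a day for offers, while a trading system would use a TTL of about a second for rates.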

In-memory cache

If you have only one instance of a service, or don’t mind caching the same data in multiple instances, you could store it in the service’s memory. There are multiple technologies that support this, like Caffeine or EhCache. You will have to think about how much data is stored and how often it is evicted, as your services could otherwise run out of memory.
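To illustrate the eviction concern, here is a small sketch of a size-bounded cache that drops the least recently used entry when it is full. It relies only on `java.util.LinkedHashMap` from the standard library and is not the API of Caffeine or EhCache, which provide this kind of policy (and more) out of the box. `LruCache` is a hypothetical name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical bounded cache: evicts the least recently used entry when full.
// Not thread-safe; wrap it with Collections.synchronizedMap for concurrent use.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after each insertion; returning true removes the eldest entry.
        return size() > maxEntries;
    }
}
```

Capping the number of entries is the simplest way to bound memory use; dedicated libraries can also evict by time or by estimated size.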

Distributed cache

When there are multiple instances of a service, it is usually preferred to store the data in a centralised cache that is shared by all the instances. This way data is fetched only once and then reused by every instance. However, it can be slower than the in-memory solutions because of the network hop from the instances to the cache, and it may require additional infrastructure. There are different technologies for it, like Hazelcast or Redis, and specific libraries to work with complex objects, like Redisson.
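The sketch below illustrates the pattern only, not any particular product. `SharedCache` is a hypothetical interface standing in for a remote cache client, and `InMemorySharedCache` is an in-process stand-in so the example runs without extra infrastructure; a real setup would back the interface with a client for Redis, Hazelcast or similar. `CatalogueService` is also an assumed name.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical abstraction over a centralised cache reached over the network.
interface SharedCache {
    Optional<String> get(String key);
    void put(String key, String value);
}

// In-process stand-in used here only so the sketch can run on its own.
final class InMemorySharedCache implements SharedCache {
    private final Map<String, String> store = new ConcurrentHashMap<>();

    public Optional<String> get(String key) {
        return Optional.ofNullable(store.get(key));
    }

    public void put(String key, String value) {
        store.put(key, value);
    }
}

// Each service instance checks the shared cache before calling the slow source.
final class CatalogueService {
    private final SharedCache cache;
    private final Function<String, String> slowSource;

    CatalogueService(SharedCache cache, Function<String, String> slowSource) {
        this.cache = cache;
        this.slowSource = slowSource;
    }

    String describe(String productId) {
        String key = "product:" + productId;
        Optional<String> cached = cache.get(key);
        if (cached.isPresent()) {
            return cached.get(); // another instance (or this one) already loaded it
        }
        String fresh = slowSource.apply(productId);
        cache.put(key, fresh);
        return fresh;
    }
}
```

If two `CatalogueService` instances share one `SharedCache`, the second request for a product is served from the cache whichever instance handles it. Two instances that miss at the same moment may both call the source; this sketch does not try to prevent that.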

Per-request cache

This is for cases in which data changes quite fast and we need consistent results within the same request, e.g. if we are processing complex financial transactions and have to fetch the same FX rates multiple times. In these cases, we can store them in a map that is passed between methods, although there are more elegant solutions.
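A minimal sketch of the map-based variant, assuming one cache object is created at the start of each request and discarded at the end. `RequestRateCache` and the `rateSource` function are hypothetical names; the source of the rates is left as a parameter.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical per-request cache: create one per request, then let it go.
// A plain HashMap is enough if a request is handled by a single thread.
final class RequestRateCache {
    private final Map<String, BigDecimal> rates = new HashMap<>();
    private final Function<String, BigDecimal> rateSource;

    RequestRateCache(Function<String, BigDecimal> rateSource) {
        this.rateSource = rateSource;
    }

    // Fetches the rate on first use; later calls in the same request
    // return the same value even if the market rate has moved since.
    BigDecimal rate(String currencyPair) {
        return rates.computeIfAbsent(currencyPair, rateSource);
    }

    BigDecimal convert(BigDecimal amount, String currencyPair) {
        return amount.multiply(rate(currencyPair));
    }
}
```

Every step of the transaction that receives the same `RequestRateCache` sees the same rate, and the next request starts with an empty cache and therefore fresh rates.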

Indexed searches

Sometimes we have to process a large amount of data from different sources and search it using different filters. An example would be processing all the data from a Kafka topic that may have millions of records. We can process all the data once (e.g. with an overnight job) and store it indexed, so that services read from the cached data instead of searching the topic. There are different solutions for it, like Elasticsearch.
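To show the idea on a very small scale, the sketch below reads a list of records once and builds one lookup map per filter, so later searches do not scan the full data set. This is only an in-memory illustration with hypothetical names (`OrderIndex`, `Order` and its fields); it is not how Elasticsearch is used, and a real system would read the records from the topic rather than from a list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical index built once (e.g. by a scheduled job) and queried many times.
final class OrderIndex {
    static final class Order {
        final String id;
        final String customer;
        final String status;

        Order(String id, String customer, String status) {
            this.id = id;
            this.customer = customer;
            this.status = status;
        }
    }

    private final Map<String, List<Order>> byCustomer = new HashMap<>();
    private final Map<String, List<Order>> byStatus = new HashMap<>();

    // Single pass over all records; each record is added to every index.
    OrderIndex(List<Order> orders) {
        for (Order order : orders) {
            byCustomer.computeIfAbsent(order.customer, k -> new ArrayList<>()).add(order);
            byStatus.computeIfAbsent(order.status, k -> new ArrayList<>()).add(order);
        }
    }

    List<Order> findByCustomer(String customer) {
        return byCustomer.getOrDefault(customer, Collections.emptyList());
    }

    List<Order> findByStatus(String status) {
        return byStatus.getOrDefault(status, Collections.emptyList());
    }
}
```

Each query is then a map lookup rather than a scan over millions of records, at the cost of the results being only as fresh as the last rebuild.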


It may not be easy to solve all the performance issues of your system, but caching usually helps. We have seen different approaches that will hopefully help you choose the right one for each scenario.

Rafael Borrego

Consultant and security champion specialised in Java, with experience in architecture and team management in both startups and big corporations.

Disclaimer: the posts are based on my own experience and may not reflect the views of my current or any previous employer
