Ever tried using a distributed cache in your Spring Boot application? It becomes useful once you want to scale your application. This post focuses on the implementation and on some mistakes I made while using Hazelcast.
Why use a distributed cache?
If you want to run multiple instances of your application simultaneously, you probably need to share data between them. There are various ways to do so. You could, for example, set up a database that holds the data you want to share. Of course, you then have to consider the consistency of your database when multiple instances access it. There are many articles about database scalability concepts, so I won't cover them in this post. If you want to dive deeper into databases, I recommend this post about database scalability.
Another approach is to share the data between the instances with the help of a distributed cache. Each instance then has its own cache, which is kept in sync either by connecting to every other instance (Embedded-Cache-Topology) or by getting updates from a central cache cluster (Client-Server-Topology). There is also a third topology (Near-Cache-Topology). It uses a cache cluster as well, but every instance has its own "near-cache". When a cache miss happens on an instance, the cache cluster is used to update that instance's near-cache.
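To make the near-cache idea a bit more concrete, here is a minimal sketch in plain Java. It is not Hazelcast code: the class NearCacheSketch and the map standing in for the cache cluster are made up for illustration, and a real near-cache also has to deal with invalidation, which is left out here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Conceptual stand-in for the Near-Cache-Topology; not Hazelcast code.
class NearCacheSketch {
    // Hypothetical stand-in for the central cache cluster shared by all instances.
    private final Map<Long, String> cluster;
    // Local "near-cache" owned by a single application instance.
    private final Map<Long, String> nearCache = new ConcurrentHashMap<>();
    // Counts how often the cluster had to be asked (cache misses).
    private int clusterLookups = 0;

    NearCacheSketch(Map<Long, String> cluster) {
        this.cluster = cluster;
    }

    // Serve from the near-cache; on a miss, ask the cluster and remember the answer.
    String get(Long key) {
        String local = nearCache.get(key);
        if (local != null) {
            return local;
        }
        clusterLookups++;
        String remote = cluster.get(key);
        if (remote != null) {
            nearCache.put(key, remote);
        }
        return remote;
    }

    int clusterLookups() {
        return clusterLookups;
    }

    public static void main(String[] args) {
        Map<Long, String> cluster = new ConcurrentHashMap<>();
        cluster.put(1L, "value-1");
        NearCacheSketch instance = new NearCacheSketch(cluster);
        instance.get(1L); // miss: fetched from the cluster
        instance.get(1L); // hit: served from the near-cache
        System.out.println("cluster lookups: " + instance.clusterLookups()); // prints "cluster lookups: 1"
    }
}
```

The second read never reaches the cluster, which is the whole point of keeping a near-cache on each instance.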
Which options do I have for a distributed cache?
There are multiple frameworks and tools designed to help you achieve a distributed cache. So... no worries, you don't have to implement it on your own. The most famous names are:
- Redis
- Memcached
- Hazelcast
Redis and Memcached can easily be used within the Google Cloud Platform environment. Both are mainly used in the Client-Server-Topology. Hazelcast, however, is mostly known for its use in the Embedded-Cache-Topology. That's what made Hazelcast interesting for me.
Why use Hazelcast
As mentioned before, Hazelcast can be used in an Embedded-Cache-Topology. You don't have to set up a cluster server. Each Hazelcast instance registers itself with all other instances. Likewise, the other instances notice when a Hazelcast instance shuts down.
How to use Hazelcast
To get started with Hazelcast, just add the dependency to your pom.xml. At the moment, I am using this dependency:
<dependency>
    <groupId>com.hazelcast</groupId>
    <artifactId>hazelcast-all</artifactId>
    <version>4.2.5</version>
</dependency>
Furthermore, if you are planning to use it together with Hibernate, you need to declare the cache provider you want to use. This can be done via application.yml. An example can be seen here:
spring:
  cache:
    type: jcache
    jcache:
      provider: com.hazelcast.cache.impl.HazelcastServerCachingProvider
Optionally, you can add a hazelcast.yaml configuration file to your resources folder, for example to define a cluster name. If you want this configuration file to be picked up on startup, add the following configuration bean to your code:
@Bean
public HazelcastInstance getHazelcastInstance() {
    Config config = Config.load();
    return Hazelcast.newHazelcastInstance(config);
}
After that, you can use Hazelcast freely. I wanted to get rid of Maps that are held within a class, since you can't scale your application up with this construct.
So all you need to change is from:
@Service
public class ServiceWithMemoryMap {
    private Map<Long, Boolean> someMap;

    public ServiceWithMemoryMap() {
        someMap = new HashMap<>();
    }
    //... some methods
}
To this:
@Service
public class ServiceWithMemoryMap {
    private Map<Long, Boolean> someMap;

    public ServiceWithMemoryMap(HazelcastInstance hazelcastInstance) {
        someMap = hazelcastInstance.getMap("someMap");
    }
    //... some methods
}
And now the map "someMap" lives in the Hazelcast cache, and every instance of our application gets the map with the same values. Feel free to try it out by running two (or more) instances locally. You can do that in the terminal with these commands:
mvn spring-boot:run                    # First instance
SERVER_PORT=8081 mvn spring-boot:run   # Second instance on a different port
Pitfalls when using Hazelcast
This last chapter contains some pitfalls I ran into, so keep an eye out for them.
Forgetting to define a cache provider when using Hibernate
If you don't define a cache provider (as described above), you will get some wild exceptions when using Hibernate and Hazelcast together.
Objects in Memory need to be serializable
A small but important thing that most beginner tutorials for Hazelcast don't really mention is that every object you want to hold in memory needs to be serializable. To achieve this, implement the Serializable interface on your class.
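One way to catch this early is to round-trip an object through plain Java serialization before it ever reaches the cache. The sketch below only uses the JDK; the class CachedUser and the helper roundTrip are hypothetical names for illustration, and Hazelcast itself is not involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class SerializationCheck {
    // Hypothetical value class that should be stored in the cache.
    static class CachedUser implements Serializable {
        private static final long serialVersionUID = 1L;
        final long id;
        final String name;

        CachedUser(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // Writes the object to bytes and reads it back using standard Java serialization.
    // A field whose type is not serializable surfaces here as an UncheckedIOException
    // wrapping a NotSerializableException.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CachedUser copy = roundTrip(new CachedUser(42L, "alice"));
        System.out.println(copy.id + " " + copy.name); // prints "42 alice"
    }
}
```

A check like this fits nicely into a unit test for every class you plan to put into the cache.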
Putting a class variable with a Future in it (Map<Long, Future<Boolean>>...) into memory
As I mentioned before, every object you put into the Hazelcast cache needs to be serializable. One day I tried to convert a map held in memory by a class into a map that lives inside the Hazelcast cache. This map contained a Future as its value. Naturally, you can't serialize a Future. So I came up with a pretty (actually it's dirty) hack. I converted the map into the following:
Map<Long, Boolean> convertedMap;
Wherever I used to put a Future value into the map, I now create a new map entry with the value 'false'. After that, I create a new thread that observes the Future object. As soon as the Future is done, the thread retrieves the map from the Hazelcast cache and sets the value to true. Here is the new class that handles this behavior:
import java.time.ZonedDateTime;
import java.util.Map;
import java.util.concurrent.Future;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Polls a Future on its own thread and sets the cached flag to true once the Future is done.
public class ConvertedMapSetter implements Runnable {
    private static final long SLEEP_TIME_MS = 100L;
    private static final long TIMEOUT_MINUTES = 15L;
    private final Logger log = LoggerFactory.getLogger(ConvertedMapSetter.class);
    private final Future<Boolean> finished;
    private final CacheService cacheService;
    private final Long key;
    private final ZonedDateTime startingTime;

    public ConvertedMapSetter(Future<Boolean> finished, CacheService cacheService, Long key) {
        this.finished = finished;
        this.cacheService = cacheService;
        this.key = key;
        startingTime = ZonedDateTime.now();
        // The observer thread starts right away and is named after the key it watches.
        Thread observerThread = new Thread(this);
        observerThread.setName("ConvertedMapSetter-" + key);
        observerThread.start();
    }

    @Override
    public void run() {
        boolean timeOutReached = false;
        boolean isFinished = false;
        while (!timeOutReached && !isFinished) {
            if (ZonedDateTime.now().isAfter(startingTime.plusMinutes(TIMEOUT_MINUTES))) {
                log.error("ConvertedMapSetter is running for more than {} minutes. This thread will be shut down. Key processing: {}", TIMEOUT_MINUTES, key);
                timeOutReached = true;
            }
            if (finished.isDone()) {
                isFinished = true;
                Map<Long, Boolean> convertedMap = cacheService.getMap("convertedMap");
                convertedMap.put(key, true);
            }
            try {
                Thread.sleep(SLEEP_TIME_MS);
            } catch (InterruptedException e) {
                log.error("error in ConvertedMapSetter", e);
                // Restore the interrupt flag and stop; otherwise sleep() would fail again on every loop.
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
Feel free to suggest a better solution in the comments. I would be more than happy to get more ideas about how to handle this.
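One possible alternative, assuming the task can be exposed as a CompletableFuture instead of a plain Future, is to let the future update the flag itself through a callback, so no polling thread is needed. This is only a sketch: a ConcurrentHashMap stands in for the Hazelcast map, and the names CompletionFlagSketch and track are made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

class CompletionFlagSketch {
    // Marks the key as pending, then sets it to true as soon as the task is done.
    // Like the polling version, "done" includes failure; check `error` if that matters.
    static void track(Map<Long, Boolean> convertedMap, Long key, CompletableFuture<Boolean> task) {
        convertedMap.put(key, false);
        task.whenComplete((result, error) -> convertedMap.put(key, true));
    }

    public static void main(String[] args) {
        // Stand-in for the map obtained from the Hazelcast cache (assumption for this sketch).
        Map<Long, Boolean> convertedMap = new ConcurrentHashMap<>();
        CompletableFuture<Boolean> task = new CompletableFuture<>();

        track(convertedMap, 7L, task);
        System.out.println(convertedMap.get(7L)); // prints "false"

        task.complete(true);
        System.out.println(convertedMap.get(7L)); // prints "true"
    }
}
```

The callback runs on whichever thread completes the future, so the map update happens without any sleeping or polling. A timeout like the one in the polling version would have to be added separately.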
Using Hazelcast with multithreading
This problem cost me waaay too much time. I stumbled upon it by chance when accessing a list in the cache inside a parallel stream. Hazelcast throws a strange SerializationException, even though you serialize everything correctly. As far as I can tell, the problem is not within Hazelcast but within Spring Boot. I was using version 2.1.6.RELEASE. To be honest, I can't really explain what specifically the problem is, but I observed that in a multithreaded environment the classloader differs between threads. To avoid this problem, you have to set a classloader inside your configuration bean. Your bean should then look like this:
@Bean
public HazelcastInstance getHazelcastInstance() {
    Config config = Config.load();
    config.setClassLoader(Thread.currentThread().getContextClassLoader());
    return Hazelcast.newHazelcastInstance(config);
}
Sadly, I couldn't find a way to set this via properties, so it has to be done in code. I noticed that in the latest Spring Boot release the classloader is set. I haven't verified yet whether problems still occur.
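If you want to check whether your own setup is affected, a small diagnostic like the following can help. It is a sketch using only the JDK: the class ClassLoaderProbe is a made-up name, and the output depends entirely on how the application is packaged and started, so I'm not showing any expected result.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ClassLoaderProbe {
    // Collects a description of the context classloader seen by the threads
    // that execute a parallel stream.
    static Set<String> loadersSeenInParallelStream() {
        return IntStream.range(0, 1_000)
                .parallel()
                .mapToObj(i -> String.valueOf(Thread.currentThread().getContextClassLoader()))
                .collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        // Classloader of the calling thread, for comparison.
        System.out.println("caller: " + Thread.currentThread().getContextClassLoader());
        // Classloaders seen inside the parallel stream.
        loadersSeenInParallelStream().forEach(loader -> System.out.println("worker: " + loader));
    }
}
```

If the worker lines show a different classloader than the caller line, you are likely looking at the situation described above.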
Using a Collection inside a Collection
Another thing that sadly also cost me way too much time is that you can't store a Collection that contains another Collection inside the cache. You need a slight workaround to get it to work. For example, you have to transform this code:
@Service
public class ServiceWithListInMap {
    private final HazelcastInstance hazelcastInstance;
    private Map<Long, List<SomeObject>> someMap;

    public ServiceWithListInMap(HazelcastInstance hazelcastInstance) {
        this.hazelcastInstance = hazelcastInstance;
        someMap = hazelcastInstance.getMap("someMap");
    }
    //... some methods
}
Into two classes. The result will look like this:
@Service
public class ServiceWithListInMap {
    private final HazelcastInstance hazelcastInstance;
    private Map<Long, SomeObjectWrapper> someMap;

    public ServiceWithListInMap(HazelcastInstance hazelcastInstance) {
        this.hazelcastInstance = hazelcastInstance;
        someMap = hazelcastInstance.getMap("someMap");
    }
    //... some methods
}

// Like every object stored in the cache, the wrapper needs to be serializable.
public class SomeObjectWrapper implements Serializable {
    private List<SomeObject> someList;

    public SomeObjectWrapper() {
        // ...
    }
    //... some methods
}
Conclusion
Hazelcast is really fun to work with. Getting started with it is pretty easy. Tweaking it isn't that easy, since it's hard to find the right settings in the documentation. However, Hazelcast is an important tool to know if you want to scale and/or speed up your application. I'm happy I found and learned about it. The team around Hazelcast did a fantastic job.