...


System Design: The Distributed Cache

Learn the basics of a distributed cache.

Problem statement

A typical system consists of these components:

  • A client that requests the service.

  • One or more service hosts that handle client requests.

  • A database, used by the service, for data storage.

Under normal circumstances, this abstraction performs fine.

However, as the number of users increases, the number of database queries also increases, overburdening the system and slowing its performance. In such cases, a cache is added to the system to deal with this performance deterioration.

A cache is a temporary data storage that can serve data faster by keeping data entries in memory.

Caches store only the most frequently accessed data. When a request reaches the serving host, the host first looks for the data in the cache. If the data is found (a cache hit), the server responds to the user immediately. If the data isn't in the cache (a cache miss), it's queried from the database.

Additionally, the cache is populated with the new value to prevent cache misses in the future.
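As a sketch of this read path, the following Python snippet shows the flow described above: try the cache first, fall back to the database on a miss, and populate the cache with the result. `SimpleCache`, `read_value`, and the dict-backed database are illustrative stand-ins, not part of any particular caching product; a real deployment would use a system such as Redis or Memcached.

```python
from typing import Optional


class SimpleCache:
    """A minimal in-memory key-value cache used here only for illustration."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._store.get(key)

    def put(self, key: str, value: str) -> None:
        self._store[key] = value


def read_value(key: str, cache: SimpleCache, database: dict[str, str]) -> Optional[str]:
    """Serve a read request: a cache hit returns immediately; a miss falls
    back to the database and populates the cache for future requests."""
    value = cache.get(key)
    if value is not None:
        return value                    # cache hit: serve from memory

    value = database.get(key)           # cache miss: query the database
    if value is not None:
        cache.put(key, value)           # populate the cache to avoid future misses
    return value


cache = SimpleCache()
db = {"user:42": "Alice"}
print(read_value("user:42", cache, db))   # first call: cache miss, reads from the database
print(read_value("user:42", cache, db))   # second call: cache hit, served from memory
```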

Illustration: Service before using caching

A cache is non-persistent storage for data that is read and written repeatedly, providing the end user with lower latency. Therefore, a cache must serve data from a storage component that is fast, offers sufficient capacity, and remains affordable in terms of dollar cost as the caching service scales.

The following illustration highlights the suitability of RAM as the raw building block for caching:

We understand the need for a cache and suitable storage hardware, but what is a distributed cache? Let’s discuss this next.

What is a distributed cache?

A distributed cache is a caching system where multiple cache servers coordinate to store frequently accessed data.
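To give a concrete sense of how multiple cache servers can coordinate, here is a minimal sketch of mapping keys to servers. The server list, the `server_for_key` helper, and the hash-mod scheme are illustrative assumptions; production systems commonly use consistent hashing instead, so that adding or removing a server remaps only a small fraction of keys.

```python
import hashlib

# Hypothetical cache server addresses; in practice these would come from
# configuration or service discovery.
CACHE_SERVERS = ["cache-1:6379", "cache-2:6379", "cache-3:6379"]


def server_for_key(key: str, servers: list[str]) -> str:
    """Deterministically map a key to one of the cache servers so that every
    application host agrees on where a given entry lives."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return servers[int(digest, 16) % len(servers)]


# Every host computes the same mapping, so reads and writes for the same key
# always go to the same cache server.
print(server_for_key("user:42", CACHE_SERVERS))
```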

Distributed caches are necessary in environments where a single cache server is insufficient to store all the data. At the same time, it’s scalable and guarantees a higher ...