Regional Level of Memcache
Learn what problems arise at the regional level of Memcache and what optimizations we can make to address them.
Introduction to the regional level
We call a data center a region, and a region is a collection of multiple clusters. At the cluster level, the dominant concern is sharding the key space (using consistent hashing) and grouping the keys into appropriate buckets (for example, viral keys vs. dormant keys and high-churn keys vs. low-churn keys). At the regional level, our main concern will be the replication of keys to meet the overall load.
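As a reminder of the cluster-level sharding mentioned above, here is a minimal sketch of consistent hashing. The class name `ConsistentHashRing`, the server names, the use of MD5, and the choice of 100 virtual nodes per server are all assumptions made for illustration, not details taken from the Memcache design.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a point on the ring (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical ring that assigns each key to the next server point clockwise."""

    def __init__(self, servers, vnodes=100):
        self._vnodes = vnodes   # assumed number of virtual nodes per server
        self._ring = []         # sorted list of (point, server) pairs
        for server in servers:
            self.add_server(server)

    def add_server(self, server):
        # Each server owns several points so that load spreads more evenly.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{server}#{i}"), server))

    def remove_server(self, server):
        self._ring = [entry for entry in self._ring if entry[1] != server]

    def server_for(self, key):
        if not self._ring:
            raise ValueError("ring has no servers")
        # First point at or after the key's hash, wrapping around at the end.
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]
```

The property that matters for a cache is that removing a server only remaps the keys that server owned; every other key stays on the server it was on, so most cached data remains reachable.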
Replication brings consistency concerns. At the regional level, we must keep the Memcached clusters consistent with the storage cluster (we will provide read-your-writes consistency at the regional level). How can we invalidate cached data that has become stale because it was updated in the storage cluster? This lesson discusses these cross-cluster problems.
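To make the stale-data problem concrete, the following sketch models two front-end clusters that share one storage cluster. It assumes a look-aside cache in which a web server deletes the key from its own cluster's cache after writing to storage. The classes `Storage`, `CacheCluster`, and `WebServer` are hypothetical in-memory stand-ins, not the real Memcached client API.

```python
class Storage:
    """Hypothetical stand-in for the shared storage cluster."""

    def __init__(self):
        self._rows = {}

    def read(self, key):
        return self._rows.get(key)

    def write(self, key, value):
        self._rows[key] = value


class CacheCluster:
    """Hypothetical stand-in for the Memcached servers of one front-end cluster."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return self._items.get(key)

    def set(self, key, value):
        self._items[key] = value

    def delete(self, key):
        self._items.pop(key, None)


class WebServer:
    """Web server that reads through its local cache and invalidates on write."""

    def __init__(self, storage, cache):
        self.storage = storage
        self.cache = cache

    def read(self, key):
        value = self.cache.get(key)
        if value is None:                    # cache miss: fall back to storage
            value = self.storage.read(key)
            if value is not None:
                self.cache.set(key, value)
        return value

    def write(self, key, value):
        self.storage.write(key, value)       # update the source of truth first
        self.cache.delete(key)               # then invalidate the local cached copy


storage = Storage()
storage.write("profile:1", "v1")
server_a = WebServer(storage, CacheCluster())   # front-end cluster A
server_b = WebServer(storage, CacheCluster())   # front-end cluster B

server_a.read("profile:1")            # both clusters now cache "v1"
server_b.read("profile:1")
server_a.write("profile:1", "v2")

print(server_a.read("profile:1"))     # cluster A reads its own write
print(server_b.read("profile:1"))     # cluster B still serves its cached copy
```

In this sketch, cluster A observes its own write because it invalidated its local copy, while cluster B keeps serving the old value until something deletes the key from B's cache. Deciding who sends that second invalidation is the design question at the regional level.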
Overview of design problems at the regional level
To manage a high workload, we add multiple front-end clusters that share the same storage cluster. Doing so means we must manage replication and data consistency.
If we scale a single cluster naively, our network starts to face incast congestion, where many servers respond to the same client at once and overwhelm its link. Instead, we can replicate clusters when the load becomes too high.
Replication and consistency: when we have multiple Memcached servers caching the data from the same storage cluster, how can we ensure that all the Memcached servers are up to date?
Should we send invalidations through web servers?
Should we send invalidations from the storage clusters to the Memcached servers for items that are no longer up to date or need to be ...