Adding Caching

Implement caching in the Django application for optimized data retrieval.

In computing, caching is the process of storing copies of data in a cache so that they can be accessed more quickly. A cache is a temporary storage location that holds frequently requested data, files, and other information.

A great example and explanation of caching comes from Peter Chester, who asked the audience during one of his talks: “What’s 3,485,250 divided by 23,235?” Everyone fell silent for a moment, but someone pulled out a calculator and yelled out the answer, “150!” Then, Peter Chester asked the same question again, and this time, everyone was able to answer immediately.

This is a great demonstration of caching: the computation is done only once, and the result is then kept in fast memory so that later requests can be answered without recomputing it.
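The anecdote can be sketched in a few lines of Python. This is only an illustration of the idea, not code from the Postagram project; the divide function, the _cache dictionary, and the computations counter are hypothetical names introduced here.

```python
# Counts how many times the "expensive" computation actually runs.
computations = 0

# The cache: maps (a, b) to the already-computed result of a / b.
_cache = {}


def divide(a, b):
    """Return a / b, computing it only the first time for each (a, b) pair."""
    global computations
    key = (a, b)
    if key not in _cache:
        # Cache miss: do the work once and remember the answer.
        computations += 1
        _cache[key] = a / b
    # Cache hit (or freshly stored value): answer straight from memory.
    return _cache[key]


first = divide(3_485_250, 23_235)   # computed, like the person with the calculator
second = divide(3_485_250, 23_235)  # answered from the cache
```

After both calls, computations is still 1: the second call never repeats the division.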

Caching is widely used by companies, especially social media websites, where millions of users access the same posts, videos, and files. It would be very inefficient to query the database every time millions of people want to access the same information. For example, if a tweet is gaining traction on Twitter, it is automatically moved to cache storage for quick access. And, if an influencer such as Kim Kardashian posts a picture on Instagram, you should expect a lot of requests for this picture. Caching is useful here to avoid thousands of queries on the database.

To summarize, caching brings the following benefits:

  • Reduced load time

  • Reduced bandwidth usage

  • Reduced SQL queries on databases

  • Reduced downtime

Now that we have an idea about caching and its benefits, we can implement the concept using Django and even Docker. But before that, let’s quickly discuss the complexity caching brings to our application.

The cons of caching

We already know the advantages of caching, especially if our application is scaling or we want to improve load time and reduce costs. However, caching introduces some complexity to our system, and how much depends on the type of application we are developing. If our application is based on news or feeds, where content changes frequently, we need to design the caching architecture carefully to avoid serving outdated content.

On the one hand, we have the chance to reduce load times by showing our users the same content for a period, but at the same time, our users might miss important updates. This is where cache invalidation comes to the rescue.
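The trade-off of showing the same content for a period can be sketched as a small time-to-live (TTL) cache. This is an illustrative sketch only: the TTLCache class and its clock parameter are hypothetical names introduced here, and a real project would normally rely on the timeout support of its cache backend instead.

```python
import time


class TTLCache:
    """A tiny cache whose entries expire ttl seconds after being stored."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so the expiry logic is easy to test
        self._store = {}    # key -> (value, expiry time)

    def set(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self.clock() >= expires_at:
            # The entry is stale: drop it so the caller fetches fresh content.
            del self._store[key]
            return default
        return value
```

With a TTL of 60 seconds, users see the same feed for up to a minute. Any update made during that window is missed until the entry expires.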

Cache invalidation is the process of declaring cached content invalid or stale. Content is invalidated when it is no longer the most up-to-date version of the data. There are a few methods available to invalidate a cache:

  • Purge (flush): Cache purging instantly removes the content from the cache. When the content is requested again, it is fetched from the server and stored in the cache before being returned to the client.

  • Refresh: A cache refresh fetches the same content from the server again and replaces the cached copy with the new version. This is done in the React application using stale-while-revalidate (SWR). Each time a post is created, we call a refresh function to fetch data again from the server.

  • Ban: A cache ban does not remove content from the cache immediately. Rather, the content is marked as invalid. Then, when the client makes a request, it is checked against the invalidated content, and if a match is found, new content is fetched and stored in the cache before being returned to the client.
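The three methods above can be compared side by side in a small in-memory sketch. It assumes a hypothetical fetch function that loads content from the server or database; the PostCache class and its method names are illustrative and do not come from Django or Redis.

```python
class PostCache:
    """In-memory cache illustrating purge, refresh, and ban invalidation."""

    def __init__(self, fetch):
        self.fetch = fetch    # hypothetical loader: key -> fresh content
        self._store = {}      # key -> cached content
        self._banned = set()  # keys whose cached content is marked invalid

    def get(self, key):
        # Missing or banned content is fetched again before being returned.
        if key not in self._store or key in self._banned:
            self._store[key] = self.fetch(key)
            self._banned.discard(key)
        return self._store[key]

    def purge(self, key):
        # Purge: remove the content immediately; the next get() refetches it.
        self._store.pop(key, None)
        self._banned.discard(key)

    def refresh(self, key):
        # Refresh: fetch right away and replace the cached copy.
        self._store[key] = self.fetch(key)
        self._banned.discard(key)

    def ban(self, key):
        # Ban: keep the content but mark it invalid until the next request.
        if key in self._store:
            self._banned.add(key)
```

Purge and ban both defer the fetch until the next request, whereas refresh pays the cost up front so that the next reader gets a warm cache.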

With the cons of caching and how to invalidate the cache understood, we are well-equipped to add caching to the Django application. In the next section, let’s add caching to the Django API of Postagram.

Adding caching to the Django API

In the previous sections, we explored caching, its advantages, and its drawbacks. Now, it’s time to implement caching within our Django application. Django has built-in support for caching, which makes the configuration straightforward. Let’s start by making the required configurations depending on our environment.

Configuring Django for caching

Using caching within Django requires configuring a cache backend. For the quickest read and write access, it is better to use a different data store from the SQL database, because SQL databases are generally slower than in-memory data stores (although, again, the right choice depends on our needs). Here, we will use Redis. Redis is an open-source, in-memory data store used as a database, cache, streaming engine, and message broker.

We’ll review the configurations we need to make to start using Redis in our Django project, whether we are using Docker or not. However, for the deployment, we’ll use Docker for configuring Redis.

So, if you are not going to use Docker, you can install Redis by following the instructions on the Redis website.

Note: If you are working in a Linux environment, you can check whether the service is running using the sudo service redis-server status command. If the service is not active, use the sudo service redis-server start command to start the Redis server. If you are using Windows, you will need to install or enable WSL2. You can read more in Redis' documentation.

After installing Redis on your machine, you can configure caching in Django using the CACHES setting in the settings.py file of the Django project:
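As a minimal sketch, assuming Django 4.0 or later (which ships a built-in Redis cache backend that needs the redis Python package) and a Redis server listening locally on its default port, the setting might look like the following. The backend path and address are assumptions made for illustration; the project may instead use a third-party backend such as django-redis, in which case the BACKEND value and options differ.

```python
# settings.py (fragment)

CACHES = {
    "default": {
        # Assumption: Django's built-in Redis backend (Django 4.0+).
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        # Assumption: Redis running on the local machine, default port 6379.
        "LOCATION": "redis://127.0.0.1:6379",
    }
}
```

If Redis runs in a Docker container instead, the host part of LOCATION would typically be the name of the Redis service rather than 127.0.0.1.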
