Testing a Basic Concurrent Cache: Implementation

Let's look at an example of a cache to understand the concepts behind stateful properties.

Understanding the requirements

To use stateful tests, we’ll first need a stateful system to validate. The system we’ll use is a cache implemented as an Open Telecom Platform (OTP) gen_server. A common optimization pattern is to use an Erlang Term Storage (ETS) table for reads and to keep writes safe by serializing them through calls to the gen_server. That serialization creates a bit of contention on write operations, so instead we’ll write a cache that uses ETS directly for all operations, with the gen_server doing nothing more than keeping the ETS table alive. The simple conceptual model (a cache handling data like a key-value store), paired with an implementation that dangerously accesses ETS tables concurrently, makes this a great candidate for demonstrating stateful property tests. We’ll look at the cache implementation and at how to model it to find the potential bugs it may hide.

Our cache will have a simple set of requirements:

  • Values can be read by searching for their key.
  • The cache can be emptied on demand.
  • The cache can be configured with a maximum number of items to hold in memory.
  • Once the maximum size is reached, the oldest written value is replaced.
  • If an item is overwritten, its entry keeps the same position in the cache; only the value changes.

Those requirements are a bit unconventional. Most caches care about evicting entries that haven’t been accessed for a long time, whereas ours focuses on writes and doesn’t even consider updates in its eviction policy. That’s fine in this instance, because the goal is to show how to model the cache, not necessarily how to write a good one. We’ll stick with these requirements because they lend themselves to a succinct implementation.
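
To make these requirements concrete, here’s a hypothetical shell session showing how a cache with a maximum size of 2 would be expected to behave, using the cache/2, find/1, and flush/0 functions we’re about to implement (the process identifier and return values are illustrative):

1> cache:start_link(2).
{ok,<0.90.0>}
2> cache:cache(a, 1), cache:cache(b, 2), cache:find(a).
{ok,1}
3> cache:cache(c, 3), cache:find(a).   % the third write replaces a, the oldest entry
{error,not_found}
4> cache:cache(b, 20), cache:find(b).  % overwriting b keeps its slot; only the value changes
{ok,20}
5> cache:flush(), cache:find(b).       % flushing empties the cache
{error,not_found}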

Implementing the cache

Stateful tests are generally used as integration tests, and stateful properties tend to show up later in a project’s lifetime, so the tests are usually written after the program itself. We’ll respect this by writing the cache implementation first, then putting the system in place and adding the tests after the fact.

gen_server

We’ll start with a standard gen_server set of callbacks and public exports.

-module(cache).
-export([start_link/1, stop/0, cache/2, find/1, flush/0]).
-behaviour(gen_server).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

start_link(N) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, N, []).

stop() -> gen_server:stop(?MODULE).

The process will be unique to the entire node because of the {local, ?MODULE} name. Since all operations are done in an ETS table, we can read from the cache through the table directly, assuming the table is named cache. We’ll give the table’s records the form {Index, {Key, Val}}, where Index ranges from 1 to the maximum allowed value, basically forcing the table to be used as a big 1-indexed array. Whenever we write to the table, we increment the Index value before doing so, wrapping around to the first entry whenever we fill the array.
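
As an illustration (not part of the module), a cache with a maximum size of 3 that has been asked to store {a, 1}, {b, 2}, {c, 3}, and then {d, 4} would end up holding data records along these lines, with the fourth write wrapping around into the first slot:

{1, {d, 4}}    %% d replaced a, the oldest write
{2, {b, 2}}
{3, {c, 3}}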

Unfortunately, this does mean we’ll need to scan the table on every read operation, but optimizing is not the point here.

Table initialization

Let’s take a look at the table initialization of the cache.

init(N) ->
    ets:new(cache, [public, named_table]), 
    ets:insert(cache, {count, 0, N}),
    {ok, nostate}.

handle_call(_Call, _From, State) -> {noreply, State}. 

handle_cast(_Cast, State) -> {noreply, State}. 

handle_info(_Msg, State) -> {noreply, State}.

Note the magic record {count, 0, Max} inserted in the table. That’s our index-tracking mechanism: each writer increments it before writing its own data, ensuring the index always moves forward. Also note that the gen_server callbacks are otherwise empty, since we don’t need them.
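
The bookkeeping itself boils down to reading that record and wrapping the counter once it reaches the maximum. As a rough sketch, a writer could claim its slot like this (the next_index/0 helper is hypothetical and not part of the module):

%% Hypothetical helper: claim the next write slot, wrapping at Max.
%% This lookup-then-insert sequence is not atomic, so two concurrent
%% writers may end up claiming the same slot.
next_index() ->
    case ets:lookup(cache, count) of
        [{count, Max, Max}] ->
            ets:insert(cache, {count, 1, Max}),
            1;
        [{count, Current, Max}] ->
            ets:insert(cache, {count, Current + 1, Max}),
            Current + 1
    end.

That lack of atomicity is exactly the kind of concurrent hazard that makes this cache interesting to model with stateful properties.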

Reading from the cache

Next, let’s take a look at the find function to read from the cache.

find(Key) ->
    case ets:match(cache, {'_', {Key, '$1'}}) of
        [[Val]] -> {ok, Val};
        [] -> {error, not_found} 
    end.

Here, the ets:match/2 pattern means “ignore the index” ('_'), “match the key we want” (Key), and “return the value” ('$1').

Note: The documentation of ets:match/2 contains further information.
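
As an illustration, assuming the pair {user_id, 42} has already been cached into some slot, say slot 2, a lookup works out like this:

1> ets:match(cache, {'_', {user_id, '$1'}}).
[[42]]
2> cache:find(user_id).
{ok,42}
3> cache:find(missing_key).
{error,not_found}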

Writing to the cache

Writing to the cache is a bit more complex. Let’s take a look at how it’s done.
