Testing a Basic Concurrent Cache: Writing Tests

Write tests for the cache, then check that they work by introducing a bug into the system.

With the cache complete, we can write tests for it that will show whether it works correctly. As we’ve seen earlier, stateful properties are executed in two big phases:

  1. Symbolic phase to generate the command set.
  2. Real phase to run the commands against the real system for validation purposes.

We’re going to follow the same pattern here: we’ll start with the symbolic execution by setting up the model, then add validation rules, and finally run the property to see whether it’s sound.

Building the model

The first step in coming up with a good stateful model is to think like an operator, meaning someone in charge of running or debugging our code in production. If people are to operate and run our code, they have to be able to understand what to expect out of it. The expectations they form as operators are often mental models, meaning that they think through the operations in their heads and make guesses as to what data or behavior they’ll get back out of the system. Whenever their mental model is wrong (“The system doesn’t do what we think it should”), we’ll get a bug report or a production incident.

If we’ve managed to explain how the system works to an operator in a way that is both realistic and simple, we’ve given them a reliable mental model to work with. That mental model is something we can try to encode as a property.

Interestingly, if a good way to come up with a model is to figure out how we’d explain our component to a human operator having to run it in production, the opposite is true as well. If we have to explain our system to someone, the model we used in our tests could be a good starting point. If our tests are complex, convoluted, and hard to explain, then the testing experience is likely to match their operational experience.

The model

Since our cache works a bit like a big array where old entries are evicted to make room for new ones, we can use any data structure or implementation with first-in, first-out (FIFO) semantics as a model, and it should be reasonably accurate. We’ll use all of the proper_statem callbacks to write this simpler FIFO structure and use it to show whether the real cache works or not. Let’s set this up.

-module(prop_cache).
-include_lib("proper/include/proper.hrl").
-behaviour(proper_statem).
-export([command/1, initial_state/0, next_state/3,
         precondition/2, postcondition/3]).

-define(CACHE_SIZE, 10).

prop_test() ->
    ?FORALL(Cmds, commands(?MODULE),
        begin
            cache:start_link(?CACHE_SIZE),
            {History, State, Result} = run_commands(?MODULE, Cmds),
            cache:stop(),
            ?WHENFAIL(io:format("History: ~p\nState: ~p\nResult: ~p\n",
                                [History,State,Result]),
                      aggregate(command_names(Cmds), Result =:= ok))
        end).

We’ve used an arbitrary ?CACHE_SIZE value for the sake of simplicity. We could have used a generator for more thorough testing, but we’ll get the basics of stateful testing without that. It’s important to note that the setup and teardown functions (cache:start_link/1 and cache:stop/0) always run as part of the property. Had we used the ?SETUP macro instead, we’d only have needed to call cache:flush() after every run to ensure the cache is always empty. The current form is just a bit longer, but it covers our setup and teardown requirements well enough for this example.
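For comparison, here’s a minimal sketch of that ?SETUP variant. The property name prop_test_setup is hypothetical, and it assumes cache:flush/0 empties the cache as described above; the setup fun returns a teardown fun that PropEr calls once all the runs are done.

prop_test_setup() ->
    ?SETUP(fun() ->
               cache:start_link(?CACHE_SIZE), % start once for the whole property
               fun() -> cache:stop() end      % teardown, run after all tests
           end,
           ?FORALL(Cmds, commands(?MODULE),
               begin
                   cache:flush(), % reset state between runs
                   {History, State, Result} = run_commands(?MODULE, Cmds),
                   ?WHENFAIL(io:format("History: ~p\nState: ~p\nResult: ~p\n",
                                       [History, State, Result]),
                             aggregate(command_names(Cmds), Result =:= ok))
               end)).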

Model state

For our model’s state, we’ll try to use as little data as possible, carrying only what’s strictly necessary to validate everything.

-record(state, {max=?CACHE_SIZE, count=0, entries=[]}).

%% Initial model value at system start. Should be deterministic.
initial_state() -> 
    #state{}.

The commands and preconditions

We’ll use a list to contain the model’s data, along with a count of how many entries it holds. The list will contain {Key, Value} pairs, and the counter will tell us when to start dropping pairs from the list, which keeps the model as simple as possible.

The command generation is straightforward as well. We put emphasis on writes for the tests by using the frequency/1 generator.

We can then use the precondition to add constraints, like preventing calls to empty the cache when it’s already empty.

command(_State) -> 
    frequency([
        {1, {call, cache, find, [key()]}},
        {3, {call, cache, cache, [key(), val()]}}, 
        {1, {call, cache, flush, []}}
    ]).


precondition(#state{count=0}, {call, cache, flush, []}) -> 
    false; % don't flush an empty cache for no reason

precondition(#state{}, {call, _Mod, _Fun, _Args}) -> 
    true.

key() ->
    oneof([range(1,?CACHE_SIZE), % reusable keys, raising chance of dupes
           integer()]). % random keys

val() -> integer().

The generator for keys is designed to allow some keys to be repeated multiple times. By using a restricted set of keys (with the range/2 generator) along with an unrestricted integer(), we can ensure that some keys will be reused. This forces our property to exercise any code related to key reuse or matching, without losing the ability to fuzz the system with occasionally unexpected new keys.
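As a quick sanity check, we can sample the generator by hand. The sample_keys/1 helper below is a hypothetical addition for shell experiments, not part of the property itself; it relies on proper_gen:pick/1, which returns {ok, Value} for a generator.

%% Hypothetical helper for eyeballing the key distribution; roughly half
%% of the picks should land in 1..?CACHE_SIZE, making duplicates likely.
sample_keys(N) ->
    [begin {ok, K} = proper_gen:pick(key()), K end || _ <- lists:seq(1, N)].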

The next_state

The next_state callback completes command generation by allowing the model to stay up to date with what the system state should be.
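As an illustration, here’s one possible sketch of a next_state/3 callback that matches the FIFO model above. The eviction details are assumptions at this point, not the final implementation:

%% A hedged sketch of next_state/3 for the FIFO model; the eviction
%% details here are assumptions, not the final implementation.
next_state(State=#state{entries=L, count=N, max=M}, _Res,
           {call, cache, cache, [K, V]}) ->
    case lists:keyfind(K, 1, L) of
        false when N =:= M -> % cache is full: evict the oldest pair
            State#state{entries = tl(L) ++ [{K, V}]};
        false ->              % new key: append it and bump the count
            State#state{entries = L ++ [{K, V}], count = N + 1};
        {K, _} ->             % known key: replace its value in place
            State#state{entries = lists:keyreplace(K, 1, L, {K, V})}
    end;
next_state(State, _Res, {call, cache, flush, []}) ->
    State#state{entries = [], count = 0}; % the model becomes empty too
next_state(State, _Res, {call, _Mod, _Fun, _Args}) ->
    State. % reads (find) leave the model unchanged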
