Example Case: Quicksort
Take a look at the use of targeted properties to weed out a suspected bug in a quicksort function.
We'll cover the following
Understanding the example case
Quicksort is a fast sorting algorithm, especially in imperative languages where in-place sorting can be done to save on memory. It’s also a frequently used algorithms when demonstrating list comprehensions. The official Erlang documentation gives an example that looks like this:
sort([]) -> [];
sort([Pivot|T]) ->
sort([ X || X <- T, X < Pivot])
++ [Pivot] ++
sort([ X || X <- T, X >= Pivot]).
It’s a straightforward implementation that works rather well, but there’s a reason why the Erlang standard library uses a mergesort
implementation instead. Quicksort has notoriously bad behavior when bad pivots or repeated elements are present. It starts to use quadratically more time on every sorted element. Mergesort, by comparison, doesn’t suffer from that behavior.
We’ll devise an experiment to see whether the quicksort implementation proposed by Erlang is safe or not, to see how applying targeted properties can find fascinating stuff in our programs.
Establishing a baseline
To find an exponential case in time, we’ll need to set up properties that measure how long sorting a list takes. We want to maximize running time. The longer the function takes to run, the better. It’s hard to imagine this would be easy to tweak in a manually written custom generator, especially if we don’t know what to look for. Targeted properties can do a good job there, but we’ll first want to establish a baseline for the experiment to show whether they can really help us.
Let’s start with a regular property that has been defined in the code below as prop_quicksort_time_regular
.
Let’s establish another comparison point: Will mergesort
survive our experiment? Next we have a similar targeted property that uses lists:sort/1
, which implements mergesort
. It is defined in the code below as prop_mergesort_time
at line 21.
We’ve added a max length of 100,000 elements to the list because it would probably be effortless to find problematic cases when they contain a billion elements. Since we’re looking for some exponential time, it should not, in theory, take that many elements to trigger it.
Let’s compare the time of both our properties.
Note: After updating the code, press “Run” to save the code and then call
rebar3 proper
to run the updated tests.
Get hands-on with 1300+ tech skills courses.