The Duper Application

Learn about servers in a supervised application in Elixir.

We’ll start by creating a supervised application:

$ mix new --sup duper
$ cd duper
$ git init
$ git add .
$ git commit -a -m 'raw application' 

Time to start writing servers.

The Results server

The Results server wraps an Elixir map. When it starts, it sets its state to an empty map. The keys of this map are hash values, and each value is the list of one or more paths whose files have that hash.

The server provides two API calls: one adds a hash/path pair to the map, and the other retrieves the entries that have more than one path in the value (because files that share a hash are duplicates).

This is similar to the code we wrote for the sequence stash:

defmodule Duper.Results do
  use GenServer

  @me __MODULE__

  # API

  def start_link(_) do
    GenServer.start_link(__MODULE__, :no_args, name: @me)
  end

  def add_hash_for(path, hash) do
    GenServer.cast(@me, { :add, path, hash })
  end

  def find_duplicates() do
    GenServer.call(@me, :find_duplicates)
  end

  # Server

  def init(:no_args) do
    { :ok, %{} }
  end

  def handle_cast({ :add, path, hash }, results) do
    results =
      Map.update(
        results,          # look in this map
        hash,             # for an entry with key
        [ path ],         # if not found, store this value
        fn existing ->    # else update with result of this fn
          [ path | existing ]
        end)
    { :noreply, results }
  end

  def handle_call(:find_duplicates, _from, results) do
    {
      :reply,
      hashes_with_more_than_one_path(results),
      results
    }
  end

  defp hashes_with_more_than_one_path(results) do
    results
    |> Enum.filter(fn { _hash, paths } -> length(paths) > 1 end)
    |> Enum.map(&elem(&1, 1))
  end
end
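
The map handling in this server can be tried on its own, without starting any process. The following is a minimal sketch, assuming a hypothetical module named ResultsSketch (not part of Duper) that applies the same Map.update and filtering steps as plain functions:

```elixir
defmodule ResultsSketch do
  # Hypothetical helper, not part of Duper: the same map logic as
  # Duper.Results, written as pure functions for experimentation.

  # Prepend path to the list stored under hash, or start a new list.
  def add(results, path, hash) do
    Map.update(results, hash, [path], fn existing -> [path | existing] end)
  end

  # Keep only the path lists that hold more than one entry.
  def duplicates(results) do
    results
    |> Enum.filter(fn {_hash, paths} -> length(paths) > 1 end)
    |> Enum.map(&elem(&1, 1))
  end
end

results =
  %{}
  |> ResultsSketch.add("path1", 123)
  |> ResultsSketch.add("path2", 456)
  |> ResultsSketch.add("path3", 123)

IO.inspect(results)
# %{123 => ["path3", "path1"], 456 => ["path2"]}

IO.inspect(ResultsSketch.duplicates(results))
# [["path3", "path1"]]
```

Because new paths are prepended, the most recently added path comes first in each list.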

Note the use of Map.update. This wonderful function takes a map, a key, an initial value, and a function. If the key isn’t present in the map, then a new map is returned with that key and initial value added. If the key is present, then the corresponding value is passed to the function, and whatever the function returns becomes the updated value in the returned map. In our case, we’re using it to create a single-element path list the first time a hash is encountered and then to add paths to that list on duplicates.

We’ll add this server to the list of top-level children in application.ex.

def start(_type, _args) do 
  children = [
    Duper.Results,
  ]
  opts = [strategy: :one_for_one, name: Duper.Supervisor]
  Supervisor.start_link(children, opts)
end
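
Listing a bare module name as a child works because use GenServer defines a default child_spec/1 function, which the supervisor calls with an empty list. The following is a minimal sketch of that mechanism, assuming a hypothetical module named ChildSpecSketch that stands in for Duper.Results:

```elixir
defmodule ChildSpecSketch do
  # Hypothetical stand-in for Duper.Results, kept minimal on purpose.
  use GenServer

  def start_link(_) do
    GenServer.start_link(__MODULE__, :no_args, name: __MODULE__)
  end

  @impl true
  def init(:no_args), do: {:ok, %{}}
end

# The default child_spec/1 generated by `use GenServer`.
spec = ChildSpecSketch.child_spec([])
IO.inspect(spec.id)     # ChildSpecSketch
IO.inspect(spec.start)  # {ChildSpecSketch, :start_link, [[]]}

# A bare module name in the child list is shorthand for {ChildSpecSketch, []}.
{:ok, sup} = Supervisor.start_link([ChildSpecSketch], strategy: :one_for_one)

# The supervisor now manages exactly one running child.
IO.inspect(Supervisor.count_children(sup))
```

The empty list in spec.start is the argument that start_link receives, which is why start_link in the Results server accepts and ignores a single parameter.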

The Results server is easy to test:

defmodule Duper.ResultsTest do
  use ExUnit.Case
  alias Duper.Results
  
  test "can add entries to the results" do

    Results.add_hash_for("path1", 123)
    Results.add_hash_for("path2", 456)
    Results.add_hash_for("path3", 123)
    Results.add_hash_for("path4", 789)
    Results.add_hash_for("path5", 456)
    Results.add_hash_for("path6", 999)
    
    duplicates = Results.find_duplicates()

    assert length(duplicates) == 2

    assert ~w{path3 path1} in duplicates
    assert ~w{path5 path2} in duplicates
  end
  
end

$ mix test
...
Finished in 0.05 seconds
1 test, 0 failures

The PathFinder server

Our next server is responsible for returning all the file paths in a filesystem tree, one at a time. Elixir doesn’t have a filesystem traversal API ...