Collections Module in Python
Get introduced to Python's collections module and its datatypes.
Python’s collections module has specialized container datatypes that can be used to replace Python’s general-purpose containers (dict, tuple, list, and set). We will be studying the following parts of this module:
- ChainMap
- Counter
- defaultdict
- deque
- namedtuple
- OrderedDict
There is also a sub-module of collections called abc, which provides Abstract Base Classes. These will not be covered in this chapter.
Let’s get started with the ChainMap container!
Overview of ChainMap
A ChainMap is a class that links multiple mappings together so that they behave as a single unit. Its constructor is declared as ChainMap(*maps), which means that a ChainMap will accept any number of mappings or dictionaries and turn them into a single view that we can update.
from collections import ChainMap
car_parts = {"hood": 500, "engine": 5000, "front_door": 750}
car_options = {"A/C": 1000, "Turbo": 2500, "rollbar": 300}
car_accessories = {"cover": 100, "hood_ornament": 150, "seat_cover": 99}
car_pricing = ChainMap(car_accessories, car_options, car_parts)
print(car_pricing["hood"])
Here we import ChainMap from the collections module (Line 1). Next, we create three dictionaries (Lines 2-4). Then we create an instance of ChainMap by passing in the three dictionaries that we just created (Line 5). Finally, we access one of the keys in our ChainMap (Line 6). When we do this, the ChainMap searches each mapping in the order it was passed in and returns the value from the first mapping that contains the key. Here, "hood" exists only in car_parts, so the lookup prints 500. This first-match behavior is especially useful if we want to set up defaults.
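The lookup order matters most when the same key appears in more than one mapping. The following is a minimal sketch of that case; the dictionary names and values are hypothetical and chosen only for illustration. It also shows what "a single view that we can update" means in practice: writes go to the first mapping only, and changes made to an underlying dictionary are visible through the ChainMap.

```python
from collections import ChainMap

# Hypothetical data: "theme" appears in both mappings on purpose.
user_settings = {"theme": "dark"}
default_settings = {"theme": "light", "font_size": 12}

settings = ChainMap(user_settings, default_settings)

# Lookups search the mappings left to right and stop at the first match.
print(settings["theme"])      # dark
print(settings["font_size"])  # 12

# Writes go to the first mapping only; later mappings are left alone.
settings["font_size"] = 14
print(user_settings)     # {'theme': 'dark', 'font_size': 14}
print(default_settings)  # {'theme': 'light', 'font_size': 12}

# A ChainMap is a view, so changes to an underlying dict show up immediately.
default_settings["language"] = "en"
print(settings["language"])  # en
```

Because writes never reach the later mappings, the defaults stay intact no matter what is stored through the ChainMap.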
Let’s pretend that we want to create an application that has some defaults. The application will also be aware of the operating system’s environment variables. If there is an environment variable that matches one of the keys that we are defaulting to in our application, the environment will override our default. Let’s further pretend that we can pass arguments to our application. These arguments take precedence over the environment and the defaults. This is one place where a ChainMap can really shine. Let’s look at a simple example that’s based on one from Python’s documentation:
import argparse
import os

from collections import ChainMap


def main():
    app_defaults = {"username": "admin", "password": "admin"}

    parser = argparse.ArgumentParser()
    parser.add_argument("-u", "--username")
    parser.add_argument("-p", "--password")
    args = parser.parse_args()
    command_line_arguments = {key: value for key, value in vars(args).items() if value}

    chain = ChainMap(command_line_arguments, os.environ, app_defaults)
    print(chain["username"])


if __name__ == "__main__":
    main()
    os.environ["username"] = "test"
    main()
Let’s break this down a little. Here we import Python’s argparse module along with the os module (Lines 1-2). We also import ChainMap (Line 4). Next, we have a simple function that has some silly defaults. Some popular routers ship with these values as their default credentials.
Then we set up our argument parser and tell it how to handle certain command-line options. Note that parse_args() returns a Namespace object rather than a dictionary, so we convert it with vars() and use a dict comprehension to keep only the options that were actually given a value.
The other cool piece here is the use of Python’s built-in vars. If we call it without an argument, vars behaves like Python’s built-in locals. But if we pass in an object, vars returns that object’s __dict__ attribute.
In other words, vars(args) equals args.__dict__. Finally, we create our ChainMap by passing in our command-line arguments (if there are any), then the environment variables, and finally the defaults. At the end of the code, we call our function, set an environment variable, and call it again. Give it a try and we’ll see that it prints admin and then test, as expected (provided the environment does not already define a username variable).
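To see the vars step in isolation, here is a small sketch that reuses the same parser setup. Passing an explicit list to parse_args() is only a stand-in for a real command line, and the value "mike" is an assumed example input.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-u", "--username")
parser.add_argument("-p", "--password")

# Passing a list to parse_args() stands in for the real command line.
args = parser.parse_args(["-u", "mike"])

# vars() returns the Namespace's attribute dictionary.
print(vars(args))                   # {'username': 'mike', 'password': None}
print(vars(args) is args.__dict__)  # True

# Drop options that were not supplied so they cannot shadow later mappings.
command_line_arguments = {key: value for key, value in vars(args).items() if value}
print(command_line_arguments)       # {'username': 'mike'}
```

Without the filter, the password key would be present with the value None, and the ChainMap would return that None instead of falling through to the environment or the defaults.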
Now let’s try calling the script from a terminal with a command-line argument:
python3 chain_map.py -u mike
When we ran this, we got mike back twice. This is because our command-line argument overrides everything else. It doesn’t matter that we set the environment variable, because our ChainMap will look at the command-line arguments first, before anything else.
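The three levels of precedence can also be checked without touching the real command line or environment. The sketch below is an illustrative variation on the script, not part of it: resolve_username is a hypothetical helper, and a plain dictionary stands in for os.environ.

```python
import argparse
from collections import ChainMap


def resolve_username(argv, environ):
    """Return the username from argv, then environ, then the defaults."""
    app_defaults = {"username": "admin", "password": "admin"}

    parser = argparse.ArgumentParser()
    parser.add_argument("-u", "--username")
    parser.add_argument("-p", "--password")
    args = parser.parse_args(argv)
    command_line_arguments = {k: v for k, v in vars(args).items() if v}

    # Earlier mappings win: arguments, then environment, then defaults.
    chain = ChainMap(command_line_arguments, environ, app_defaults)
    return chain["username"]


print(resolve_username([], {}))                                # admin
print(resolve_username([], {"username": "test"}))              # test
print(resolve_username(["-u", "mike"], {"username": "test"}))  # mike
```

Each call adds one more layer, and the printed value always comes from the highest-priority mapping that contains the key.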