Dependency Management: Python
In this lesson, we will discuss dependency management with Python.
We'll cover the following
Requirements.txt
Python has multiple ways of managing third-party dependencies. We’re going to take a look at managing dependencies using requirements.txt. Requirements.txt may spell out transitive dependencies, but it doesn’t have to. So we’ll start with a discussion of requirements.txt and what it can tell us. We’ll finish up with two approaches we can use if we don’t have all our transitive dependencies spelled out for us.
Finding dependencies in requirements.txt
The requirements.txt file has a couple of advantages from our point of view. First, it’s very straightforward to read. It’s a simple text file with one dependency per line. A second advantage is that a requirements.txt file can be generated automatically from a Python environment. When a requirements.txt file is generated this way, it specifies exact version numbers and includes all the transitive dependencies. This combination is exactly what we’re looking for when we’re hunting for dependencies with vulnerabilities.
An autogenerated requirements.txt file might look like this:
certifi==2017.11.5
chardet==3.0.4
idna==2.6
pipdeptree==0.10.1
requests==2.18.4
urllib3==1.22
Every library has an exact version number, and the transitive dependencies have all been pulled in. Perfect!
The wrinkle with requirements.txt files is that they don’t have to be generated this way. They can be written or edited by hand, and they can specify ranges of version numbers, not just exact version numbers.
So a requirements.txt could specify a dependency on a library with version >= 1.2.3. Installing with a requirements.txt file like this would install the newest available version greater than or equal to 1.2.3. At the time of deployment, that might have meant version 1.2.4. At the time of an investigation, it could mean 1.2.8. What are the differences between 1.2.4 and 1.2.8? Who knows? 1.2.8 could have fixed old vulnerabilities, introduced new vulnerabilities, or both.
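To make the drift concrete, here is a minimal sketch of how a >= 1.2.3 range can resolve to different versions over time. The release lists are hypothetical, and the version parsing is a deliberate simplification that only handles plain dotted numbers; real installers follow more detailed version rules.

```python
def parse_version(text):
    """Turn a simple dotted version like '1.2.8' into a comparable tuple (1, 2, 8)."""
    return tuple(int(part) for part in text.split("."))


def newest_matching(available, minimum):
    """Return the newest version in `available` that is >= `minimum`."""
    floor = parse_version(minimum)
    candidates = [v for v in available if parse_version(v) >= floor]
    return max(candidates, key=parse_version)


# Hypothetical release history for an imaginary library.
released_at_deploy = ["1.2.2", "1.2.3", "1.2.4"]
released_at_investigation = released_at_deploy + ["1.2.5", "1.2.8"]

print(newest_matching(released_at_deploy, "1.2.3"))         # 1.2.4
print(newest_matching(released_at_investigation, "1.2.3"))  # 1.2.8
```

The same requirements line gives two different answers, which is why a range alone cannot tell us what was actually deployed.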
A hand-edited requirements.txt file might look like this:
certifi==2017.11.5
chardet==3.0.4
idna==2.6
pipdeptree==0.10.1
requests>=2
urllib3==1.22
Note the version specified for the requests library. In this case, we don’t know what version of the requests library is installed on any given install of our program. It would depend on the latest version available at deploy time.
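When a requirements.txt file is long, entries like this are easy to miss. The following is a rough sketch of a checker that flags any line not pinned with ==. The function name find_unpinned is made up for this example, and the parsing is intentionally simple: it skips pip option lines (those starting with a dash) and does not handle URL-based requirements.

```python
import re

# Matches "name==version", optionally with extras such as "name[extra]==version".
# Simplification: wildcard pins such as "==1.2.*" are treated as unpinned.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[^*\s,]+$")


def find_unpinned(requirements_text):
    """Return the requirement lines that do not pin an exact version."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        requirement = line.split(";", 1)[0].strip()  # drop environment markers
        if not PINNED.match(requirement):
            unpinned.append(requirement)
    return unpinned


hand_edited = """\
certifi==2017.11.5
chardet==3.0.4
idna==2.6
pipdeptree==0.10.1
requests>=2
urllib3==1.22
"""

print(find_unpinned(hand_edited))  # ['requests>=2']
```

An empty result would suggest every listed dependency is pinned, though it says nothing about transitive dependencies that are missing from the file.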
Finding dependencies in an installed instance
If we investigate a Python project that doesn’t give us exact version numbers for each dependency, we don’t have a way to find out the version numbers that are used in practice just by looking at the files checked into source control. We’ll have to look at a deployed instance instead. The specifics of this will depend on the deployment environment and the install process used.
One option for investigating the deployed libraries is to use pip. If pip is installed, run the command pip freeze (pip3 freeze if you are using Python 3):
pip3 freeze
The output may look familiar. It has the same format as the ideal requirements.txt file we looked at in the previous section: one library per line, each with an exact version number. Just be sure to use the pip executable that corresponds to the Python executable actually used in production.
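If pip is not available, a similar list can be produced from inside the interpreter itself. This is a small sketch that assumes Python 3.8 or newer, where the standard library provides importlib.metadata. The helper name installed_versions is made up for this example. Because the script runs under a specific interpreter, it reports what that interpreter can see, which sidesteps the question of which pip executable matches. The list may differ slightly from pip freeze, which by default leaves out some packaging tools such as pip itself.

```python
from importlib.metadata import distributions


def installed_versions():
    """Return {name: version} for every distribution visible to this interpreter."""
    found = {}
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata has no name
            found[name] = dist.version
    return found


# Print in the same "name==version" shape used by requirements.txt.
for name, version in sorted(installed_versions().items(), key=lambda item: item[0].lower()):
    print(f"{name}=={version}")
```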
A second option for investigating deployed libraries is to look into the site-packages directory of the Python install that’s used in production. There will be a directory for each dependency, both direct and transitive. As with using pip, it’s important to find the Python install that’s used in practice.
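Packages installed by current versions of pip usually also leave a metadata directory in site-packages, named after the library and its version and ending in .dist-info. The sketch below reads versions from those directory names. The function name versions_from_site_packages is made up for this example, and the approach is an assumption about how the packages were installed: older installs that use .egg-info metadata are not covered.

```python
import sysconfig
from pathlib import Path


def versions_from_site_packages(site_packages):
    """Read name and version from the *.dist-info directory names in a folder."""
    found = {}
    for entry in sorted(Path(site_packages).glob("*.dist-info")):
        if not entry.is_dir():
            continue
        stem = entry.name[: -len(".dist-info")]     # e.g. "requests-2.18.4"
        name, sep, version = stem.rpartition("-")   # split at the last hyphen
        if sep:
            found[name] = version
    return found


# Inspect the site-packages directory of the interpreter running this script.
# In an investigation, pass the path of the production install instead.
for name, version in versions_from_site_packages(sysconfig.get_paths()["purelib"]).items():
    print(f"{name}=={version}")
```

Because the function takes a path, it can be pointed at a copy of a production site-packages directory without running anything from that environment.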
Quiz
What is a possible problem with requirements.txt?
They may not spell out all the transitive dependencies.
They may state ranges of version numbers instead of precise version numbers.
They are difficult to read and understand.
In the next lesson, we’ll study dependency management with JavaScript.