Zero-Copy
Learn about Zero-Copy in Python.
Programs often have to deal with enormous amounts of data in the form of large arrays of bytes. Handling that much data in strings can become very inefficient once you start manipulating it by copying, slicing, and modifying it.
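To see this copying behavior in isolation, here is a minimal sketch; the roughly 10 MB size mirrors the example below, and the names `data` and `view` are chosen purely for illustration:

```python
# A bytes slice always allocates a new object and copies the data:
data = bytes(10 * 1024 * 1024)   # ~10 MB of zero bytes
view = data[1024:]               # copies ~10 MB into a fresh bytes object
print(view is data)              # False: the slice is an independent copy
print(len(view))                 # 10484736 (10 MB minus 1 KB)
```

Every such slice doubles the memory held for that chunk of data, which is exactly the cost we will measure next.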
Memory profiler
Let’s consider a small program that reads a large file of binary data, and partially copies it into another file. To examine our memory usage, we will use memory_profiler, a nice Python package that allows us to see the memory usage of a program line by line.
To run the code below, click the Run button and launch memory_profiler with the command:

```
python -m memory_profiler memoryview-copy.py
```
```python
@profile
def read_random():
    with open("/dev/urandom", "rb") as source:
        content = source.read(1024 * 10000)
        content_to_write = content[1024:]
    print("Content length: %d, content to write length %d" %
          (len(content), len(content_to_write)))
    with open("/dev/null", "wb") as target:
        target.write(content_to_write)

if __name__ == '__main__':
    read_random()
```
- In line 4, we read 10 MB from /dev/urandom and do not do much with it. Python needs to allocate around 10 MB of memory to store this data as a string.
- In line 5, we copy the entire block of data minus the first kilobyte, because we won't be writing those first 1024 bytes to the target file.
What is interesting in this example is that, as you can see, the program's memory usage grows by about 10 MB while building the content_to_write variable. In fact, the slice operator copies the entirety of content, minus the first kilobyte, into a new object.
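As a preview of how a zero-copy approach avoids this duplication, here is a minimal sketch using memoryview (the variable names are illustrative); slicing a memoryview produces a new view into the same underlying buffer instead of a copy:

```python
import sys

data = bytearray(10 * 1024 * 1024)   # ~10 MB writable buffer
mv = memoryview(data)[1024:]         # a view into the same buffer: no copy
print(len(mv))                       # 10484736 bytes visible through the view
print(sys.getsizeof(mv))             # the view object itself is tiny
mv[0] = 42                           # writes go straight through to `data`
print(data[1024])                    # 42
```

Because no 10 MB duplicate is ever built, a program that slices through a memoryview keeps its memory usage flat where the string-slicing version doubled it.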