Creating a Pool

Let's discuss how to create a pool of workers with the concurrent.futures module, and how the same code can be simplified with the executor's map method.

Creating a pool of workers is extremely easy when you're using the concurrent.futures module. Let's take our earlier asyncio download script and rewrite it to use concurrent.futures instead. Here's our version:

import os
import urllib.request

from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import as_completed


def downloader(url):
    """
    Downloads the specified URL and saves it to disk
    """
    req = urllib.request.urlopen(url)
    filename = os.path.basename(url)
    ext = os.path.splitext(url)[1]
    if not ext:
        raise RuntimeError('URL does not contain an extension')

    with open(filename, 'wb') as file_handle:
        while True:
            chunk = req.read(1024)
            if not chunk:
                break
            file_handle.write(chunk)

    msg = 'Finished downloading {filename}'.format(filename=filename)
    return msg


def main(urls):
    """
    Create a thread pool and download specified urls
    """
    with ThreadPoolExecutor(max_workers=5) as executor:
        futures = [executor.submit(downloader, url) for url in urls]
        for future in as_completed(futures):
            print(future.result())


if __name__ == '__main__':
    urls = ["http://www.irs.gov/pub/irs-pdf/f1040.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040a.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040ez.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040es.pdf",
            "http://www.irs.gov/pub/irs-pdf/f1040sb.pdf"]
    main(urls)

First off, we do the imports that we need. Then we create our downloader function. We updated it slightly so that it checks whether the URL has an extension on the end of it. If it doesn't, we raise a RuntimeError.
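
Since this section also covers the executor's map method, here is a minimal sketch of how the main function could be rewritten to use executor.map instead of submit and as_completed. It reuses the downloader function defined above; note that map yields results in the order the URLs were submitted, not in the order the downloads finish.

from concurrent.futures import ThreadPoolExecutor


def main(urls):
    """
    Create a thread pool and download specified urls via map
    """
    with ThreadPoolExecutor(max_workers=5) as executor:
        # map applies downloader to each url and yields the results
        # in submission order, so no bookkeeping of futures is needed
        for result in executor.map(downloader, urls):
            print(result)

One trade-off to be aware of: with map, an exception raised inside downloader is re-raised only when its result is reached in the loop, whereas the as_completed version surfaces each failure as soon as the corresponding future finishes.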