Globally Limiting Concurrency
Learn about the role of queues and events in globally limiting concurrency.
Our web spider application is perfect for applying what we just learned about limiting the concurrency of a set of tasks. In fact, to avoid the situation in which we have thousands of links being crawled at the same time, we can enforce a limit on the concurrency of this process by adding some predictability regarding the number of concurrent downloads.
We can apply this implementation of the limited concurrency pattern to our spiderLinks() function, but by doing that, we would only be limiting the concurrency of tasks spawned from the links found within a given page. If we choose, for example, a concurrency of two, we'll have, at most, two links downloaded in parallel for each page. However, as we can download multiple links at once, each page will then spawn another two downloads, causing the grand total of download operations to grow exponentially anyway. ...
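One way to enforce a single limit across the whole crawl, rather than one limit per page, is to route every download through a queue shared by all pages. The following is a minimal sketch under that assumption and is not necessarily the implementation this lesson goes on to build. The names TaskQueue, pushTask(), and next() are illustrative, and setTimeout stands in for a real download.

```javascript
// Hypothetical sketch: one queue shared by every page being crawled,
// so the limit applies to the total number of downloads in flight.
class TaskQueue {
  constructor (concurrency) {
    this.concurrency = concurrency // maximum number of tasks running at once
    this.running = 0               // tasks currently in flight
    this.queue = []                // tasks waiting for a free slot
  }

  // A task is a function that receives a completion callback.
  pushTask (task) {
    this.queue.push(task)
    this.next()
    return this
  }

  // Start queued tasks until the concurrency limit is reached.
  next () {
    while (this.running < this.concurrency && this.queue.length > 0) {
      const task = this.queue.shift()
      this.running++
      task(() => {
        // A slot has been freed: try to start the next waiting task.
        this.running--
        this.next()
      })
    }
  }
}

// Demo: five simulated downloads pushed through a queue with a limit of two.
const downloadQueue = new TaskQueue(2)
let peakConcurrency = 0

for (const url of ['page-a', 'page-b', 'page-c', 'page-d', 'page-e']) {
  downloadQueue.pushTask(done => {
    peakConcurrency = Math.max(peakConcurrency, downloadQueue.running)
    // Placeholder for a real download of `url`.
    setTimeout(() => {
      console.log(`finished ${url}, peak concurrency so far: ${peakConcurrency}`)
      done()
    }, 10)
  })
}
```

Because every page pushes its downloads into the same queue instance, the number of running tasks never exceeds the configured limit, no matter how many links each page contains. For simplicity, this sketch calls next() synchronously; a production version might defer it (for example, with process.nextTick()) and guard against a task invoking its callback more than once.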