Globally Limiting Concurrency
Explore how to implement global concurrency limits in Node.js asynchronous workflows by creating a TaskQueue class that manages task execution, handles errors, and signals when all tasks are complete. Understand techniques to prevent uncontrolled parallel processing and apply event-driven mechanisms for robust queue management.
Our web spider application is a perfect candidate for applying what we just learned about limiting the concurrency of a set of tasks. To avoid having thousands of links crawled at the same time, we can cap the number of concurrent downloads, which makes the load generated by the process predictable.
We can apply this implementation of the limited concurrency pattern to our spiderLinks() function, but by doing that, we would only be limiting the concurrency of the tasks spawned from the links found within a given page. If we choose, for example, a concurrency of two, we’ll have, at most, two links downloaded in parallel for each page. However, since multiple links are downloaded at once and each downloaded page can then spawn another two downloads of its own, the grand total of download operations still grows exponentially.
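To make the compounding effect concrete, here is a rough back-of-the-envelope sketch. It assumes the worst case, in which every page contains at least as many unvisited links as the per-page limit. The worstCaseDownloadsPerLevel() function is a hypothetical helper written only for this illustration; it is not part of the spider.

```javascript
// Hypothetical helper (not part of the spider): estimates the worst-case
// number of downloads in flight at each nesting level when every
// spiderLinks() call enforces its own limit.
// Assumption: every page has at least `concurrency` unvisited links.
function worstCaseDownloadsPerLevel (concurrency, maxDepth) {
  const levels = []
  let inFlight = 1 // the initial page
  for (let depth = 0; depth <= maxDepth; depth++) {
    levels.push({ depth, inFlight })
    // Each page at this level can start `concurrency` downloads of its own
    inFlight *= concurrency
  }
  return levels
}

for (const { depth, inFlight } of worstCaseDownloadsPerLevel(2, 4)) {
  console.log(`depth ${depth}: up to ${inFlight} downloads at once`)
}
// depth 0: up to 1 downloads at once
// depth 1: up to 2 downloads at once
// depth 2: up to 4 downloads at once
// depth 3: up to 8 downloads at once
// depth 4: up to 16 downloads at once
```

Under these assumptions, a per-page limit of two can already allow 16 simultaneous downloads by the fourth level of nesting, which is why the limit has to be enforced across the whole crawl rather than per page.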
In general, this implementation of the limited concurrency pattern works very well when we have a predetermined set of tasks to execute, or when the set of tasks grows ...