Using External Processes (I)
Learn how to prevent the event loop from blocking using external processes.
Deferring the steps of an algorithm isn’t the only option we have for running CPU-bound tasks. Another pattern for preventing the event loop from blocking is using child processes. We already know that Node.js is at its best when running I/O-intensive applications such as web servers, where its asynchronous architecture lets us optimize resource utilization. So, the best way to keep an application responsive is to avoid running expensive CPU-bound tasks in the context of the main application and to use separate processes instead. This has three main advantages:
- The synchronous task can run at full speed, without the need to interleave the steps of its execution.
- Working with processes in Node.js is simple, probably easier than modifying an algorithm to use setImmediate(), and allows us to easily use multiple processors without the need to scale the main application itself.
- If we really need maximum performance, the external process could be created in lower-level languages, such as good old C or more modern compiled languages like Go or Rust. Always use the best tool for the job!
Node.js has an ample toolbelt of APIs for interacting with external processes. We can find all we need in the child_process module. Moreover, when the external process is just another Node.js program, connecting it to the main application is extremely easy and allows seamless communication with the local application. This magic happens thanks to the child_process.fork() function, which creates a new child Node.js process and also automatically creates a communication channel with it, allowing us to exchange information using an interface very similar to the EventEmitter interface. Let’s see how this works by refactoring our subset sum server again.
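Before the refactoring, the basic fork() round trip can be sketched in isolation. This is a minimal illustration, not the lesson's implementation: the sumInChild() helper, the message shapes, and the child program are hypothetical, and the child is written to a temporary file only to keep the example in a single file.

```javascript
const { fork } = require('child_process')
const { writeFileSync, rmSync } = require('fs')
const { join } = require('path')
const { tmpdir } = require('os')

// Hypothetical child program. Normally it would live in its own module;
// it is written to a temp file here only to keep the sketch self-contained.
const childSource = `
process.on('message', ({ numbers }) => {
  // Stand-in for an expensive synchronous computation
  const total = numbers.reduce((acc, n) => acc + n, 0)
  process.send({ total })
})
`
const childPath = join(tmpdir(), `fork-demo-child-${process.pid}.js`)
writeFileSync(childPath, childSource)
process.on('exit', () => rmSync(childPath, { force: true }))

// Hypothetical helper: runs the computation in a forked Node.js process
function sumInChild (numbers) {
  return new Promise((resolve, reject) => {
    // fork() starts a new Node.js process and opens a message channel to it
    const child = fork(childPath)
    child.once('message', (msg) => {
      // Closing the channel lets the child exit once it has nothing left to do
      child.disconnect()
      resolve(msg.total)
    })
    child.once('error', reject)
    child.send({ numbers })
  })
}

sumInChild([1, 2, 3]).then((total) => console.log('total:', total))
```

The parent listens for 'message' events on the child object, while the child listens on process and replies with process.send(), which is what makes the channel feel like an EventEmitter on both sides.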
Delegating the subset sum task to an external process
The goal of refactoring the SubsetSum task is to create a separate child process responsible for handling the synchronous processing, leaving the event loop of the main server free to handle requests coming from the network. Here’s the recipe we’re going to follow to make this possible:
- We’ll create a new module named processPool.js that’ll allow us to create a pool of running processes. Starting a new process is expensive and requires time, so keeping processes constantly running and ready to handle requests allows us to save time and CPU cycles. Also, the pool will help us limit the number of processes running at the same time to prevent exposing the application to denial-of-service (DoS) attacks.
- Next, we’ll create a module called subsetSumFork.js responsible for abstracting a SubsetSum task running in a child process. Its role will be communicating with the child process and forwarding the results of the task as if they were coming from the current application.
- Finally, we need a worker (our child process), a new Node.js program with the only goal of running the subset sum algorithm and forwarding its results to the parent process.
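To make the pooling idea from the first step concrete, here is a rough sketch of what such a pool might look like. It is not the processPool.js module built in this lesson: the SimplePool class, its acquire(), release(), and close() methods, the double() helper, and the trivial worker are all hypothetical names chosen for illustration.

```javascript
const { fork } = require('child_process')
const { writeFileSync, rmSync } = require('fs')
const { join } = require('path')
const { tmpdir } = require('os')

// Hypothetical worker, written to a temp file to keep the sketch self-contained
const workerPath = join(tmpdir(), `pool-demo-worker-${process.pid}.js`)
writeFileSync(workerPath, "process.on('message', (n) => process.send(n * 2))\n")
process.on('exit', () => rmSync(workerPath, { force: true }))

class SimplePool {
  constructor (file, max) {
    this.file = file
    this.max = max      // upper bound on processes handed out at once
    this.idle = []      // started processes waiting for work
    this.active = 0     // processes currently in use
    this.waiting = []   // resolvers of callers waiting for a free process
  }

  acquire () {
    return new Promise((resolve) => {
      if (this.idle.length > 0) {
        this.active++
        return resolve(this.idle.pop()) // reuse an already running process
      }
      if (this.active >= this.max) {
        return this.waiting.push(resolve) // limit reached: queue the caller
      }
      this.active++
      resolve(fork(this.file)) // below the limit: start a new process
    })
  }

  release (worker) {
    if (this.waiting.length > 0) {
      // Hand the worker straight to the next caller; active count is unchanged
      return this.waiting.shift()(worker)
    }
    this.active--
    this.idle.push(worker)
  }

  close () {
    // Disconnecting idle workers lets them (and the parent) exit
    for (const worker of this.idle) worker.disconnect()
    this.idle = []
  }
}

// Hypothetical helper: borrows a worker, sends one job, returns the reply
async function double (pool, n) {
  const worker = await pool.acquire()
  try {
    return await new Promise((resolve) => {
      worker.once('message', resolve)
      worker.send(n)
    })
  } finally {
    pool.release(worker)
  }
}

const pool = new SimplePool(workerPath, 2)
Promise.all([1, 2, 3, 4].map((n) => double(pool, n))).then((results) => {
  console.log(results) // [ 2, 4, 6, 8 ]
  pool.close()
})
```

With a limit of two, the third and fourth jobs wait in the queue until a worker is released, so no more than two child processes are ever started, no matter how many requests arrive.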
Note: The purpose of a DoS attack is to make a machine unavailable to its users. This is usually achieved by exhausting the capacity of such a machine by exploiting a vulnerability or massively overloading it with requests (DDoS—distributed DoS).
Implementing a process pool
Let’s start by building the processPool.js module piece by piece.