Unordered Parallel Execution
Learn how to parallelize execution by implementing a URL status monitoring application.
We just saw that streams process each data chunk in sequence. Sometimes this becomes a bottleneck because it prevents us from making the most of the concurrency of Node.js. If we have to execute a slow asynchronous operation for every data chunk, it can be advantageous to parallelize the execution and speed up the overall process. Of course, this pattern can only be applied when the chunks are independent of one another, which is often the case for object streams but very rarely for binary streams.
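To make the bottleneck concrete, here is a minimal sketch that uses plain promises instead of streams. The names slowDouble, processInSequence, processInParallel, and compare are hypothetical helpers invented for this illustration, and the 100 ms delay is an arbitrary stand-in for a slow asynchronous operation.

```javascript
// Hypothetical slow operation: resolves with the doubled input after 100 ms.
function slowDouble (n) {
  return new Promise(resolve => setTimeout(() => resolve(n * 2), 100))
}

// One chunk at a time: each operation starts only after the previous one ends.
async function processInSequence (chunks) {
  const results = []
  for (const chunk of chunks) {
    results.push(await slowDouble(chunk))
  }
  return results
}

// All operations are started immediately and run concurrently.
function processInParallel (chunks) {
  return Promise.all(chunks.map(chunk => slowDouble(chunk)))
}

// Measures both strategies on the same input.
async function compare () {
  const chunks = [1, 2, 3]

  let start = Date.now()
  const sequential = await processInSequence(chunks)
  const sequentialMs = Date.now() - start

  start = Date.now()
  const parallel = await processInParallel(chunks)
  const parallelMs = Date.now() - start

  return { sequential, sequentialMs, parallel, parallelMs }
}

compare().then(report => console.log(report))
```

With three chunks, the sequential run should take roughly three times the delay, while the parallel run should take roughly one delay. Note that Promise.all() still returns its results in input order. The stream-based approach in this lesson gives no such guarantee, which is why it only suits independent chunks.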
Caution: Unordered parallel streams can’t be used when the order in which the data is processed is important.
Let’s see how this works.
Implementing an unordered parallel stream
Let’s demonstrate how to implement an unordered parallel stream with an example. We’ll create a module called parallel-stream.js and define a generic Transform stream that executes a given transform function in parallel.
import { Transform } from 'stream'

export class ParallelStream extends Transform {
  constructor (userTransform, opts) { // (1)
    super({ objectMode: true, ...opts })
    this.userTransform = userTransform
    this.running = 0
    this.terminateCb = null
  }

  _transform (chunk, enc, done) { // (2)
    this.running++
    this.userTransform(
      chunk,
      enc,
      this.push.bind(this),
      this._onComplete.bind(this)
    )
    done()
  }

  _flush (done) { // (3)
    if (this.running > 0) {
      this.terminateCb = done
    } else {
      done()
    }
  }

  _onComplete (err) { // (4)
    this.running--
    if (err) {
      return this.emit('error', err)
    }
    if (this.running === 0) {
      this.terminateCb && this.terminateCb()
    }
  }
}
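Before analyzing the class, it may help to see it in action. The following is a minimal usage sketch, not part of the original module. The slowTask() function, the runDemo() helper, and the job objects with their delays are hypothetical and exist only for illustration. The class is repeated, without the export keyword, so that the sketch runs on its own.

```javascript
import { Readable, Transform } from 'stream'

// Same class as in the listing above, repeated so the sketch is self-contained.
class ParallelStream extends Transform {
  constructor (userTransform, opts) {
    super({ objectMode: true, ...opts })
    this.userTransform = userTransform
    this.running = 0
    this.terminateCb = null
  }

  _transform (chunk, enc, done) {
    this.running++
    this.userTransform(chunk, enc, this.push.bind(this), this._onComplete.bind(this))
    done() // Ask for the next chunk without waiting for userTransform to finish
  }

  _flush (done) {
    if (this.running > 0) {
      this.terminateCb = done // Hold back the end of the stream until all tasks complete
    } else {
      done()
    }
  }

  _onComplete (err) {
    this.running--
    if (err) {
      return this.emit('error', err)
    }
    if (this.running === 0) {
      this.terminateCb && this.terminateCb()
    }
  }
}

// Hypothetical slow task: pushes the job id after the job's own delay.
function slowTask (job, enc, push, done) {
  setTimeout(() => {
    push(job.id)
    done()
  }, job.delayMs)
}

// Hypothetical helper: runs three jobs and resolves with the ids in arrival order.
function runDemo () {
  return new Promise((resolve, reject) => {
    const results = []
    Readable.from([
      { id: 'a', delayMs: 150 },
      { id: 'b', delayMs: 50 },
      { id: 'c', delayMs: 100 }
    ])
      .pipe(new ParallelStream(slowTask))
      .on('data', id => results.push(id))
      .on('error', reject)
      .on('end', () => resolve(results))
  })
}

runDemo().then(results => console.log(results))
```

With these delays, the ids arrive in completion order (b, c, a) rather than in input order (a, b, c), which illustrates why the pattern only fits cases where the processing order doesn't matter.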
Let’s analyze this new class step by step:
As we can see, the constructor accepts a userTransform()
...