Exercise: Stream Data Processing
Explore how to implement stream data processing in Node.js by analyzing London crime data using pipelines and Transform streams. Learn to answer questions about crime trends, dangerous areas, and crime frequencies by applying modular asynchronous stream techniques.
Problem statement
On Kaggle, you can find a lot of interesting data sets, such as the London Crime Data. You can download the data in CSV format and use it to answer the following questions:
Did the number of crimes go up or down over the years?
What are the most dangerous areas of London?
What’s the most common crime per area?
What’s the least common crime?
Coding challenge
Write your solution code in the following code widget. We’ve already added the package.json and london_crime_by_lsoa.csv files for your convenience.
Note: We’ve extracted the first 500 records from the london_crime_by_lsoa.csv file to keep the exercise small.
// Write your code here
Solution
Here’s the solution to the above problem. You can go through it by executing the following command.
import { Analyzer } from './analyzer.js'

export class LeastCommon extends Analyzer {
  _transform(chunk, encoding, callback) {
    const currValue = +chunk.value
    // Skip records whose value is missing, zero, or not numeric
    if (currValue && !isNaN(currValue)) {
      // Per-borough map of crime category -> running total.
      // Note: no unary plus here; coercing a Map to a number yields NaN,
      // which would discard the totals accumulated so far.
      const currArea = this.map.get(chunk.borough) || new Map()
      const totalValue = +currArea.get(chunk.major_category) || 0
      currArea.set(chunk.major_category, currValue + totalValue)
      this.map.set(chunk.borough, currArea)
    }
    callback()
  }

  _flush(callback) {
    this.result = ""
    for (const area of this.map.keys()) {
      this.result += `===> Area: ${area}\n`
      const crimes = this.map.get(area)
      // Sort categories by total, ascending, so the least common comes first
      const sortedCrimes = Array.from(crimes.entries())
        .sort((a, b) => a[1] - b[1])
      this.result += sortedCrimes.join(" | ")
      this.result += "\n\n"
    }
    callback()
  }
}
Explanation
In this code, we’re implementing stream data processing on data using pipeline and ...
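The solution imports Analyzer from ./analyzer.js, which is not shown on this page. The following is only a minimal sketch of how such a base class and the surrounding pipeline might fit together. It assumes that Analyzer is an object-mode Transform stream holding a shared Map and a result string, and that each CSV row has already been parsed into an object. The CrimesPerYear class, the analyze helper, and the year field are hypothetical names used for illustration; in-memory records stand in for the parsed CSV file.

```javascript
import { Readable, Transform } from 'node:stream'
import { pipeline } from 'node:stream/promises'

// Hypothetical stand-in for ./analyzer.js (the real file is not shown).
// Assumption: an object-mode Transform that keeps running totals in a Map
// and exposes the final report through `result`.
class Analyzer extends Transform {
  constructor (options = {}) {
    super({ ...options, objectMode: true })
    this.map = new Map()
    this.result = ''
  }
}

// Illustrative subclass: total number of crimes per year.
// Assumption: each record has `year` and `value` fields.
class CrimesPerYear extends Analyzer {
  _transform (chunk, encoding, callback) {
    const value = Number(chunk.value)
    // Ignore records whose value is not numeric
    if (!Number.isNaN(value)) {
      const total = this.map.get(chunk.year) || 0
      this.map.set(chunk.year, total + value)
    }
    callback()
  }

  _flush (callback) {
    // Build the report once all records have been consumed
    const years = Array.from(this.map.entries())
      .sort((a, b) => a[0] - b[0])
    this.result = years
      .map(([year, total]) => `${year}: ${total}`)
      .join('\n')
    callback()
  }
}

// Runs the records through one analyzer and resolves with its report.
async function analyze (records, analyzer) {
  await pipeline(Readable.from(records), analyzer)
  return analyzer.result
}

// Usage with a few in-memory records instead of the CSV file
const sample = [
  { year: '2016', value: '1' },
  { year: '2015', value: '2' },
  { year: '2016', value: '4' }
]
analyze(sample, new CrimesPerYear()).then(console.log)
```

Keeping the accumulation in _transform and the report in _flush means each question gets its own small class, and the same pipeline can be reused by swapping the analyzer passed to it.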