Google’s MapReduce
Learn how to design a system capable of processing gigantic amounts of data with a simple end-programmer interface.
Parallel data processing—the domain of a Jedi programmer!
Parallel computing is known to be difficult, intense, and full of potential minefields in terms of
Google introduced a new programming model (MapReduce) that enabled programmers (of any expertise level) to express their data processing needs as if they were writing sequential code. The MapReduce runtime automatically takes care of the messy details of distributing data and running the jobs in parallel on multiple servers, even under many fault conditions. The widespread use of the MapReduce model proves its applicability to a broad range of data processing problems.
In this lesson, we will study the MapReduce system’s design and programming model.
Design of MapReduce in a nutshell
MapReduce is a
The following illustration depicts how the process works. We will explain the design in the rest of the lesson.
Level up your interview prep. Join Educative to access 80+ hands-on prep courses.