Mapper Input
This lesson explains concepts relating to input data for map tasks.
Mapper Inputs
Input splits
Our example demonstrates a simple scenario in which the entire input for the MR job is contained in a single file. In practice, the input to an MR job usually consists of several gigabytes of data, often spread across many files. That data is divided among multiple map tasks.
Each map task works on a unit of data called the input split.
Hadoop divides the MR job input into fixed-size chunks (the last chunk of a file may be smaller than the rest). Each map task works on one chunk, the input split. A user can tweak the size of the input split. As a ...
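To make the idea of a tunable split size concrete, here is a minimal standalone sketch in Java. It does not use the Hadoop libraries; the class name `SplitSketch` and the method `splitLengths` are hypothetical names chosen for illustration. The formula in `computeSplitSize` follows the rule Hadoop's file-based input formats are generally described as using, max(minSize, min(maxSize, blockSize)), and the 128 MB block size is an assumed default. The sketch also ignores details of the real implementation, such as the small tolerance Hadoop allows on the final split.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration only: not Hadoop code, and the names are hypothetical.
class SplitSketch {

    // Split size is the block size, clamped to the range [minSize, maxSize].
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Returns the length in bytes of each split for a file of fileLength bytes.
    // Every split has splitSize bytes except possibly the last, which holds the remainder.
    static List<Long> splitLengths(long fileLength, long splitSize) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<Long> lengths = new ArrayList<>();
        long remaining = fileLength;
        while (remaining > splitSize) {
            lengths.add(splitSize);
            remaining -= splitSize;
        }
        if (remaining > 0) {
            lengths.add(remaining);
        }
        return lengths;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // Assumed values: 128 MB block size, no effective min/max limits.
        long splitSize = computeSplitSize(128 * mb, 1, Long.MAX_VALUE);
        List<Long> lengths = splitLengths(300 * mb, splitSize);
        // One map task would be launched per split.
        System.out.println(lengths.size() + " splits"); // prints "3 splits"
    }
}
```

Under these assumptions, a 300 MB file yields two 128 MB splits and one 44 MB split, so three map tasks. Lowering the maximum below the block size produces more, smaller splits; raising the minimum above it produces fewer, larger ones. In a real job, these limits are usually set through the configuration properties `mapreduce.input.fileinputformat.split.minsize` and `mapreduce.input.fileinputformat.split.maxsize`.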