Mapper Input

This lesson explains concepts relating to input data for map tasks.

Input splits

Our example demonstrates a simplistic scenario in which the entire input to the MR job is contained in a single file. In reality, the input to an MR job usually consists of several gigabytes of data, and that data is split among multiple map tasks.

Each map task works on a unit of data called the input split.

Hadoop divides the MR job input into fixed-size chunks; only the final chunk of a file may be smaller. Each map task works on one chunk, the input split. A user can tweak the size of the input split. As a ...
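
To make the chunking concrete, here is a minimal sketch in plain Python of how an input of a given length could be carved into splits. It is not Hadoop code. The helper names `compute_split_size` and `plan_splits` are hypothetical, and the 128 MB block size and 300 MB input are assumed values chosen for illustration. The clamping rule mirrors the one used by Hadoop's `FileInputFormat`, where the minimum and maximum correspond to the `mapreduce.input.fileinputformat.split.minsize` and `mapreduce.input.fileinputformat.split.maxsize` settings. The sketch ignores details such as record boundaries and data locality.

```python
MB = 1024 * 1024


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Clamp the block size between the configured minimum and maximum.

    Hypothetical helper: with the default bounds the split size simply
    equals the block size.
    """
    return max(min_size, min(max_size, block_size))


def plan_splits(file_length, split_size):
    """Return (offset, length) pairs that cover file_length bytes.

    Every split has split_size bytes except possibly the last one,
    which holds whatever remains.
    """
    splits = []
    offset = 0
    while offset < file_length:
        length = min(split_size, file_length - offset)
        splits.append((offset, length))
        offset += length
    return splits


# Assumed values for illustration: 128 MB blocks, a 300 MB input file.
split_size = compute_split_size(block_size=128 * MB)
splits = plan_splits(file_length=300 * MB, split_size=split_size)

# One map task would be started per split.
for index, (offset, length) in enumerate(splits):
    print(f"map task {index}: offset={offset // MB} MB, length={length // MB} MB")
```

Under these assumptions the 300 MB input yields three splits of 128 MB, 128 MB, and 44 MB, so three map tasks would run. Passing a smaller `max_size` (for example `64 * MB`) produces more, smaller splits, while a larger `min_size` produces fewer, larger ones. This is one way a user could tweak the split size.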
