Calculating input splits in MapReduce

1.2k Views Asked by TheCodeCache At 11 February 2018 at 18:33

A 260 MB file is stored in HDFS, where the default block size is 64 MB. When I ran a MapReduce job against this file, it created only 4 input splits. How was that calculated? Where did the remaining 4 MB go? Any input is much appreciated.


There is 1 best solution below

Ronak Patel On 11 February 2018 at 20:37

An input split is NOT always the size of a block. An input split is a logical representation of the data, whereas a block is a physical one. Your input splits could have been 63 MB, 67 MB, 65 MB, and 65 MB (or other sizes, depending on where the logical records end). See the examples in the links below.

Hadoop input split size vs block size

Another example - see section 3.3...
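To make the arithmetic concrete, here is a minimal sketch of how the split count could come out as 4 for a 260 MB file. It assumes the job uses a `FileInputFormat`-style splitter with a split size equal to the 64 MB block size and a 10% tolerance on the last split (the `SPLIT_SLOP = 1.1` constant in Hadoop's `FileInputFormat`, as far as I know). The function name `compute_split_sizes` is hypothetical and only mirrors that loop; it is not Hadoop's API, and it ignores record boundaries, which can shift the bytes each mapper actually reads.

```python
# Sketch of FileInputFormat-style split sizing (an assumption, not Hadoop's code).
SPLIT_SLOP = 1.1  # assumed tolerance: the last split may be up to 10% larger
MB = 1024 * 1024


def compute_split_sizes(file_size, split_size, slop=SPLIT_SLOP):
    """Return the list of split sizes (in bytes) for a single file."""
    sizes = []
    remaining = file_size
    # Keep carving full-size splits while the remainder is clearly
    # larger than one split (more than slop * split_size).
    while remaining / split_size > slop:
        sizes.append(split_size)
        remaining -= split_size
    # Whatever is left (possibly slightly more than one split) becomes the last split.
    if remaining > 0:
        sizes.append(remaining)
    return sizes


splits = compute_split_sizes(260 * MB, 64 * MB)
print(len(splits), [s // MB for s in splits])
```

Under these assumptions the remaining 68 MB is only 1.0625 times the split size, which is below the 1.1 threshold, so the extra 4 MB is folded into the last split instead of producing a fifth one.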

