I have a small MapReduce project, and since I am new to this I am running into a lot of difficulties, so I would appreciate some help. In this project, I have a file that contains the nation, year, and weight. For each nation, I want to list every year followed by its weight. This is my data:
USA, 2019; 0.7
USA, 2020; 0.3
USA, 2021; 0.9
Canada, 2019; 0.6
Canada, 2020; 0.3
This is my mapper:
def idf_country(self, key, values):
    nation, year = key[0], key[1]
    weight = values
    yield nation, (year, weight)
This is what I am trying to get:
USA 2019, 0.7; 2020, 0.3; 2021, 0.9
Canada 2019, 0.6; 2020, 0.3
Your mapper reads each line of the file, so the whole record arrives as the value. You need to split that line yourself rather than index into the key.
The reducer then receives the values already grouped by nation, so you can simply join them back together.
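A minimal sketch of that idea in plain Python, assuming each input line looks exactly like the sample data (`nation, year; weight`). The names `mapper`, `reducer`, `run_local`, and `DATA` are only illustrative, and `run_local` is a stand-in for the grouping step that your framework performs for you. If you are using a library such as mrjob (which the `self, key, values` signature suggests, but that is a guess), the two functions would become methods on your job class with an extra `self` argument.

```python
from collections import defaultdict

# Sample input, one record per line, copied from the question.
DATA = [
    "USA, 2019; 0.7",
    "USA, 2020; 0.3",
    "USA, 2021; 0.9",
    "Canada, 2019; 0.6",
    "Canada, 2020; 0.3",
]


def mapper(_, line):
    # The record is in the line itself, so split it instead of using the key.
    left, weight = line.split(";")
    nation, year = left.split(",")
    yield nation.strip(), (year.strip(), weight.strip())


def reducer(nation, values):
    # values holds every (year, weight) pair emitted for this nation.
    pairs = sorted(values)  # sort by year so the output order is stable
    yield nation, "; ".join(f"{year}, {weight}" for year, weight in pairs)


def run_local(lines):
    # Hypothetical helper: imitates the shuffle/group-by-key step locally.
    grouped = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        for key, value in mapper(None, line):
            grouped[key].append(value)

    results = {}
    for key, values in grouped.items():
        for nation, joined in reducer(key, values):
            results[nation] = joined
    return results


if __name__ == "__main__":
    for nation, joined in run_local(DATA).items():
        print(nation, joined)
```

Sorting inside the reducer is a design choice: MapReduce frameworks generally do not guarantee the order of values within a key, so sorting by year keeps the output in the order you showed.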