I am trying to write a MapReduce program that reads from a Hive table and writes the first column value of each record to an HDFS location. It should contain only a map phase, with no reduce phase.
Below is the mapper:
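import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapreduce.Mapper;
// Assumed package; older HCatalog releases use org.apache.hcatalog.data instead.
import org.apache.hive.hcatalog.data.HCatRecord;

public class Map extends Mapper<WritableComparable, HCatRecord, NullWritable, IntWritable> {
    @Override
    protected void map(WritableComparable key, HCatRecord value, Context context)
            throws IOException, InterruptedException {
        // The group table from /etc/group has the columns: name, 'x', id
        // String groupName = (String) value.get(0);
        int id = (Integer) value.get(1);
        // Emit only the ID; use the NullWritable singleton as the key instead of a bare null
        context.write(NullWritable.get(), new IntWritable(id));
    }
}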
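Main class:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
// Assumed package; older HCatalog releases use org.apache.hcatalog.mapreduce instead.
import org.apache.hive.hcatalog.mapreduce.HCatInputFormat;

public class mapper1 {
    public static void main(String[] args) throws Exception {
        mapper1 m = new mapper1();
        m.run(args);
    }

    public void run(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // args[0] is the input table name, args[1] is the HDFS output path
        String inputTableName = args[0];
        // Database that contains the input table
        String dbName = "xademo";
        Job job = new Job(conf, "UseHCat");
        job.setJarByClass(mapper1.class);
        // Read HCatalog records from the Hive table
        HCatInputFormat.setInput(job, dbName, inputTableName);
        job.setInputFormatClass(HCatInputFormat.class);
        job.setMapperClass(Map.class);
        // Map-only job: with zero reducers the mapper output is written directly to HDFS
        job.setNumReduceTasks(0);
        // Mapper emits a NullWritable key and an IntWritable value
        job.setOutputKeyClass(NullWritable.class);
        job.setOutputValueClass(IntWritable.class);
        // Pass the Job itself; a plain Configuration is not a JobConf, so the cast would fail
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        job.waitForCompletion(true);
    }
}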
Is there anything wrong with this code?
It fails with a NumberFormatException for the string "5s". I am not sure where that value is coming from, since it does not appear anywhere in my code. The error is reported at the HCatInputFormat.setInput() call.
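To narrow down what the message means, here is a minimal sketch in plain Java, independent of Hadoop and HCatalog. It only shows that a value with a unit suffix such as "5s" cannot be parsed as a plain integer. The class name NumberFormatDemo and the helper parsesAsInt are hypothetical names used for illustration. I am only assuming that something (perhaps a configuration property) hands such a string to an integer parser; I have not confirmed where the value comes from.

```java
// Hypothetical demo class, not part of Hadoop or HCatalog.
class NumberFormatDemo {

    // Returns true if the text can be parsed as a plain base-10 int.
    static boolean parsesAsInt(String text) {
        try {
            Integer.parseInt(text);
            return true;
        } catch (NumberFormatException e) {
            // Thrown for any non-numeric character, including a unit suffix.
            return false;
        }
    }

    public static void main(String[] args) {
        // "5" is a plain number; "5s" carries a unit suffix.
        System.out.println("5  -> " + parsesAsInt("5"));
        System.out.println("5s -> " + parsesAsInt("5s"));
    }
}
```

If that assumption holds, the string would have to come from outside these two classes, because neither of them contains "5s".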