I am currently using the wordcount application in Hadoop as a benchmark. I find that CPU usage stays fairly constant at around 80-90%. I would like a workload with fluctuating CPU usage. Is there any Hadoop application that can give me this? Thanks a lot.
I don't think there's a way to throttle Hadoop or pin it to a target CPU range; it will use whatever CPU is available to it. When I'm running a lot of jobs, I'm constantly in the 90%+ range.
One way you can control CPU usage is to change the maximum number of mappers/reducers each tasktracker can run simultaneously. This is done through the mapred.tasktracker.{map|reduce}.tasks.maximum setting in $HADOOP_HOME/conf/mapred-site.xml. With fewer mapper/reducer slots available, that tasktracker will use less CPU.
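For example, here's a minimal sketch of what that could look like in mapred-site.xml, assuming Hadoop 1.x (MRv1); the slot counts below are illustrative, and each tasktracker needs a restart to pick them up:

```xml
<!-- $HADOOP_HOME/conf/mapred-site.xml (Hadoop 1.x / MRv1) -->
<configuration>
  <!-- Cap the number of map and reduce tasks this tasktracker
       will run at the same time; tune the values to your hardware. -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>2</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>
```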
Another way is to set the configuration values mapred.{map|reduce}.tasks when setting up the job. This asks that job to use that many mappers/reducers, and the number will be split across the available tasktrackers: if you have 4 nodes and want each node to run 1 mapper, you'd set mapred.map.tasks to 4. It's also possible that if one node can run 4 mappers it will run all 4; I don't know exactly how Hadoop splits out the tasks, but forcing a number per job is an option (a rough sketch follows below). I hope that helps get you to where you're going. I still don't quite understand what you are looking for. :)
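Here's a rough sketch of that per-job approach, assuming the classic Hadoop 1.x mapred API; the class name, job name, and task counts are mine, purely for illustration:

```java
import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

public class ThrottledWordCount {

    // Standard wordcount mapper: emit (word, 1) for every token.
    public static class Map extends MapReduceBase
            implements Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(LongWritable key, Text value,
                        OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            StringTokenizer tokens = new StringTokenizer(value.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                output.collect(word, ONE);
            }
        }
    }

    // Standard wordcount reducer: sum the counts for each word.
    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(ThrottledWordCount.class);
        conf.setJobName("wordcount-throttled");

        // Per-job task counts: mapred.map.tasks is only a hint (the input
        // splits ultimately decide), while mapred.reduce.tasks is honored.
        conf.setNumMapTasks(4);    // equivalent to setting mapred.map.tasks=4
        conf.setNumReduceTasks(1); // equivalent to setting mapred.reduce.tasks=1

        conf.setMapperClass(Map.class);
        conf.setReducerClass(Reduce.class);
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}
```

If the job is written with ToolRunner/GenericOptionsParser instead, the same hints can be passed on the command line as -D mapred.map.tasks=4 -D mapred.reduce.tasks=1 without touching the code.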