Hadoop Hack: TaskTracker priority


Changing the ‘niceness’ of only one type of daemon

While playing with our cluster during a recent Terasort benchmark, I realised just how dumb it was to leave everything running at the default niceness level (0). SSH sessions were timing out, reporting was going haywire, all sorts of fun.

I know the Cloudera CDH3 distribution allows you to set a global HADOOP_NICENESS level, but I was hesitant about dropping the priority of essential daemons like the namenode, datanode and jobtracker. However, I wouldn’t mind if a tasktracker loses priority to a Ganglia ping. I want my pretty charts!

After poking around a bit I realised that the “/etc/hadoop/conf/hadoop-env.sh” script is included at a perfect spot within the start-up scripts for all daemons: by the time it runs, the name of the daemon being started is already sitting in the $command variable. This allows you to detect what is starting and mess with its parameters. Tada, the following addition to your own hadoop-env.sh will result in the tasktrackers being prioritised slightly lower than everything else (a higher niceness value means a lower scheduling priority).

# This hack drops the priority of the tasktracker; the mapred tasks
# it spawns inherit the same niceness
if [ "$command" = "tasktracker" ]; then
  export HADOOP_NICENESS=5
fi
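
If you end up wanting different levels for more than one daemon, the same trick extends naturally to a case statement. The following is only a sketch: the niceness_for helper is a hypothetical name of my own, the values are illustrative rather than CDH3 defaults, and it assumes (as above) that $command holds the daemon name when hadoop-env.sh is sourced.

```shell
# Hypothetical helper: map a daemon name to a niceness value.
# Names and values here are illustrative assumptions, not CDH3 defaults.
niceness_for() {
  case "$1" in
    tasktracker) echo 5 ;;  # let task JVMs yield to everything else
    *)           echo 0 ;;  # namenode, datanode, jobtracker keep the default
  esac
}

# Assumes $command is set by the start-up script before this file is sourced;
# falls back to an empty name (niceness 0) when it is not.
export HADOOP_NICENESS="$(niceness_for "${command:-}")"
```

After restarting a tasktracker, something like ps -eo pid,ni,args | grep -i '[t]asktracker' should show the new value in the NI column, assuming your ps supports those output fields.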