Passing JVM Options to the Child JVMs of Hadoop

This was done with Hadoop 1.0.3. I wanted to turn on garbage collection logging in the child JVMs that the Hadoop TaskTracker creates to execute map/reduce tasks.

There is a way to do this without modifying the source code. (The alternative would be to modify src/mapred/org/apache/hadoop/mapred/TaskRunner.java, the file that manages the parsing of arguments to the JVM.)

To do so, edit the configuration file

hadoop-1.0.3/conf/mapred-site.xml

Find the XML block for mapred.child.java.opts and set its value:

<property>
<name>mapred.child.java.opts</name><value>-J-Xmx1536m -J-verbose:gc</value>
</property>

Here the -J prefix is for the Habanero Java (HJ) runtime, which we used in place of the standard Java runtime in our research project. With a standard JVM you can use -Xmx and -verbose:gc directly. The content of mapred.child.java.opts from mapred-site.xml is read by the TaskRunner class, parsed, and finally passed to the child JVM. This way, you can modify JVM options without recompiling the Hadoop source code. An example is also given in the TaskRunner class.
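For a standard JVM, the same block would look roughly like the sketch below. It only takes the two flags from my setup and drops the HJ-specific -J prefix. The 1536m heap size is just the value I used, not a recommendation, so adjust it for your cluster.

```xml
<!-- Sketch for a standard JVM (no HJ runtime): same options without the -J prefix. -->
<!-- The heap size is the value from my setup; pick one that fits your nodes. -->
<property>
<name>mapred.child.java.opts</name><value>-Xmx1536m -verbose:gc</value>
</property>
```

Note that there is no whitespace between the last option and the closing value tag.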

One caveat: beware of spaces when specifying mapred.child.java.opts.

I had a lot of problems when I left a trailing space after the last option, like this (the "(space)" marks where the stray space was):

<value>-J-Xmx1536m -J-verbose:gc(space)</value>

This could be an HJ runtime issue or just a parsing issue in TaskRunner. I am not sure which.

Cheers
