Research into the implementation of Child JVM

This is a post detailing the work I did to understand the design of the child JVM and to sketch possible plans for implementing a “compute server”. What is a “compute server”? It is a single server that accepts Tasks from the TaskTracker and executes them. There is one compute server per node, and it can execute multiple tasks on that node in parallel. Its maximum capacity would be directly related to the number of cores and the amount of memory available.
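The compute-server idea can be sketched with a plain thread pool whose size is tied to the hardware. This is a minimal sketch, not Hadoop code: the names ComputeServer and submitTask are hypothetical, and in a real implementation the tasks would arrive over RPC from the TaskTracker rather than as local Runnables.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch of one "compute server" per node: it accepts
 * tasks and runs them in parallel, with capacity bounded by the
 * number of available cores.
 */
public class ComputeServer {
    private final ExecutorService pool;
    private final AtomicInteger completed = new AtomicInteger();

    public ComputeServer() {
        // Capacity tied directly to the hardware, as described above.
        int cores = Runtime.getRuntime().availableProcessors();
        this.pool = Executors.newFixedThreadPool(cores);
    }

    /** Accept a task; in real Hadoop this would arrive via RPC from the TaskTracker. */
    public Future<?> submitTask(Runnable task) {
        return pool.submit(() -> {
            task.run();
            completed.incrementAndGet();
        });
    }

    public int completedCount() { return completed.get(); }

    public void shutdown() throws InterruptedException {
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        ComputeServer server = new ComputeServer();
        for (int i = 0; i < 8; i++) {
            server.submitTask(() -> { /* map/reduce work would run here */ });
        }
        server.shutdown();
        System.out.println("completed=" + server.completedCount());
    }
}
```

Memory-based admission control (the other half of the capacity limit) would sit in front of submitTask, rejecting or queueing tasks when the node is short on memory.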

The code snippet for the RPC setup:

final TaskUmbilicalProtocol umbilical =
  taskOwner.doAs(new PrivilegedExceptionAction<TaskUmbilicalProtocol>() {
    public TaskUmbilicalProtocol run() throws Exception {
      return (TaskUmbilicalProtocol)RPC.getProxy(TaskUmbilicalProtocol.class,
          TaskUmbilicalProtocol.versionID, address, defaultConf);
    }
});
The main loop of the main function in Child.java. It requests a Task through the umbilical interface (RPC) calls. If it gets a task, it performs a series of setup steps on logs and other state to get ready to execute the task.

while (true) {
  taskid = null;
  currentJobSegmented = true;

  JvmTask myTask = umbilical.getTask(context);
  if (myTask.shouldDie()) {
    break;
  } else {
    if (myTask.getTask() == null) {
      taskid = null;
      currentJobSegmented = true;

      if (++idleLoopCount >= SLEEP_LONGER_COUNT) {
        //we sleep for a bigger interval when we don't receive
        //tasks for a while
        Thread.sleep(1500);
      } else {
        Thread.sleep(500);
      }
      continue;
    }
  }
  idleLoopCount = 0;

  task = myTask.getTask();
  taskid = task.getTaskID();
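The idle-backoff behaviour in the loop (sleep briefly between empty polls, then sleep longer once SLEEP_LONGER_COUNT consecutive polls return no task) can be isolated and tested on its own. This is a hypothetical sketch, not Hadoop code; sleep durations are recorded instead of actually sleeping so the logic is easy to verify.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of the idle-backoff polling pattern from the
 * Child main loop: short sleeps at first, longer ones once several
 * consecutive polls come back empty.
 */
public class IdleBackoff {
    static final int SLEEP_LONGER_COUNT = 5; // constant name as in Child.java
    static final long SHORT_SLEEP_MS = 500;
    static final long LONG_SLEEP_MS = 1500;

    private int idleLoopCount = 0;
    final List<Long> sleeps = new ArrayList<>(); // recorded instead of slept

    /** Called once per poll; taskReceived=false means the poll was empty. */
    void onPoll(boolean taskReceived) {
        if (taskReceived) {
            idleLoopCount = 0; // reset backoff as soon as work arrives
            return;
        }
        if (++idleLoopCount >= SLEEP_LONGER_COUNT) {
            sleeps.add(LONG_SLEEP_MS); // back off once idle for a while
        } else {
            sleeps.add(SHORT_SLEEP_MS);
        }
    }

    public static void main(String[] args) {
        IdleBackoff b = new IdleBackoff();
        for (int i = 0; i < 6; i++) b.onPoll(false); // six empty polls
        System.out.println(b.sleeps); // [500, 500, 500, 500, 1500, 1500]
    }
}
```

The point of the pattern is to keep the child responsive when tasks are flowing while avoiding a tight RPC polling loop against the TaskTracker when the node is idle.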

Once the setup is done, the following block of code is called to get the task started:

// Create a final reference to the task for the doAs block
final Task taskFinal = task;
childUGI.doAs(new PrivilegedExceptionAction<Object>() {
  public Object run() throws Exception {
    try {
      // use job-specified working directory
      FileSystem.get(job).setWorkingDirectory(job.getWorkingDirectory());
      taskFinal.run(job, umbilical);        // run the task
    } finally {
      TaskLog.syncLogs(logLocation, taskid, isCleanup, logIsSegmented(job));
      TaskLogsTruncater trunc = new TaskLogsTruncater(defaultConf);
      trunc.truncateLogs(new JVMInfo(
          TaskLog.getAttemptDir(taskFinal.getTaskID(),
              taskFinal.isTaskCleanupTask()), Arrays.asList(taskFinal)));
    }
    return null;
  }
});

The implementation of the umbilical protocol on the TaskTracker side is shown below. There is a lot of useful log information in it.


/**
 * Called upon startup by the child process, to fetch Task data.
 */
public synchronized JvmTask getTask(JvmContext context)
    throws IOException {
  JVMId jvmId = context.jvmId;
  LOG.debug("JVM with ID : " + jvmId + " asked for a task");
  // save pid of task JVM sent by child
  jvmManager.setPidToJvm(jvmId, context.pid);
  if (!jvmManager.isJvmKnown(jvmId)) {
    LOG.info("Killing unknown JVM " + jvmId);
    return new JvmTask(null, true);
  }
  RunningJob rjob = runningJobs.get(jvmId.getJobId());
  if (rjob == null) { //kill the JVM since the job is dead
    LOG.info("Killing JVM " + jvmId + " since job " + jvmId.getJobId() +
             " is dead");
    try {
      jvmManager.killJvm(jvmId);
    } catch (InterruptedException e) {
      LOG.warn("Failed to kill " + jvmId, e);
    }
    return new JvmTask(null, true);
  }
  TaskInProgress tip = jvmManager.getTaskForJvm(jvmId);
  if (tip == null) {
    return new JvmTask(null, false);
  }
  if (tasks.get(tip.getTask().getTaskID()) != null) { //is task still present
    LOG.info("JVM with ID: " + jvmId + " given task: " +
        tip.getTask().getTaskID());
    return new JvmTask(tip.getTask(), false);
  } else {
    LOG.info("Killing JVM with ID: " + jvmId + " since scheduled task: " +
        tip.getTask().getTaskID() + " is " + tip.taskStatus.getRunState());
    return new JvmTask(null, true);
  }
}
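The getTask logic above boils down to a small decision table encoded in the (task, shouldDie) pair of the returned JvmTask. Here is a standalone sketch of that table; JvmTask here is a simplified stand-in for the Hadoop class, and the names JvmTaskDecision and decide are hypothetical:

```java
/**
 * Hypothetical sketch mirroring the branch structure of
 * TaskTracker.getTask(): each branch resolves to a (task, shouldDie)
 * pair that tells the child JVM what to do next.
 */
public class JvmTaskDecision {
    /** Simplified stand-in for Hadoop's JvmTask. */
    static class JvmTask {
        final String task;       // null means "no task assigned"
        final boolean shouldDie; // true tells the child JVM to exit
        JvmTask(String task, boolean shouldDie) {
            this.task = task;
            this.shouldDie = shouldDie;
        }
    }

    static JvmTask decide(boolean jvmKnown, boolean jobAlive,
                          String scheduledTask, boolean taskStillPresent) {
        if (!jvmKnown)             return new JvmTask(null, true);   // unknown JVM: kill it
        if (!jobAlive)             return new JvmTask(null, true);   // job is dead: kill the JVM
        if (scheduledTask == null) return new JvmTask(null, false);  // nothing yet: child polls again
        if (taskStillPresent)      return new JvmTask(scheduledTask, false); // hand out the task
        return new JvmTask(null, true); // task removed in the meantime: kill the JVM
    }

    public static void main(String[] args) {
        JvmTask t = decide(true, true, "attempt_0001_m_000000_0", true);
        System.out.println(t.task + " die=" + t.shouldDie);
    }
}
```

Seen this way, the protocol gives the child JVM exactly three behaviours: run the task it was handed, sleep and poll again, or shut itself down.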



This entry was posted in Hadoop Research, HJ-Hadoop Improvements, Uncategorized.
