I am using a Hadoop-GPU system created by Koichi Shirahata, Hitoshi Sato, and Satoshi Matsuoka, which can be found here. It is built on Hadoop-0.20.1. Another user, millecker, ported that system to Hadoop-1.0.3 (can be found here).

I want to do something similar to what millecker did and port the work of Shirahata K. et al. to Hadoop-2.6.0 instead of 1.0.3.

What are the required steps to migrate everything from Hadoop-0.20.1 to Hadoop-2.6.0 so that I can apply the Hadoop+GPU mix on Hadoop-2.6.0?

EDIT:

So I re-read the paper that Shirahata K. et al. wrote, and according to their proposed method, they changed the following classes:

  • MapTaskStatus.java
  • MapTask.java
  • Task.java
  • TaskGraphServlet.java
  • TaskInProgress.java
  • TaskReport.java
  • TaskStatus.java
  • TaskTracker.java
  • TaskTrackerStatus.java
  • JobConf.java
  • JobInProgress.java
  • JobQueueTaskScheduler.java
  • JobTracker.java

With that information, do I just need to edit the respective classes in the source code for Hadoop-2.6.0?

EDIT:

In the end, the authors explained that they edited the following files.

  • hadoop-gpu-0.20.1/src/docs/cn/src/documentation/sitemap.xmap
  • hadoop-gpu-0.20.1/src/docs/cn/uming.conf
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/JobConf.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/JobInProgress.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/JobQueueTaskScheduler.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/JobTracker.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/MapTask.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/MapTaskStatus.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/TaskGraphServlet.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/TaskInProgress.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/Task.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/TaskReport.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/TaskStatus.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/TaskTracker.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/TaskTrackerStatus.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/pipes/Application.java
  • hadoop-gpu-0.20.1/src/mapred/org/apache/hadoop/mapred/pipes/Submitter.java
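
A listing like the one above can be regenerated by recursively comparing a stock Hadoop-0.20.1 source tree with the GPU-enabled one. The following is only a minimal sketch in Python: the function name `changed_files` and the two directory arguments are placeholders introduced here, not anything shipped with Hadoop or with the Hadoop-GPU project.

```python
import filecmp
import os


def changed_files(stock_root, modified_root):
    """Return sorted relative paths of files that exist in both trees
    but whose contents differ (hypothetical helper, not part of Hadoop)."""
    changed = []

    def walk(rel):
        left = os.path.join(stock_root, rel)
        right = os.path.join(modified_root, rel)
        cmp = filecmp.dircmp(left, right)
        # dircmp itself compares shallowly (size and mtime), so the common
        # files are re-checked byte by byte here.
        _same, diff, _errors = filecmp.cmpfiles(
            left, right, cmp.common_files, shallow=False
        )
        changed.extend(os.path.join(rel, name) for name in diff)
        # Descend only into directories present on both sides.
        for sub in cmp.common_dirs:
            walk(os.path.join(rel, sub))

    walk("")
    return sorted(changed)
```

Calling `changed_files("hadoop-0.20.1", "hadoop-gpu-0.20.1")` (both paths assumed to be locally unpacked source trees) would return the relative paths of the modified files. Files that exist in only one tree are deliberately ignored in this sketch.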

I will test by amending those files and hopefully the migration will work.

1 Answer

> With that information, do I just need to edit the respective classes in the source code for Hadoop-2.6.0?

Not exactly. They made changes to version 0.20.1, which was released in September 2009. You want to port those changes forward to version 2.6.0, which was released in November 2014. Porting changes across 5 years of intense development is hard. While I haven't looked at your case specifically, at the very least those files have received many minor improvements, and realistically most have been completely redesigned.

To successfully reintegrate the changes, you'll need to know how those classes fit into 0.20.1, how the changes interacted with them, and how all of their equivalents work in 2.6.0. Your changes will probably spread across far more of the codebase than you'd like. For instance, alongside the old mapred API there is now the new mapreduce API, which is even more distant from the 0.20.1 code. YARN is now the resource manager of choice, and I doubt it handles GPUs very well in its current state.
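
As a first step toward finding those equivalents, it can help to check which of the modified file names still exist anywhere in the 2.6.0 source tree, and where they now live. The following is only a rough sketch: the helper names `locate` and `report` are made up for illustration, and the source root is whatever directory you unpacked the 2.6.0 sources into. A matching file name says nothing about whether the class still plays the same role.

```python
import os
import sys

# Basenames of the .java files the question lists as modified in hadoop-gpu-0.20.1.
MODIFIED = [
    "JobConf.java", "JobInProgress.java", "JobQueueTaskScheduler.java",
    "JobTracker.java", "MapTask.java", "MapTaskStatus.java",
    "TaskGraphServlet.java", "TaskInProgress.java", "Task.java",
    "TaskReport.java", "TaskStatus.java", "TaskTracker.java",
    "TaskTrackerStatus.java", "Application.java", "Submitter.java",
]


def locate(source_root, basenames):
    """Map each basename to the sorted relative paths where a file with
    that name appears under source_root (hypothetical helper)."""
    found = {name: [] for name in basenames}
    for dirpath, _dirs, files in os.walk(source_root):
        for name in files:
            if name in found:
                full = os.path.join(dirpath, name)
                found[name].append(os.path.relpath(full, source_root))
    for paths in found.values():
        paths.sort()
    return found


def report(found):
    """Render one line per basename: its locations, or a note that none exist."""
    lines = []
    for name in sorted(found):
        paths = found[name]
        where = ", ".join(paths) if paths else "no file with this name"
        lines.append(f"{name}: {where}")
    return "\n".join(lines)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is an assumption): python locate.py /path/to/hadoop-2.6.0-src
    print(report(locate(sys.argv[1], MODIFIED)))
```

Names that appear more than once (for example under both a `mapred` and a `mapreduce` package) or not at all are the places where a straight patch will not apply and the design has to be understood first.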

> I will test by amending those files and hopefully the migration will work.

Long story short, the hard part won't be applying patches, but understanding 5 years of changes. I strongly suggest you look into existing projects rather than doing this on your own.

Gordon Gustafson
  • Yes, I understand what you meant after I tested my migration method last night. Before your post, I opted for your last approach, using an existing project. I don't want to deal with the usage of YARN, so I will use the 1.0.3 implementation already created by millecker and work my way up from there. – cocobutters Feb 24 '15 at 18:01