A Hadoop MapReduce Solution to Dijkstra’s Algorithm

I’m willing to bet that someone who reads my blog thinks I forgot about the second part of my original blog post (Installing Hadoop, located here). If you thought that I forgot, think again! As a recap, my last blog left off with a fully functional installation of Hadoop and Eclipse support. The goal of this blog is to implement and run Dijkstra’s shortest path algorithm on that local Hadoop installation.

Dijkstra’s algorithm is a solution to the shortest path problem in graph theory: given a set of vertices (or nodes) and edges (or arcs), find the path from a starting node to an ending node that crosses the fewest edges. My goal is to solve the problem presented below from node 1 to node 5. Intuition says that the shortest path length is 2, using node 2 as a hop. Below is a picture of the sample graph that I will have my Hadoop installation solve.

[Figure: Dijkstra graph example input]
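
For reference (in case the picture doesn’t render), the graph encoded by the input file later in this post has the undirected edges 1-2, 1-3, 2-4, 2-5, and 4-5, so the shortest route from node 1 to node 5 is 1 > 2 > 5.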

First, we need to start the Hadoop daemons and open Eclipse as described in the previous blog. Next, we will create a new “other” project in Eclipse: select “MapReduce Project” and enter a suitable name (in my case, I entered Dijkstra). I then clicked Finish and clicked Yes at the prompt asking to switch to the MapReduce perspective. Next, verify that the following libraries are in the project’s build path; if they aren’t present, add them to the build path. You will find them in the Hadoop installation folder and its lib sub-folder.

hadoop-0.20.2-core.jar
commons-cli-1.2.jar
commons-logging-1.0.4.jar

Next, right click on the src folder, go to New > Class, enter a class name (I used Dijkstra) and a package name (I used graph), then click Finish.

Paste the following code into the Java file, replacing its entire contents.

package graph;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Iterator;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.*;
import org.apache.hadoop.mapreduce.lib.output.*;
import org.apache.hadoop.util.*;

public class Dijkstra extends Configured implements Tool {

	public static String OUT = "outfile";
	public static String IN = "inputlarger";

	public static class TheMapper extends Mapper<LongWritable, Text, LongWritable, Text> {

		public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
			//From slide 20 of Graph Algorithms with MapReduce (by Jimmy Lin, Univ @ Maryland)
			//Key is node n
			//Value is D, Points-To
			//For every point (or key), look at everything it points to.
			//Emit or write to the points to variable with the current distance + 1
			Text word = new Text();
			String line = value.toString();//looks like 1 0 2:3:
			String[] sp = line.split(" ");//splits on space
			int distanceadd = Integer.parseInt(sp[1]) + 1;
			String[] PointsTo = sp[2].split(":");
			for(int i=0; i<PointsTo.length; i++){
				word.set("VALUE "+distanceadd);//tells me to look at distance value
				context.write(new LongWritable(Integer.parseInt(PointsTo[i])), word);
				word.clear();
			}
			//pass in current node's distance (if it is the lowest distance)
			word.set("VALUE "+sp[1]);
			context.write( new LongWritable( Integer.parseInt( sp[0] ) ), word );
			word.clear();

			word.set("NODES "+sp[2]);//tells me to append on the final tally
			context.write( new LongWritable( Integer.parseInt( sp[0] ) ), word );
			word.clear();

		}
	}

	public static class TheReducer extends Reducer<LongWritable, Text, LongWritable, Text> {
		public void reduce(LongWritable key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
			//From slide 20 of Graph Algorithms with MapReduce (by Jimmy Lin, Univ @ Maryland)
			//The key is the current point
			//The values are all the possible distances to this point
			//we simply emit the point and the minimum distance value

			String nodes = "UNMODED";
			Text word = new Text();
			int lowest = 10009;//start above the largest possible distance (10000 stands in for infinity)

			for (Text val : values) {//looks like NODES/VALUES 1 0 2:3:, we need to use the first as a key
				String[] sp = val.toString().split(" ");//splits on space
				//look at first value
				if(sp[0].equalsIgnoreCase("NODES")){
					nodes = sp[1];
				}else if(sp[0].equalsIgnoreCase("VALUE")){
					int distance = Integer.parseInt(sp[1]);
					lowest = Math.min(distance, lowest);
				}
			}
			word.set(lowest+" "+nodes);
			context.write(key, word);
			word.clear();
		}
	}

	//Almost exactly from http://hadoop.apache.org/mapreduce/docs/current/mapred_tutorial.html
	public int run(String[] args) throws Exception {
		//http://code.google.com/p/joycrawler/source/browse/NetflixChallenge/src/org/niubility/learning/knn/KNNDriver.java?r=242
		getConf().set("mapred.textoutputformat.separator", " ");//make the key -> value space separated (for iterations)

		//set in and out to args.
		IN = args[0];
		OUT = args[1];

		String infile = IN;
		String outputfile = OUT + System.nanoTime();

		boolean isdone = false;
		boolean success = false;

		HashMap <Integer, Integer> _map = new HashMap<Integer, Integer>();
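		//_map holds each node's distances from the previous iteration; the loop below compares against it to detect convergence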

		while(isdone == false){

			Job job = new Job(getConf());
			job.setJarByClass(Dijkstra.class);
			job.setJobName("Dijkstra");
			job.setOutputKeyClass(LongWritable.class);
			job.setOutputValueClass(Text.class);
			job.setMapperClass(TheMapper.class);
			job.setReducerClass(TheReducer.class);
			job.setInputFormatClass(TextInputFormat.class);
			job.setOutputFormatClass(TextOutputFormat.class);

			FileInputFormat.addInputPath(job, new Path(infile));
			FileOutputFormat.setOutputPath(job, new Path(outputfile));

			success = job.waitForCompletion(true);

			//remove the input file
			//http://eclipse.sys-con.com/node/1287801/mobile
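			//the reference comparison below is intentional: on the first pass infile still points at the same String as IN, so the original input is never deleted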
			if(infile != IN){
				String indir = infile.replace("part-r-00000", "");
				Path ddir = new Path(indir);
				FileSystem dfs = FileSystem.get(getConf());
				dfs.delete(ddir, true);
			}

			infile = outputfile+"/part-r-00000";
			outputfile = OUT + System.nanoTime();

			//do we need to re-run the job with the new input file??
			//http://www.hadoop-blog.com/2010/11/how-to-read-file-from-hdfs-in-hadoop.html
			isdone = true;//set the job to NOT run again!
			Path ofile = new Path(infile);
			FileSystem fs = FileSystem.get(new Configuration());
			BufferedReader br=new BufferedReader(new InputStreamReader(fs.open(ofile)));

			HashMap<Integer, Integer> imap = new HashMap<Integer, Integer>();
			String line=br.readLine();
			while (line != null){
				//each line looks like 0 1 2:3:
				//we need to verify node -> distance doesn't change
				String[] sp = line.split(" ");
				int node = Integer.parseInt(sp[0]);
				int distance = Integer.parseInt(sp[1]);
				imap.put(node, distance);
				line=br.readLine();
			}
			if(_map.isEmpty()){
				//first iteration... must do a second iteration regardless!
				isdone = false;
			}else{
				//http://www.java-examples.com/iterate-through-values-java-hashmap-example
				//http://www.javabeat.net/articles/33-generics-in-java-50-1.html
				Iterator<Integer> itr = imap.keySet().iterator();
				while(itr.hasNext()){
					int key = itr.next();
					int val = imap.get(key);
					if(_map.get(key) != val){
						//values aren't the same... we aren't at convergence yet
						isdone = false;
					}
				}
			}
			if(isdone == false){
				_map.putAll(imap);//copy imap to _map for the next iteration (if required)
			}
		}

		return success ? 0 : 1;
	}

	public static void main(String[] args) throws Exception {
		System.exit(ToolRunner.run(new Dijkstra(), args));
	}
}

In addition, right click on the main project and click New > File. Name the file input and place the input data below into it:

1 0 2:3:
2 10000 1:4:5:
3 10000 1:
4 10000 2:5:
5 10000 2:4:
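
Each line has the form <node id> <current distance> <adjacency list>, where the adjacency list is colon separated (and colon terminated). Node 1 is the starting node, so it begins with a distance of 0; every other node starts at 10000, which stands in for infinity.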

Note (for the paragraph below): I didn’t have success with the “Run on Hadoop” option due to the age of the Eclipse plugin (it hadn’t been updated to work with current Eclipse releases as of this writing). At one time, you could easily run the application on a Hadoop cluster directly from Eclipse.

Next, once the code is in place, we can test it by right clicking Dijkstra.java, going to Run As, and clicking Run Configurations. The program requires arguments for the input and output files that Hadoop will use, so on the Arguments tab, enter “input output” (without quotes) in the Program arguments box. Click Apply, then Run. The application will run in local development mode. Once it’s done, you should see something like the following:

11/06/07 19:03:33 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
11/06/07 19:03:33 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
11/06/07 19:03:33 INFO input.FileInputFormat: Total input paths to process : 1
11/06/07 19:03:34 INFO mapred.JobClient: Running job: job_local_0001
11/06/07 19:03:34 INFO input.FileInputFormat: Total input paths to process : 1
11/06/07 19:03:34 INFO mapred.MapTask: io.sort.mb = 100
11/06/07 19:03:34 INFO mapred.MapTask: data buffer = 79691776/99614720
11/06/07 19:03:34 INFO mapred.MapTask: record buffer = 262144/327680
11/06/07 19:03:34 INFO mapred.MapTask: Starting flush of map output
11/06/07 19:03:34 INFO mapred.MapTask: Finished spill 0
11/06/07 19:03:34 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
11/06/07 19:03:34 INFO mapred.LocalJobRunner:
11/06/07 19:03:34 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000000_0' done.
11/06/07 19:03:34 INFO mapred.LocalJobRunner:
11/06/07 19:03:34 INFO mapred.Merger: Merging 1 sorted segments
11/06/07 19:03:34 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 425 bytes
11/06/07 19:03:34 INFO mapred.LocalJobRunner:
11/06/07 19:03:35 INFO mapred.TaskRunner: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
11/06/07 19:03:35 INFO mapred.LocalJobRunner:
11/06/07 19:03:35 INFO mapred.TaskRunner: Task attempt_local_0001_r_000000_0 is allowed to commit now
11/06/07 19:03:35 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to output2772721873153
11/06/07 19:03:35 INFO mapred.LocalJobRunner: reduce > reduce
11/06/07 19:03:35 INFO mapred.TaskRunner: Task 'attempt_local_0001_r_000000_0' done.
11/06/07 19:03:35 INFO mapred.JobClient:  map 100% reduce 100%
11/06/07 19:03:35 INFO mapred.JobClient: Job complete: job_local_0001
11/06/07 19:03:35 INFO mapred.JobClient: Counters: 12
11/06/07 19:03:35 INFO mapred.JobClient:   FileSystemCounters
11/06/07 19:03:35 INFO mapred.JobClient:     FILE_BYTES_READ=27515
11/06/07 19:03:35 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=55435
11/06/07 19:03:35 INFO mapred.JobClient:   Map-Reduce Framework
11/06/07 19:03:35 INFO mapred.JobClient:     Reduce input groups=5
11/06/07 19:03:35 INFO mapred.JobClient:     Combine output records=0
11/06/07 19:03:35 INFO mapred.JobClient:     Map input records=5
11/06/07 19:03:35 INFO mapred.JobClient:     Reduce shuffle bytes=0
11/06/07 19:03:35 INFO mapred.JobClient:     Reduce output records=5
11/06/07 19:03:35 INFO mapred.JobClient:     Spilled Records=40
11/06/07 19:03:35 INFO mapred.JobClient:     Map output bytes=383
11/06/07 19:03:35 INFO mapred.JobClient:     Combine input records=0
11/06/07 19:03:35 INFO mapred.JobClient:     Map output records=20
11/06/07 19:03:35 INFO mapred.JobClient:     Reduce input records=20
11/06/07 19:03:35 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
11/06/07 19:03:35 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
11/06/07 19:03:35 INFO input.FileInputFormat: Total input paths to process : 1
11/06/07 19:03:35 INFO mapred.JobClient: Running job: job_local_0002
11/06/07 19:03:35 INFO input.FileInputFormat: Total input paths to process : 1
11/06/07 19:03:35 INFO mapred.MapTask: io.sort.mb = 100
11/06/07 19:03:35 INFO mapred.MapTask: data buffer = 79691776/99614720
11/06/07 19:03:35 INFO mapred.MapTask: record buffer = 262144/327680
11/06/07 19:03:35 INFO mapred.MapTask: Starting flush of map output
11/06/07 19:03:35 INFO mapred.MapTask: Finished spill 0
11/06/07 19:03:35 INFO mapred.TaskRunner: Task:attempt_local_0002_m_000000_0 is done. And is in the process of commiting
11/06/07 19:03:35 INFO mapred.LocalJobRunner:
11/06/07 19:03:35 INFO mapred.TaskRunner: Task 'attempt_local_0002_m_000000_0' done.
11/06/07 19:03:35 INFO mapred.LocalJobRunner:
11/06/07 19:03:35 INFO mapred.Merger: Merging 1 sorted segments
11/06/07 19:03:35 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 401 bytes
11/06/07 19:03:35 INFO mapred.LocalJobRunner:
11/06/07 19:03:35 INFO mapred.TaskRunner: Task:attempt_local_0002_r_000000_0 is done. And is in the process of commiting
11/06/07 19:03:35 INFO mapred.LocalJobRunner:
11/06/07 19:03:35 INFO mapred.TaskRunner: Task attempt_local_0002_r_000000_0 is allowed to commit now
11/06/07 19:03:35 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0002_r_000000_0' to output2774090394729
11/06/07 19:03:35 INFO mapred.LocalJobRunner: reduce > reduce
11/06/07 19:03:35 INFO mapred.TaskRunner: Task 'attempt_local_0002_r_000000_0' done.
11/06/07 19:03:36 INFO mapred.JobClient:  map 100% reduce 100%
11/06/07 19:03:36 INFO mapred.JobClient: Job complete: job_local_0002
11/06/07 19:03:36 INFO mapred.JobClient: Counters: 12
11/06/07 19:03:36 INFO mapred.JobClient:   FileSystemCounters
11/06/07 19:03:36 INFO mapred.JobClient:     FILE_BYTES_READ=55621
11/06/07 19:03:36 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=111095
11/06/07 19:03:36 INFO mapred.JobClient:   Map-Reduce Framework
11/06/07 19:03:36 INFO mapred.JobClient:     Reduce input groups=5
11/06/07 19:03:36 INFO mapred.JobClient:     Combine output records=0
11/06/07 19:03:36 INFO mapred.JobClient:     Map input records=5
11/06/07 19:03:36 INFO mapred.JobClient:     Reduce shuffle bytes=0
11/06/07 19:03:36 INFO mapred.JobClient:     Reduce output records=5
11/06/07 19:03:36 INFO mapred.JobClient:     Spilled Records=40
11/06/07 19:03:36 INFO mapred.JobClient:     Map output bytes=359
11/06/07 19:03:36 INFO mapred.JobClient:     Combine input records=0
11/06/07 19:03:36 INFO mapred.JobClient:     Map output records=20
11/06/07 19:03:36 INFO mapred.JobClient:     Reduce input records=20
11/06/07 19:03:36 INFO jvm.JvmMetrics: Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
11/06/07 19:03:36 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
11/06/07 19:03:36 INFO input.FileInputFormat: Total input paths to process : 1
11/06/07 19:03:36 INFO mapred.JobClient: Running job: job_local_0003
11/06/07 19:03:36 INFO input.FileInputFormat: Total input paths to process : 1
11/06/07 19:03:36 INFO mapred.MapTask: io.sort.mb = 100
11/06/07 19:03:36 INFO mapred.MapTask: data buffer = 79691776/99614720
11/06/07 19:03:36 INFO mapred.MapTask: record buffer = 262144/327680
11/06/07 19:03:36 INFO mapred.MapTask: Starting flush of map output
11/06/07 19:03:36 INFO mapred.MapTask: Finished spill 0
11/06/07 19:03:36 INFO mapred.TaskRunner: Task:attempt_local_0003_m_000000_0 is done. And is in the process of commiting
11/06/07 19:03:36 INFO mapred.LocalJobRunner:
11/06/07 19:03:36 INFO mapred.TaskRunner: Task 'attempt_local_0003_m_000000_0' done.
11/06/07 19:03:36 INFO mapred.LocalJobRunner:
11/06/07 19:03:36 INFO mapred.Merger: Merging 1 sorted segments
11/06/07 19:03:36 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 377 bytes
11/06/07 19:03:36 INFO mapred.LocalJobRunner:
11/06/07 19:03:36 INFO mapred.TaskRunner: Task:attempt_local_0003_r_000000_0 is done. And is in the process of commiting
11/06/07 19:03:36 INFO mapred.LocalJobRunner:
11/06/07 19:03:36 INFO mapred.TaskRunner: Task attempt_local_0003_r_000000_0 is allowed to commit now
11/06/07 19:03:36 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0003_r_000000_0' to output2775232649978
11/06/07 19:03:36 INFO mapred.LocalJobRunner: reduce > reduce
11/06/07 19:03:36 INFO mapred.TaskRunner: Task 'attempt_local_0003_r_000000_0' done.
11/06/07 19:03:37 INFO mapred.JobClient:  map 100% reduce 100%
11/06/07 19:03:37 INFO mapred.JobClient: Job complete: job_local_0003
11/06/07 19:03:37 INFO mapred.JobClient: Counters: 12
11/06/07 19:03:37 INFO mapred.JobClient:   FileSystemCounters
11/06/07 19:03:37 INFO mapred.JobClient:     FILE_BYTES_READ=83647
11/06/07 19:03:37 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=166699
11/06/07 19:03:37 INFO mapred.JobClient:   Map-Reduce Framework
11/06/07 19:03:37 INFO mapred.JobClient:     Reduce input groups=5
11/06/07 19:03:37 INFO mapred.JobClient:     Combine output records=0
11/06/07 19:03:37 INFO mapred.JobClient:     Map input records=5
11/06/07 19:03:37 INFO mapred.JobClient:     Reduce shuffle bytes=0
11/06/07 19:03:37 INFO mapred.JobClient:     Reduce output records=5
11/06/07 19:03:37 INFO mapred.JobClient:     Spilled Records=40
11/06/07 19:03:37 INFO mapred.JobClient:     Map output bytes=335
11/06/07 19:03:37 INFO mapred.JobClient:     Combine input records=0
11/06/07 19:03:37 INFO mapred.JobClient:     Map output records=20
11/06/07 19:03:37 INFO mapred.JobClient:     Reduce input records=20

To see the output, simply right click the main project and click Refresh. You should see an odd folder appear named output<number>, where <number> is a nanosecond timestamp from when the application ran (the output prefix is the second argument that was passed to the program). Inside there will be a part-r-00000 file which contains the solution to the MapReduce problem. In my case, the output was:

1 0 2:3:
2 1 1:4:5:
3 1 1:
4 2 2:5:
5 2 2:4:
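
Each output line is a node, its shortest distance from node 1, and its unchanged adjacency list. Node 5 ends up at distance 2, matching the intuition from the start of the post.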

Running the application on Hadoop for real isn’t difficult once you get the bugs worked out. Simply right click on the main project, then Export > Java > Runnable JAR File and click Next. Select the launch configuration you created before (it’s probably the only one available), then select an export destination; I chose /home/user/Desktop/had.jar. I selected “extract required libraries into generated JAR”, clicked Finish, and OK’d both warnings that came up. The resulting jar is on the desktop.

Before running the application on Hadoop, let’s use Eclipse to upload input (the input file) to the Hadoop Distributed File System (HDFS). To do this, expand DFS Locations, then localhost, then (1). Right click on (1) and create a new directory named user, right click on (1) and refresh, then right click on user, create another directory called user inside it, and refresh again. You should now see tmp and user under the root directory, with another user folder inside the first. With the default configuration, Hadoop will look for files under the path /user/user on the HDFS, and I found it easier to just leave it this way. Finally, right click the second user directory and choose to upload files to DFS; select workspace, Dijkstra, input (if you’re following my tutorial). After refreshing the second user directory, you should see input on the HDFS.
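
If you prefer the shell over the plugin for this step, the same upload should also be possible with “bin/hadoop fs -mkdir /user/user” followed by “bin/hadoop fs -put input /user/user/input” (run from the Hadoop installation directory, with paths assuming my tutorial’s layout).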

Go into a shell and enter “cd ~/Desktop/” to get to your desktop. Now you can execute the command “/home/user/hadoop-0.20.2/bin/hadoop jar had.jar input out”, and Hadoop will begin executing the jar on the cluster (or on a single node if you installed it on your machine using my tutorial). Finally, once it’s done, go back into Eclipse and refresh the inner user directory; you will find an out folder that holds the same final solution as above, but this solution actually ran on Hadoop!

So now that we know how to run Hadoop applications, exactly what does the code above do?

All Hadoop programs have three components. The first is a driver, the equivalent of “public int run” in my code. The driver starts the Hadoop job, which requires a Mapper class and a Reducer class (and sometimes a Combiner class), along with the locations it should read input from and write output to.
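
Here is that essential wiring, pulled straight from the run method above:

Job job = new Job(getConf());
job.setJarByClass(Dijkstra.class);
job.setMapperClass(TheMapper.class);
job.setReducerClass(TheReducer.class);
FileInputFormat.addInputPath(job, new Path(infile));
FileOutputFormat.setOutputPath(job, new Path(outputfile));
success = job.waitForCompletion(true);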

The Mapper class takes each line of the input file as a key/value pair. The key is just the byte offset of the line, which I haven’t found a use for; the value holds the text of the line itself. The mapper can split the value into new keys and values, which it passes along through the Context object. The framework then groups those pairs by key, so the reducer receives, for each key, a list of all the values emitted for it; we simply iterate through that list to gather intelligence from the data. Optionally, a Combiner class can pre-aggregate the mapper’s output before it reaches the reducer, cutting down the data that has to be shuffled around.
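
To make that concrete, here is a hand trace of the code above on the input line for node 1:

input line: 1 0 2:3:
mapper emits:
(2, "VALUE 1") - node 2 is reachable at node 1's distance plus one
(3, "VALUE 1") - likewise for node 3
(1, "VALUE 0") - node 1 re-emits its own current distance
(1, "NODES 2:3:") - node 1's adjacency list, carried along for the next round

The reducer for key 2 then receives “VALUE 1” (from node 1), “VALUE 10000” (node 2’s own distance), “VALUE 10001” (from nodes 4 and 5), and “NODES 1:4:5:”; it keeps the minimum distance and writes out the line 2 1 1:4:5:.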

So now onto an explanation of Dijkstra’s algorithm. The algorithm works by taking each node and looking at its adjacency list of connected nodes. Each node simply tells its connected nodes its current distance from the starting node plus one distance unit, and each connected node updates its own distance if that value is smaller. At the beginning of the algorithm, all the nodes have a distance of infinity (represented as 10000) and the starting node has a distance of 0. For the algorithm to converge, we simply repeat the updating procedure for all the nodes until the minimum distance value for each node stops changing (called convergence). I check this in my code by keeping a HashMap (a key/value data structure in Java) of the previous iteration’s distances and comparing it against the latest output on the HDFS.
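
On the sample input, the distances evolve like this (which is why the console log above shows three jobs):

start: 1:0 2:10000 3:10000 4:10000 5:10000
after job 1: 1:0 2:1 3:1 4:10000 5:10000
after job 2: 1:0 2:1 3:1 4:2 5:2
after job 3: no changes, so the loop exits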

As a tiny modification in my code, I realized that the default separator for output from the reducer was a tab character; to fix this, I added a configuration parameter in my run method that makes the output space separated, so each iteration’s output can be fed back in as the next iteration’s input.
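
For reference, that parameter is this line near the top of the run method:

getConf().set("mapred.textoutputformat.separator", " ");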

Finally, Hadoop was by no means trivial to get installed and working on my first attempt. In addition, it took me several hours to figure out how the Mapper and Reducer actually work. I’m willing to bet that my explanation above isn’t enough on its own to teach you how it works, but hopefully by using this example (and others online), you will have a better idea of how Hadoop works!

As a final disclaimer, I don’t endorse directly copying and pasting this code for your own academic project. I’d strongly suggest learning how Hadoop works, since it is cutting-edge technology and is worth your time to at least gain a basic understanding of!

As always, thanks for reading!


This entry was posted on Tuesday, June 7th, 2011 at 9:45 pm and is filed under Programming.

8 Responses to “A Hadoop MapReduce Solution to Dijkstra’s Algorithm”

  1. carrera says:

    I spent six hours of continuous Googling for this information before finally landing on your website. I don’t know why Google doesn’t rank information-rich sites like this at the top of the results; in general, the top sites are full of garbage.

  2. abhishek says:

    My eyes just lit up after visiting this page. Awesome! Great help!

    Wonderful content!!

    Google should display this page as the first choice!!

    Thanks, author! Great work, full marks to you.

  3. diana says:

    Great tips! I spent hours trying to get the plugin working with Cygwin on Windows; what a pain trying to find a patch for this plugin. Your plugin download link was very useful after I gave up on Cygwin and moved to Ubuntu. Your map/reduce demo was also great in demonstrating how map/reduce is implemented with the updated API. A lot of tutorials out there are outdated. Keep it up :)

  4. astro says:

    Hi, you mention a combine class. However, you did not implement a combiner in your source code.

  5. Famous Phil says:

    The combine class isn’t necessary in my code; it’s an optional extra step whose work the Reducer can easily do itself.

  6. shash says:

    I tried this same problem without using the run method. I am getting the output as follows:
    1 0 2:3:
    2 1 1:4:5:
    3 1 1:
    4 10000 2:5:
    5 10000 2:4:

    The 4th and 5th nodes are giving back the input data itself… Can you tell me where I am going wrong?

  7. Sebastian says:

    Perhaps I’m missing something, but isn’t this a breadth-first search for the shortest path, not Dijkstra’s algorithm?

    The novelty of Dijkstra’s is that it expands only the currently cheapest-to-reach node in each step, but this one expands all reached nodes in each step (which is the right thing to do in a map reduce algorithm).

  8. Khaled Ammar says:

    Yes, Sebastian.

    This is not Dijkstra’s; it is a distributed version of breadth-first search. Dijkstra’s is better suited to a single-processor machine, whereas this is meant for a distributed environment.
