Wednesday, February 9, 2011

CMU Pegasus on Hadoop

Pegasus is a peta-scale graph mining library that runs on Hadoop.
This post explains how to install it on Amazon EC2.

1) Start with the Amazon EC2 image you created in http://bickson.blogspot.com/2011/01/how-to-install-mahout-on-amazon-ec2.html
2) Run Hadoop on a single node as explained in http://bickson.blogspot.com/2011/01/mahout-on-amazon-ec2-part-2-testing.html
3) Log in to the EC2 machine
4) wget http://www.cs.cmu.edu/%7Epegasus/PEGASUSH-2.0.tar.gz
5) tar xvzf PEGASUSH-2.0.tar.gz
6) cd PEGASUS
7) export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/hadoop-0.20.2/bin/
8) sudo apt-get install gnuplot
9) ./pegasus.sh
10) At the PEGASUS> prompt, type demo. The output looks similar to the following:
PEGASUS> demo
put: Target pegasus/graphs/catstar/edge/catepillar_star.edge already exists
Graph catstar added.
rmr: cannot remove dd_node_deg: No such file or directory.
rmr: cannot remove dd_deg_count: No such file or directory.

-----===[PEGASUS: A Peta-Scale Graph Mining System]===-----

[PEGASUS] Computing degree distribution. Degree type = InOut

11/02/09 14:47:36 INFO mapred.FileInputFormat: Total input paths to process : 1
11/02/09 14:47:36 INFO mapred.JobClient: Running job: job_201102091432_0003
11/02/09 14:47:37 INFO mapred.JobClient:  map 0% reduce 0%
11/02/09 14:47:45 INFO mapred.JobClient:  map 18% reduce 0%
11/02/09 14:47:48 INFO mapred.JobClient:  map 36% reduce 0%
11/02/09 14:47:51 INFO mapred.JobClient:  map 54% reduce 0%
11/02/09 14:47:54 INFO mapred.JobClient:  map 72% reduce 18%
11/02/09 14:47:57 INFO mapred.JobClient:  map 90% reduce 18%
11/02/09 14:48:00 INFO mapred.JobClient:  map 100% reduce 18%
11/02/09 14:48:03 INFO mapred.JobClient:  map 100% reduce 24%
11/02/09 14:48:09 INFO mapred.JobClient:  map 100% reduce 100%
11/02/09 14:48:11 INFO mapred.JobClient: Job complete: job_201102091432_0003
11/02/09 14:48:11 INFO mapred.JobClient: Counters: 18
11/02/09 14:48:11 INFO mapred.JobClient:   Job Counters
11/02/09 14:48:11 INFO mapred.JobClient:     Launched reduce tasks=1
11/02/09 14:48:11 INFO mapred.JobClient:     Launched map tasks=11
11/02/09 14:48:11 INFO mapred.JobClient:     Data-local map tasks=11


When the job completes, a plot of the degree distribution is written to an image named catstar_deg_inout.eps.
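
The download and setup commands in steps 4 to 8 can be collected into one script. The following is only a minimal sketch and is not part of the Pegasus distribution: the names run, install_pegasus and DRY_RUN are hypothetical helpers introduced here. By default it only prints the commands; set DRY_RUN=0 to really execute them. Note that it appends the Hadoop directory to the existing PATH instead of replacing PATH as step 7 does, which is an assumption about what most setups want.

```shell
#!/bin/sh
# Sketch of steps 4-8 above. Commands are only printed unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

# Hadoop location taken from step 7; adjust it if your image differs.
HADOOP_BIN=/usr/local/hadoop-0.20.2/bin

# run (hypothetical helper): print a command, execute it only when DRY_RUN=0.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 7: append the Hadoop bin directory to PATH unless it is already there.
case ":$PATH:" in
    *":$HADOOP_BIN:"*) ;;
    *) PATH="$PATH:$HADOOP_BIN" ;;
esac
export PATH

# install_pegasus (hypothetical helper): stop at the first failing command.
install_pegasus() {
    # Step 4: download the Pegasus archive.
    run wget http://www.cs.cmu.edu/%7Epegasus/PEGASUSH-2.0.tar.gz || return 1
    # Step 5: unpack it.
    run tar xvzf PEGASUSH-2.0.tar.gz || return 1
    # Step 8: gnuplot is needed by the demo.
    run sudo apt-get install gnuplot || return 1
    # Step 6: enter the unpacked directory.
    run cd PEGASUS || return 1
}

install_pegasus
```

If the script is run as a separate process, the PATH change and the final cd do not carry over to your login shell. In that case either source the file, or repeat steps 6 and 7 by hand before starting ./pegasus.sh.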
