Sunday, September 12, 2010

Compiling and Running Hadoop WordCount Example in NetBeans

In order to compile and run the Hadoop WordCount example in NetBeans, I followed these steps:

  1. Create a new Java project called hadoop.
  2. Copy src/examples/org/apache/hadoop/examples/WordCount.java from the Hadoop distribution to src/hadoop, and change its package declaration to hadoop.
  3. Select Run > Set Project Configuration > Customize and change the Main Class entry to hadoop.WordCount. Also set the Arguments entry to "input output".
  4. Add the following jar files as libraries: hadoop-0.20.2-core.jar, commons-cli-1.2.jar, commons-logging-1.0.4.jar, commons-httpclient-3.0.1.jar
  5. Add the following method to the WordCount class:
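      // Requires "import java.io.File;" at the top of WordCount.java.
      // Recursively deletes a local file or directory.
      public static boolean deleteDirectory(File path) {
          File[] files = path.listFiles();
          // listFiles() returns null if path does not exist or is not a directory.
          if (files != null) {
              for (File file : files) {
                  if (file.isDirectory()) {
                      deleteDirectory(file);
                  } else {
                      file.delete();
                  }
              }
          }
          return path.delete();
      }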
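  6. Add the following lines of code just before the "new Job" line. Hadoop refuses to start a job whose output directory already exists, so this clears it before each run:
      // Delete the output directory left over from a previous run.
      WordCount.deleteDirectory(new File(otherArgs[1]));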
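  7. Create the input directory in the project's root folder, which is the working directory NetBeans uses by default when running the project.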
  8. Copy a text file into the input directory.
  9. Press F6 to run the program.
  10. Read the files in the output directory. The word counts are in the part-r-00000 file.
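
The cleanup helper from steps 5 and 6 can be tried on its own, without Hadoop on the classpath. The following is a minimal sketch under a few assumptions: the class name OutputDirCleaner, the temporary directory prefix, and the file names are all hypothetical and exist only for this test. It builds a throwaway directory tree that loosely resembles a job's output and then deletes it with the same recursive logic. It works only on the local filesystem, which is what this NetBeans setup uses. An output directory on HDFS would need Hadoop's own FileSystem API instead.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical stand-alone class, not part of Hadoop or the WordCount example.
class OutputDirCleaner {

    // Recursively deletes a local file or directory.
    // Returns the result of the final delete() call on path itself.
    static boolean deleteDirectory(File path) {
        // listFiles() returns null when path is missing or is not a directory.
        File[] files = path.listFiles();
        if (files != null) {
            for (File file : files) {
                deleteDirectory(file);
            }
        }
        return path.delete();
    }

    public static void main(String[] args) throws IOException {
        // Build a throwaway tree standing in for a job's output directory.
        File output = Files.createTempDirectory("wordcount-output").toFile();
        File nested = new File(output, "nested");
        if (!nested.mkdir()) {
            throw new IOException("Could not create " + nested);
        }
        Files.write(new File(output, "part-r-00000").toPath(), "hello\t1\n".getBytes());
        Files.write(new File(nested, "notes.txt").toPath(), "scratch\n".getBytes());

        boolean deleted = deleteDirectory(output);
        System.out.println("deleted=" + deleted + ", stillExists=" + output.exists());
    }
}
```

Calling the helper on a path that does not exist simply returns false, so it is safe to run before the very first job, when there is no output directory yet.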