Thursday, May 31, 2012

Hadoop: Starting in Pseudo-distributed mode


In this blog entry, I am going to list common mistakes/gotchas when starting Hadoop:

1) If you get the following exception while starting Hadoop:

localhost: Exception in thread "main" java.lang.IllegalArgumentException: Does not contain a valid host:port authority: file:///


Then it is most likely that
a) either the mapred configuration is empty or not valid:
# cat conf/mapred-site.xml

<configuration>

</configuration>

b) or your conf still points to local mode when you start Hadoop (refer to #2 for switching between modes with a soft link).
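
For reference, here is a minimal sketch of non-empty settings for pseudo-distributed mode, using the Hadoop 1.x property names. The file:/// in the exception is the default value of fs.default.name, so core-site.xml is probably worth checking along with mapred-site.xml. The CONF_DIR variable and the ports 9000 and 9001 are assumptions for illustration; adjust them to your setup.

```shell
#!/bin/sh
# Sketch: write minimal pseudo-distributed settings (Hadoop 1.x property names).
# CONF_DIR and the ports 9000/9001 are assumptions, not values from this post.
CONF_DIR="${CONF_DIR:-conf.pseudo}"
mkdir -p "$CONF_DIR"

# NameNode address. The default value, file:///, is the local-mode setting.
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF

# JobTracker address as host:port. The default value, local, is the local-mode setting.
cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
EOF

echo "wrote $CONF_DIR/core-site.xml and $CONF_DIR/mapred-site.xml"
```

The script writes into conf.pseudo rather than conf, so an existing configuration is left alone until you point conf at the new directory.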



2) Switching between modes:
For newbies: to switch between local (standalone) mode, pseudo-distributed mode, and fully distributed mode, it is advisable to create three separate configuration directories:

conf.standalone, conf.pseudo, conf.full

Then create a soft link named conf that points to the directory for the mode you want. For example:
# cd $HADOOP_HOME
# ln -s conf.standalone conf

conf -> conf.standalone
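
Once conf already exists as a link, a plain ln -s will not re-point it, so switching to another mode means removing the old link first. A minimal sketch of that step follows; switch_mode is a hypothetical helper name, and it assumes the conf.standalone, conf.pseudo and conf.full directories live under $HADOOP_HOME.

```shell
#!/bin/sh
# switch_mode is a hypothetical helper: re-point the conf soft link at conf.<mode>.
# The body runs in a subshell so the caller's working directory is unchanged.
switch_mode() (
    mode="$1"                       # standalone, pseudo, or full
    cd "${HADOOP_HOME:-.}" || exit 1
    target="conf.$mode"
    if [ ! -d "$target" ]; then
        echo "no such directory: $target" >&2
        exit 1
    fi
    # Only replace an existing soft link; never delete a real conf directory.
    if [ -e conf ] && [ ! -L conf ]; then
        echo "conf exists and is not a soft link; rename it first" >&2
        exit 1
    fi
    rm -f conf
    ln -s "$target" conf
)
```

For example, switch_mode pseudo leaves conf -> conf.pseudo. Stop the Hadoop daemons before switching and start them again afterwards, so that they read the new configuration.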