Monday, June 25, 2012

Hadoop cluster setup : Firewall issues


Expectations: This blog entry is not a step-by-step guide to setting up a Hadoop cluster; there are already numerous articles on that. The intent of this post is to share solutions for a couple of issues I faced while setting up the cluster (unfortunately, I couldn't find a direct answer for these issues on Google, so I'm blogging them here).


Recently, I was tasked with creating a new Hadoop cluster on our new CentOS machines. The first time I built a cluster, everything went smoothly, but with the new machines I ran into a few problems.

Issue 1: DataNode cannot connect to NameNode
Call to master/192.168.143.xxx:54310 failed on local exception: java.net.NoRouteToHostException: No route to host

I configured everything, and then started the NameNode and DataNodes using:
# cd $HADOOP_HOME
# ./bin/start-dfs.sh
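
A quick way to confirm the daemons actually came up is jps, which ships with the JDK and lists the running Java processes. On the master you should see NameNode (and usually SecondaryNameNode); on each slave you should see DataNode:

# jps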

NameNode logs:

 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: NameNode up at: master/192.168.143.211:54310
2012-06-25 19:27:40,338 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 54310: starting


The NameNode started successfully.


DataNode logs:
ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException:
Call to master/192.168.xxx.xxx:54310 failed on local exception: java.net.NoRouteToHostException: No route to host
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1063)
        at org.apache.hadoop.ipc.Client.call(Client.java:1031)
        at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:198)
Caused by: java.net.NoRouteToHostException: No route to host
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:599)
        ... 13 more


This clearly says that the DataNode machines cannot connect to the NameNode. So I tried hitting the NameNode UI in the browser:

http://192.168.143.xxx:50070 (masked IP)

Result: failed to connect to the UI; the request timed out. It seemed something was wrong with the NameNode.

But when I did a telnet to that (NameNode) port:
# telnet 192.168.xxx.xxx 50070

Trying 192.168.xxx.xxx...
Connected to 192.168.xxx.xxx.
Escape character is '^]'.

So the NameNode is up and running, but it isn't reachable from outside, which points to the firewall. So I decided to disable the firewall on my NameNode machine.
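
Before disabling anything, it's worth confirming that iptables is actually active. On CentOS, a quick check looks like this:

# service iptables status
# iptables -L -n

If the first command reports the firewall is running and the second shows restrictive INPUT rules, the firewall is the likely culprit.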

Log in as root on the NameNode machine and execute the following commands:
# service iptables save
# service iptables stop
# chkconfig iptables off

After disabling the firewall, I restarted DFS. Now my DataNodes can connect to the NameNode, and the NameNode UI works fine too.
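
Disabling the firewall wholesale is fine for a lab, but on shared machines you may prefer to open just the Hadoop ports instead. A sketch of that approach (port numbers taken from this setup; yours may differ):

# iptables -I INPUT -p tcp --dport 54310 -j ACCEPT
# iptables -I INPUT -p tcp --dport 50070 -j ACCEPT
# service iptables save

Here 54310 is the NameNode RPC port and 50070 is the web UI port; on the slave machines you would open 50010 (DataNode data transfer) the same way.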


Issue 2:
Error: java.io.IOException: File /tmp/hadoop-hadoop/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1

The main cause of this problem is configuration (99% of the time), usually your conf/slaves file or your /etc/hosts entries; there are many blogs addressing that side of it. But the remaining 1% of the time it is a firewall issue on your DataNodes, so run the commands above to disable the firewall on your DataNode machines. For reference, a sketch of the two config files is shown below.
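
Here is a minimal sketch of what those two files typically look like (the master IP is the one from my logs; the slave hostnames and IPs are hypothetical):

/etc/hosts (on every node):
192.168.143.211  master
192.168.143.212  slave1
192.168.143.213  slave2

$HADOOP_HOME/conf/slaves (on the master, one DataNode hostname per line):
slave1
slave2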

I restarted MapReduce (bin/start-mapred.sh) and everything looks good now.

Issue 3: If you see the following exception while pushing a file to HDFS, you need to disable the firewall on the slave machine.

INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.NoRouteToHostException: No route to host
INFO hdfs.DFSClient: Abandoning block blk_3519823924710640125_1087
INFO hdfs.DFSClient: Excluding datanode 192.168.xxx.xxx:50010
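
If you have more than a handful of slaves, a loop over conf/slaves saves time. A sketch, assuming passwordless root SSH to each slave:

# for host in $(cat $HADOOP_HOME/conf/slaves); do ssh root@$host 'service iptables save; service iptables stop; chkconfig iptables off'; done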


Friday, June 1, 2012

Managing a Hadoop Cluster for Newbies


When I installed Hadoop, I started with all default settings, so everything was using directories under /tmp. The default locations are inadvisable and should be changed immediately for any real, practical use.

Here is a table listing the default and suggested locations (assuming you have already created a hadoop user).

Directory         | Description                                        | Default location                | Suggested location
HADOOP_LOG_DIR    | Output location for log files from daemons        | ${HADOOP_HOME}/logs             | /var/log/hadoop
hadoop.tmp.dir    | A base for other temporary directories            | /tmp/hadoop-${user.name}        | /tmp/hadoop
dfs.name.dir      | Where the NameNode metadata should be stored      | ${hadoop.tmp.dir}/dfs/name      | /home/hadoop/dfs/name
dfs.data.dir      | Where DataNodes store their blocks                | ${hadoop.tmp.dir}/dfs/data      | /home/hadoop/dfs/data
mapred.system.dir | The in-HDFS path to shared MapReduce system files | ${hadoop.tmp.dir}/mapred/system | /hadoop/mapred/system
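
Note that HADOOP_LOG_DIR is an environment variable rather than an XML property, so it goes in conf/hadoop-env.sh. Putting the suggested locations in place looks roughly like this (run as root; paths from the table above):

# echo 'export HADOOP_LOG_DIR=/var/log/hadoop' >> $HADOOP_HOME/conf/hadoop-env.sh
# mkdir -p /var/log/hadoop /home/hadoop/dfs/name /home/hadoop/dfs/data
# chown -R hadoop:hadoop /var/log/hadoop /home/hadoop/dfs

mapred.system.dir is a path inside HDFS itself, so no local directory is needed for it.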


The majority of Hadoop settings reside in XML configuration files. Prior to Hadoop 0.20, everything was part of hadoop-default.xml and hadoop-site.xml. As the names convey, hadoop-default.xml contains all the default settings, and hadoop-site.xml is the file to edit when you want to override anything.

If, like me, you are running a later version (anything > 0.20, such as Hadoop 1.x), hadoop-site.xml has been split into:
- core-site.xml : where we specify the hostname and port of the NameNode (fs.default.name)
- hdfs-site.xml : where we specify HDFS settings such as the replication factor (dfs.replication)
- mapred-site.xml : where we specify the hostname and port of the JobTracker (mapred.job.tracker)
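
For example, the minimal entries for the first and third files might look like this (the master hostname and NameNode port 54310 are the ones from this cluster; the JobTracker port 54311 is just a common choice):

<!-- core-site.xml -->
<property>
<name>fs.default.name</name>
<value>hdfs://master:54310</value>
</property>

<!-- mapred-site.xml -->
<property>
<name>mapred.job.tracker</name>
<value>master:54311</value>
</property>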

So we can add the NameNode and DataNode directories in hdfs-site.xml (strictly speaking, hadoop.tmp.dir is a core property, so it is safest to set that one in core-site.xml):


<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/dfs/data</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/tmp/hadoop</value>
</property>
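
One caveat: on a brand-new cluster, after setting dfs.name.dir you must format the NameNode before the first start. Do not run this on a cluster that already holds data, as it wipes the filesystem metadata:

# cd $HADOOP_HOME
# ./bin/hadoop namenode -format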