I configured and deployed Hadoop for a single-node setup by following this tutorial.
Everything deployed fine, but when I run jps to check the active processes, the DataNode is not listed.
I tried to start the DataNode manually by going to $HADOOP_HOME/bin and running hadoop datanode, but to no avail.
To sum up, the DataNode process is not running at all on the Hadoop cluster.
Also, I want to know whether a single machine can have two Hadoop installations. I am using one for MapReduce jobs and the other for a search engine, so their directories are different; is that okay? I only run one Hadoop operation at a time.
EDIT 1:
In case it helps, here is the log from when I tried running the DataNode from $HADOOP_HOME/bin:
root@thinktank:/usr/local/hadoop/bin# hadoop datanode
Warning: $HADOOP_HOME is deprecated.
13/08/27 16:34:57 INFO datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG: host = thinktank/127.0.1.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.2.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.2 -r 1503152; compiled by 'mattf' on Mon Jul 22 15:23:09 PDT 2013
STARTUP_MSG: java = 1.6.0_27
************************************************************/
13/08/27 16:34:57 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
13/08/27 16:34:57 INFO impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
13/08/27 16:34:57 INFO impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
13/08/27 16:34:57 INFO impl.MetricsSystemImpl: DataNode metrics system started
13/08/27 16:34:57 INFO impl.MetricsSourceAdapter: MBean for source ugi registered.
13/08/27 16:34:57 WARN impl.MetricsSystemImpl: Source name ugi already exists!
13/08/27 16:34:57 ERROR datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /app/hadoop/tmp/dfs/data: namenode namespaceID = 1955988395; datanode namespaceID = 1705269445
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:414)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:321)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1712)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1651)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1669)
at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1795)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1812)
13/08/27 16:34:57 INFO datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at thinktank/127.0.1.1
************************************************************/
Okay, I found the workaround. It turns out I was hitting the Incompatible namespaceIDs error. I found a workaround here, so it finally got solved.
If you are also running into Incompatible namespaceIDs, try the following; it worked like a charm for me. Leave a comment if you still have problems and I will get back to you.
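To confirm you are hitting the same mismatch, you can compare the two IDs first. This is just a sketch: the paths assume dfs.data.dir and dfs.name.dir sit under hadoop.tmp.dir = /app/hadoop/tmp, as in my log above, so substitute your own values.

    # the two namespaceID lines should match; if they differ, apply the fix below
    grep namespaceID /app/hadoop/tmp/dfs/name/current/VERSION
    grep namespaceID /app/hadoop/tmp/dfs/data/current/VERSION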
Solution:
1. Stop the problematic DataNode(s).
2. Edit the value of namespaceID in ${dfs.data.dir}/current/VERSION to match the corresponding value of the current NameNode in ${dfs.name.dir}/current/VERSION.
3. Restart the fixed DataNode(s). That will solve the problem for you.

    hadoop datanode
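For reference, here is the whole fix as a sequence of shell commands. This is only a sketch: it assumes the paths from my setup above and the namespaceID 1955988395 reported for my NameNode, so substitute your own values.

    # 1. stop the DataNode
    hadoop-daemon.sh stop datanode

    # 2. point the DataNode's namespaceID at the NameNode's value
    #    (1955988395 comes from the error message in my log)
    sed -i 's/^namespaceID=.*/namespaceID=1955988395/' /app/hadoop/tmp/dfs/data/current/VERSION

    # 3. restart the DataNode and check that it stays up
    hadoop-daemon.sh start datanode
    jps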
Alternatively: since the problem is due to the incompatible namespaceID, you can instead remove the tmp directory and re-format HDFS. Note that this erases everything stored in HDFS.
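Roughly like this (again just a sketch under the same path assumption; the re-format is needed because wiping hadoop.tmp.dir also destroys the NameNode's metadata):

    stop-all.sh                # stop all Hadoop daemons first
    rm -rf /app/hadoop/tmp/*   # wipe hadoop.tmp.dir -- all HDFS data is lost!
    hadoop namenode -format    # recreate HDFS with a fresh namespaceID
    start-all.sh               # bring the daemons back up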
Then follow the steps from:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/