I've got a cluster I'm setting up CDH3 (Cloudera's Distribution for Hadoop) on, and it's unclear whether I should be using the start/stop scripts the way I did with a standard Apache Hadoop setup (where I had a dedicated user account that ran all the Hadoop daemons). With CDH3, these services ended up running as root.
What is the suggested secure and convenient way to issue cluster-wide starts and stops with CDH3? Do users who want to use these scripts set up SSH keys so root can log in to the other boxes? If so, how does a sudo-capable user set that up (just sudo su and then generate keys?)?
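For reference, this is roughly the key setup I had in mind. It's only a sketch: the key path and the worker hostname slave01 are placeholders, and the ssh-copy-id step is shown commented out since it needs a real worker node.

```shell
# Hedged sketch: generate a passwordless SSH key for root on the master,
# then push it to each worker. KEYFILE and "slave01" are placeholders.
KEYFILE="${KEYFILE:-$HOME/.ssh/id_rsa_cdh3}"
mkdir -p "$(dirname "$KEYFILE")" && chmod 700 "$(dirname "$KEYFILE")"
[ -f "$KEYFILE" ] || ssh-keygen -t rsa -N "" -f "$KEYFILE" -q
# As root (e.g. via "sudo -i"), copy the public key to every worker:
#   ssh-copy-id -i "$KEYFILE.pub" root@slave01
```

This is essentially "sudo su, then ssh-keygen and ssh-copy-id", which is what I'm asking whether people actually do.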
Also, when I tried to run sudo /usr/lib/hadoop-0.20/bin/start-all.sh
(with no keys set up for the root user to connect to the remote boxes), I got the following output:
starting namenode, logging to /usr/lib/hadoop-0.20/bin/../logs/hadoop-root-namenode-meez01.out
May not run daemons as root. Please specify HADOOP_NAMENODE_USER
<asks me for root@myserver's password>
....
<similar message for jobtracker>
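From the output above, the wrapper scripts refuse to launch daemons as root unless told which user to switch to. A hedged sketch of what I believe they want exported before running start-all.sh; only the namenode variable (and a jobtracker equivalent) appears in my output, so the other variable names and the "hdfs"/"mapred" account names are assumptions on my part:

```shell
# Assumption: tell the hadoop-0.20 wrapper scripts which unprivileged
# accounts to run each daemon as. Only HADOOP_NAMENODE_USER (and a
# jobtracker equivalent) appear in the error output; the rest follow
# the same naming pattern but are guesses, as are "hdfs" and "mapred".
export HADOOP_NAMENODE_USER=hdfs
export HADOOP_DATANODE_USER=hdfs
export HADOOP_JOBTRACKER_USER=mapred
export HADOOP_TASKTRACKER_USER=mapred
```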
Update: I know someone out there has to be using Cloudera... is this a conspiracy to get me to pay for their support?! (j/k). If nothing else, let me know how you are using it.
I was looking for similar info, and found it here:
http://www.migrate2cloud.com/blog/hadoop-cluster-with-hadoop-0-20-and-ubuntu-10-04
You can start and stop the daemons with the per-service /etc/init.d/hadoop* scripts that the CDH packages install, rather than the old start-all.sh/stop-all.sh wrappers.
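To illustrate, here is a minimal dry-run sketch of driving those init scripts. The exact service names are assumptions (they depend on which CDH packages you installed); confirm them with ls /etc/init.d/hadoop* on each node:

```shell
#!/bin/sh
# Hedged sketch: iterate over the per-daemon init scripts the CDH3
# packages install. Service names are assumptions -- confirm with
# "ls /etc/init.d/hadoop*" on each node.
HADOOP_SERVICES="hadoop-0.20-namenode hadoop-0.20-datanode \
hadoop-0.20-jobtracker hadoop-0.20-tasktracker"

hadoop_all() {  # usage: hadoop_all start|stop|restart|status
  for svc in $HADOOP_SERVICES; do
    echo "sudo /etc/init.d/$svc $1"   # echo = dry run; remove to execute
  done
}

hadoop_all status
```

Because each daemon has its own script, you only run the ones relevant to that box (e.g. only the datanode and tasktracker scripts on workers), which also sidesteps the root-SSH-key question entirely.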