Does Hadoop take care of different node HD sizes on its own?
I have a single node (pseudo-distributed config) and I'm considering adding a 2nd slave node.
Does it matter if the slave has less HD capacity? Will the rebalancer take care of that by itself? I'm not a Hadoop expert by far.
No, it doesn't matter, but HDFS will not redistribute existing blocks to the new node automatically, so you will have to trigger that yourself. The easiest way is to run bin/start-balancer.sh. Also, before you do any rebalancing, make sure you update your conf files to reflect the move from a pseudo-distributed configuration to a cluster one.
Check this question on the Hadoop FAQ for more ways to rebalance.
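For illustration, a minimal balancer invocation could look like the sketch below (run from the Hadoop installation directory; the threshold value of 10 percent is just an example, it controls how far a DataNode's utilization may deviate from the cluster average before the balancer considers it balanced):

    # Sketch: start the HDFS balancer on a running cluster.
    # -threshold is in percent of disk usage deviation; 10 is a common default.
    bin/start-balancer.sh -threshold 10

    # Stop it again if you need to interrupt the run.
    bin/stop-balancer.sh

The balancer can run while the cluster is in use; it only moves blocks in the background until utilization across DataNodes falls within the threshold.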
Hadoop will balance the load. In addition, you can set the "dfs.replication" property to choose the number of replicas you want.
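dfs.replication is set in hdfs-site.xml. As a sketch, with two DataNodes you might set it to 2 (the value here is only an illustration, adjust it to your cluster):

    <!-- hdfs-site.xml: example value, not a recommendation -->
    <property>
      <name>dfs.replication</name>
      <value>2</value>
      <description>Default number of block replicas for newly written files.</description>
    </property>

Note that this only affects files written after the change; existing files keep their replication factor unless you change it explicitly (for example with hadoop fs -setrep).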