I'm trying to add nodes to a Cloudera cluster. When the agent starts I get a python stacktrace saying it can't heartbeat to master-host:7182, however I can connect to that port just fine.
The stacktrace is from Python and ends saying the connection timed out.
nc -z 1 -w master-host 7182
returns "connection successful"
Firewalls are off, SELinux is in Permissive.
Each box has 2 IPs, one in the 4 space and one in the 8 space. The DNS resolves the 8 address and the hosts file resolves the 4 address.
EDIT: To add some more info, based on this post:
- OS Versions are the same, agent/manager versions are the same
- I can connect from the CM host to the 4 address, port 9000. The 4 address is the one that shows up in the Host page on Cloudera Manager
- The large ping command fails on the 4 address:
ping -c 3 -s 1800 4-address
, MTU for this interface is set to 9000. - The large ping command passes on the 8 address, MTU is set to 1500.
Turns out the MTU appeared to be the issue - the infrastructure we're using wasn't supporting jumbo frames end-to-end (in this case, Cisco c240m4s with fibre interconnects needed the QoS settings updating via UCS).