Cloudera CDH cluster setup: making nodes FQDN resolvable

Today we will do the last preparations for the Cloudera CDH cluster setup. We need to make the names of our cluster’s nodes resolvable, meaning that if I want to connect from one cluster node to another, for example using ssh, I can do that not only with the IP address of the destination node but also with its Fully Qualified Domain Name (FQDN). So both of these should work:

ssh node1.dataguru.guide
ssh 10.0.2.4

The second option should already work; the last video on cluster setup showed how to achieve that. If you haven’t seen it yet, the link to that video is available at the bottom of this page.

Today we are going to edit some config files on our nodes to make the first option, with the node’s name, work as well.

Open your hostname config file using the console text editor vi:

sudo vi /etc/hostname

Specify the full name of your host there, including the domain. In my case, it is node1.dataguru.guide. Once you are done, save and close the editor by typing :wq
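To double-check the result, a small sketch like the one below can report whether the name the system reports looks fully qualified. The `is_fqdn` helper is hypothetical (it is not part of any tool) and only checks that the name contains a dot with text on both sides; it assumes `hostname -f` is available and falls back to `uname -n` if it is not.

```shell
#!/bin/sh
# Hypothetical helper: treat a name as fully qualified if it contains at
# least one dot, with no leading, trailing, or doubled dots.
is_fqdn() {
    case $1 in
        .*|*.|*..*) return 1 ;;  # malformed: empty label somewhere
        *.*)        return 0 ;;  # has a host part and a domain part
        *)          return 1 ;;  # short name only, e.g. "node1"
    esac
}

# Ask the system for its full name; fall back to the node name if
# "hostname -f" is missing or cannot resolve the name.
current=$(hostname -f 2>/dev/null || uname -n)

if is_fqdn "$current"; then
    echo "OK: $current looks fully qualified"
else
    echo "WARNING: $current has no domain part"
fi
```

Keep in mind that a change to /etc/hostname is usually picked up only after a reboot, so the check may still show the old name right after editing.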

Next, we are going to make our hostnames resolvable by other nodes in our network. If we had the domain dataguru.guide registered with a domain registrar and had set up DNS records accordingly, we wouldn’t need to do that, since the names would be resolved automatically. You don’t need to do anything to make your PC connect to google.com, right?

In our case we are using VMs with fake domains, so we need to take some additional actions to make this work.

Open your hosts config file in the vi editor:

sudo vi /etc/hosts

There you can see all bindings currently available on that particular node, one binding per line in the form <IP address> <FQDN>.
In my example, I would like my node to be able to connect to the Cloudera Manager node via its FQDN, so I add the following line:

10.0.2.7 manager.dataguru.guide

By adding this line, I tell the Linux OS that wherever I use the name manager.dataguru.guide, the IP address 10.0.2.7 should be used to connect to that node. You should add one line to /etc/hosts for every node that should be reachable from the particular node. For example, my cluster has a Manager node and 3 worker nodes, so on each node I need to add 3 lines, one for each of the other three nodes.
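If you prefer not to edit the file by hand on every node, the same step can be sketched as a small script. This is only an illustration: `add_host_entry` is a hypothetical helper, the demo writes to a temporary file instead of the real /etc/hosts, and pairing 10.0.2.4 with node1.dataguru.guide is an assumption based on the ssh examples above.

```shell
#!/bin/sh
# Hypothetical helper: append "<IP> <FQDN>" to a hosts-style file,
# unless a non-comment line already lists that FQDN.
add_host_entry() {
    hosts_file=$1
    ip=$2
    fqdn=$3
    # Look for the FQDN as a whole field after the IP column.
    if awk -v name="$fqdn" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "$hosts_file"; then
        return 0  # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$fqdn" >> "$hosts_file"
}

# Demo against a scratch file so nothing on the system is modified.
demo_hosts=$(mktemp)
add_host_entry "$demo_hosts" 10.0.2.7 manager.dataguru.guide
add_host_entry "$demo_hosts" 10.0.2.4 node1.dataguru.guide   # assumed pairing
add_host_entry "$demo_hosts" 10.0.2.7 manager.dataguru.guide # no duplicate added
cat "$demo_hosts"
```

To apply this to the real file, you would pass /etc/hosts as the first argument and run the script as root. Afterwards, something like getent hosts manager.dataguru.guide should print the address you added.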

Another thing we should do is allow our user to use sudo without being asked for a password. That’s a requirement that comes from the Cloudera Manager side. How to do that will be shown in the following post, so as not to mix two different things together. However, both of these things will be covered in one video, which should be available on the channel later today.

If you still have any questions, please do not hesitate to ask them in comments 🙂
