Modern Big Data Processing with Hadoop

Installing a Hadoop cluster

The following steps need to be performed in order to install a Hadoop cluster. At the time of writing this book, Hadoop version 2.7.3 is the stable release, so that is the version we will install.

  1. Check the Java version using the following command:
java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

You need Java 1.6 or later.
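If the java command is not found or reports an older version, install a JDK first. The following is a minimal sketch assuming a yum-based distribution such as CentOS/RHEL; the package name may differ on your system:

yum install -y java-1.8.0-openjdk-devel
java -version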
  2. Create a hadoop user account on all the servers, including the NameNode and all DataNodes, with the help of the following commands:
useradd hadoop
passwd hadoop

Assume that we have four servers and we have to create a Hadoop cluster using all four of them. The IPs of these four servers are as follows: 192.168.11.1, 192.168.11.2, 192.168.11.3, and 192.168.11.4. Of these four servers, one will be used as the master server (NameNode) and the remaining three will be used as slaves (DataNodes).
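To avoid repeating the user-creation step by hand on every machine, you can run it over SSH from a single box. This is only a convenience sketch; it assumes you can already reach the four IPs above as root, and the password placeholder is yours to replace:

for ip in 192.168.11.1 192.168.11.2 192.168.11.3 192.168.11.4; do
  ssh root@$ip "useradd hadoop && echo 'hadoop:<password>' | chpasswd"
done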

  3. On all servers, the NameNode and the DataNodes, change the /etc/hosts file using the following command:
vi /etc/hosts
  4. Then add the following entries to the /etc/hosts file on all servers:
192.168.11.1 namenode
192.168.11.2 datanode1
192.168.11.3 datanode2
192.168.11.4 datanode3
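A quick sanity check (a suggestion, not part of the original steps) is to confirm that each hostname resolves and the machine responds before moving on:

ping -c 1 namenode
ping -c 1 datanode1
ping -c 1 datanode2
ping -c 1 datanode3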
  5. Now, set up passwordless SSH for the hadoop user on the NameNode and DataNodes:
su - hadoop
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@namenode
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@datanode1
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@datanode2
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@datanode3
chmod 0600 ~/.ssh/authorized_keys
exit
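To verify that passwordless SSH works, try logging in to each node as the hadoop user; each command should print the remote hostname without prompting for a password (a verification sketch, not part of the original steps):

ssh hadoop@datanode1 hostname
ssh hadoop@datanode2 hostname
ssh hadoop@datanode3 hostname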
  6. Download and install Hadoop on the NameNode and all DataNodes:
mkdir /opt/hadoop
cd /opt/hadoop
wget http://www-eu.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
tar -xvf hadoop-2.7.3.tar.gz
mv hadoop-2.7.3 hadoop
chown -R hadoop /opt/hadoop
cd /opt/hadoop/hadoop
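At this point you can check that the unpacked distribution is usable. The following assumes java is already on the PATH (or JAVA_HOME is set) and simply prints the Hadoop version from the directory we just changed into; it should report Hadoop 2.7.3:

bin/hadoop version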