Modern Big Data Processing with Hadoop

Installing a Hadoop cluster

The following steps need to be performed in order to install a Hadoop cluster. At the time of writing this book, Hadoop version 2.7.3 is the stable release, so that is the version we will install.

  1. Check the Java version using the following command:
java -version
java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

You need Java 1.6 or later.
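If the java command is not found or reports an older version, install a JDK first. The following is a minimal sketch assuming a yum-based distribution such as CentOS/RHEL; the package name may differ on your system:

yum install -y java-1.8.0-openjdk-devel
java -version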
  2. Create a hadoop user account on all the servers, including the NameNode and all DataNodes, with the help of the following commands:
useradd hadoop
passwd hadoop

Assume that we have four servers and we have to create a Hadoop cluster using all four of them. The IPs of these four servers are as follows: 192.168.11.1, 192.168.11.2, 192.168.11.3, and 192.168.11.4. Of these four servers, one will be used as the master server (NameNode) and the remaining three will be used as slaves (DataNodes).
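To avoid repeating the user-creation step by hand on every machine, you can run it over SSH from a single box. This is only a convenience sketch; it assumes you can already reach the four IPs above as root, and the password placeholder is yours to replace:

for ip in 192.168.11.1 192.168.11.2 192.168.11.3 192.168.11.4; do
  ssh root@$ip "useradd hadoop && echo 'hadoop:<password>' | chpasswd"
done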

  3. On all servers, the NameNode and the DataNodes, change the /etc/hosts file using the following command:
vi /etc/hosts
  4. Then add the following entries to the /etc/hosts file on all servers:
192.168.11.1 namenode
192.168.11.2 datanode1
192.168.11.3 datanode2
192.168.11.4 datanode3
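A quick sanity check (a suggestion, not part of the original steps) is to confirm that each hostname resolves and the machine responds before moving on:

ping -c 1 namenode
ping -c 1 datanode1
ping -c 1 datanode2
ping -c 1 datanode3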
  5. Now, set up passwordless SSH for the hadoop user on the NameNode and DataNodes:
su - hadoop
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@namenode
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@datanode1
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@datanode2
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@datanode3
chmod 0600 ~/.ssh/authorized_keys
exit
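To verify that passwordless SSH works, try logging in to each node as the hadoop user; each command should print the remote hostname without prompting for a password (a verification sketch, not part of the original steps):

ssh hadoop@datanode1 hostname
ssh hadoop@datanode2 hostname
ssh hadoop@datanode3 hostname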
  6. Download and install Hadoop on the NameNode and all DataNodes:
mkdir /opt/hadoop
cd /opt/hadoop
wget http://www-eu.apache.org/dist/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz
tar -xvf hadoop-2.7.3.tar.gz
mv hadoop-2.7.3 hadoop
chown -R hadoop /opt/hadoop
cd /opt/hadoop/hadoop
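At this point you can check that the unpacked distribution is usable. The following assumes java is already on the PATH (or JAVA_HOME is set) and simply prints the Hadoop version from the directory we just changed into; it should report Hadoop 2.7.3:

bin/hadoop version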