Bigtop (v1.3.0) Build, Smoketest, and Deploy on Multiple Physical Machines


This report describes how to build, smoke-test, and deploy Bigtop. Although the work behind it was mainly done on an Arm server running CentOS 7.5, the steps apply equally to x86 servers.

Bigtop 1.3.0 is used in this report.



1 CentOS 7.5 Install

Note: This section is machine specific. For the rest of this document, the reference environment for all commands is CentOS 7.5.


2 Install docker & git & docker-compose & ruby

2.1 Install docker for CentOS

Refer to the official steps at: https://docs.docker.com/install/linux/docker-ce/centos/#set-up-the-repository

$ sudo systemctl start docker

$ sudo systemctl enable docker

$ sudo docker run hello-world

$ sudo docker ps

Note: If you would like to use Docker as a non-root user, you should now consider adding your user to the “docker” group with something like:

$  sudo usermod -aG docker your-user

Note: Remember to log out and back in for this to take effect!

2.2 Install docker-compose & ruby

$ sudo yum install git docker-compose ruby

Note: docker-compose and ruby are required by docker-hadoop.sh
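You can quickly verify the tools are in place (version numbers will vary by distro):

$ git --version

$ docker-compose --version

$ ruby --version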

3 Build & Smoke Testing of Bigtop

3.1 Source Code Downloading


3.1.1 Bigtop Official Release 1.3.0

Refer to:

https://github.com/apache/bigtop

$ git clone https://github.com/apache/bigtop.git

$ cd bigtop/

$ git checkout -b working-rel/1.3.0 rel/1.3.0

3.2 Start a build container

Before starting the container, give other users write (`w`) access to the `bigtop` directory. This is required for the Gradle installation performed as the 'jenkins' user. Otherwise, you will see this error when running './gradlew tasks': FAILED: Could not create service of type CrossBuildFileHashCache using BuildSessionScopeServices.createCrossBuildFileHashCache().

$ cd ..

$ chmod a+w bigtop/

$ cd bigtop

Now you can start the container based on the image "bigtop/slaves:1.3.0-centos-7-aarch64".

$ docker run -it --rm  -u jenkins --workdir /ws  -v `pwd`:/ws bigtop/slaves:1.3.0-centos-7-aarch64 bash -l

Note:

  1. The 'jenkins' user is employed; it exists by default in the docker image.

  2. Building Bigtop as 'root' is not allowed; some components refuse to build as root.

  3. The image "bigtop/slaves:1.3.0-centos-7-aarch64" will be pulled from Docker Hub on demand.

3.3 Env. setup

[jenkins@eb7597605841 ws] $ . /etc/profile.d/bigtop.sh

Note: bigtop.sh sets environment variables such as JAVA_HOME, MAVEN_HOME, ANT_HOME, GRADLE_HOME, etc.
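To confirm the environment took effect, echo a couple of the variables set by bigtop.sh:

$ echo $JAVA_HOME

$ echo $GRADLE_HOME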

$ cd /ws

$ ./gradlew tasks

Note:

  1. This will initiate the Gradle installation in the current docker container.

  2. Use $ ./gradlew tasks | grep "\-rpm" to see all supported components:

alluxio-rpm - Building RPM for alluxio artifacts

ambari-rpm - Building RPM for ambari artifacts

apex-rpm - Building RPM for apex artifacts

bigtop-groovy-rpm - Building RPM for bigtop-groovy artifacts

bigtop-jsvc-rpm - Building RPM for bigtop-jsvc artifacts

bigtop-tomcat-rpm - Building RPM for bigtop-tomcat artifacts

bigtop-utils-rpm - Building RPM for bigtop-utils artifacts

crunch-rpm - Building RPM for crunch artifacts

datafu-rpm - Building RPM for datafu artifacts

flink-rpm - Building RPM for flink artifacts

flume-rpm - Building RPM for flume artifacts

giraph-rpm - Building RPM for giraph artifacts

gpdb-rpm - Building RPM for gpdb artifacts

hadoop-rpm - Building RPM for hadoop artifacts

hama-rpm - Building RPM for hama artifacts

hbase-rpm - Building RPM for hbase artifacts

hive-rpm - Building RPM for hive artifacts

ignite-hadoop-rpm - Building RPM for ignite-hadoop artifacts

kafka-rpm - Building RPM for kafka artifacts

mahout-rpm - Building RPM for mahout artifacts

phoenix-rpm - Building RPM for phoenix artifacts

qfs-rpm - Building RPM for qfs artifacts

solr-rpm - Building RPM for solr artifacts

spark-rpm - Building RPM for spark artifacts

spark1-rpm - Building RPM for spark1 artifacts

sqoop-rpm - Building RPM for sqoop artifacts

sqoop2-rpm - Building RPM for sqoop2 artifacts

tajo-rpm - Building RPM for tajo artifacts

tez-rpm - Building RPM for tez artifacts

ycsb-rpm - Building RPM for ycsb artifacts

zeppelin-rpm - Building RPM for zeppelin artifacts

zookeeper-rpm - Building RPM for zookeeper artifacts

That is 32 components in total.

3.4 Build rpm, yum, and repo

# ./gradlew allclean

# ./gradlew rpm

Note: rpm - Build all RPM packages for the stack

# ./gradlew yum

Note: yum - Creating YUM repository

# ./gradlew repo

Note:

  • repo - Invoking a native repository target yum

  • It is equivalent to running 'createrepo ...'.

  • This command creates the ./repodata folder under [bigtop]/output. The 'repodata' directory holds the metadata for the newly created repository.
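If you only need part of the stack, the per-component tasks listed in section 3.3 can be used to build a single RPM, e.g.:

# ./gradlew hadoop-rpm

# ./gradlew zookeeper-rpm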


3.5 Deploy & Smoke Test w/ Docker

Bigtop uses docker as an easy way to deploy a multi-node cluster and run smoke tests. Here is how. To start, you need a .yaml config file.

3.5.1 Yaml config file contents:

In [bigtop]/provisioner/docker/working-erp-18.06_centos-7.yaml

docker:

  memory_limit: "16g"

  image: "bigtop/puppet:1.3.0-centos-7-aarch64"

repo: "file:///bigtop-home/output"

distro: centos

components: [hdfs, yarn, mapreduce, zookeeper, hbase, hive, spark]

enable_local_repo: false

smoke_test_components: [hdfs,spark]

3.5.2 Deploy w/ docker containers

$ cd provisioner/docker/

$ ./docker-hadoop.sh -C working-erp-18.06_centos-7.yaml -c 5

- passes, in about 10 minutes.

This step goes quickly; it does not download anything from the external network.
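The provisioner script offers a few more operations that are handy while iterating; the flags below are assumed from the script's usage text, so confirm them with ./docker-hadoop.sh -h on your checkout:

$ ./docker-hadoop.sh -C working-erp-18.06_centos-7.yaml -l

- list the cluster containers

$ ./docker-hadoop.sh -C working-erp-18.06_centos-7.yaml -e 1 bash

- open a shell on the first node

$ ./docker-hadoop.sh -C working-erp-18.06_centos-7.yaml -d

- destroy the cluster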

3.5.3 Smoke tests

Edit config file (working-erp-18.06_centos-7.yaml) to set which components to smoke test:

Eg.

smoke_test_components: [hdfs,spark]

$ ./docker-hadoop.sh -C working-erp-18.06_centos-7.yaml -s

During the smoke test, it downloads from the external network; the next section analyzes why.

3.5.3.1 Analysis: why the smoke test downloads

The test framework first evaluates which tasks need to be performed to fulfill the user's smoke test request. Then it starts './gradlew' to execute those tasks.

Among the tasks for the hdfs smoke test, these two depend on downloads from the external network:

task ':bigtop-tests:smoke-tests:hdfs:compileJava'

- download ... pom, ..jar

task ':bigtop-tests:smoke-tests:hdfs:compileTestGroovy'

- download ... pom, ..jar


These downloads are needed only once. When the hdfs smoke test is run a second time, they no longer happen.

4 Local Yum Repository Setup

This sets up an HTTP file server to publish the Bigtop build results. In the later deploy steps, its URL will be specified as the repo URI from which puppet downloads the Bigtop binaries.

4.1 Set Hostname

Bigtop configuration requires an FQDN for each machine in the cluster. Name the D05 servers according to the following rule:

d05-<%03d>.bigtop.deploy

Eg.  

d05-001.bigtop.deploy

d05-002.bigtop.deploy

In the following paragraphs, d05-001 will be used as the master node. The others will be used as slaves.

To set the FQDN on each machine, do the following:

  • Set FQDN for the machine

$ sudo hostnamectl set-hostname d05-001.bigtop.deploy
$ sudo hostname

  • Update /etc/hosts

$ sudo vi /etc/hosts
- append these lines:

192.168.10.141 d05-001.bigtop.deploy d05-001

192.168.10.177 d05-002.bigtop.deploy d05-002

192.168.10.162 d05-003.bigtop.deploy d05-003

192.168.10.221 d05-004.bigtop.deploy d05-004

192.168.10.248 d05-005.bigtop.deploy d05-005

192.168.10.155 d05-006.bigtop.deploy d05-006
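To verify the naming, check the FQDN locally and the reachability of peers:

$ hostname -f

d05-001.bigtop.deploy

$ ping -c 1 d05-002.bigtop.deploy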


4.2 Setup Local Nginx HTTP Server

Refer to Step 1 in: https://www.tecmint.com/setup-local-http-yum-repository-on-centos-7/

# yum install epel-release

# yum install nginx

# systemctl start nginx

# systemctl enable nginx

# systemctl status nginx

# firewall-cmd --zone=public --permanent --add-service=http

# firewall-cmd --zone=public --permanent --add-service=https

# firewall-cmd --reload

Note: you can now verify this from another machine with "wget http://d05-001.bigtop.deploy"

4.3 Publish Bigtop Output through Nginx


# mkdir -p /usr/share/nginx/html/releases/1.3.0/centos/1/aarch64

Note: 1 is the build number. Change it accordingly.

# cd /usr/share/nginx/html/releases/1.3.0/centos/1/aarch64

# rsync -a --delete /home/guodong/bigtop/output/ .

# vi /etc/nginx/nginx.conf

(Note: insert the location /releases/ block below into the "http -> server" section)

http {

   ...

   server {

       ...

       root         /usr/share/nginx/html;

       ...

       location /releases/ {

              autoindex on; # enable listing of the directory index

       }

       ...

       location / {

       }

   }

}

# systemctl restart nginx


Note: you can now verify this from another machine with "wget http://d05-001.bigtop.deploy/releases/1.3.0/centos/1/aarch64"
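To test the repository manually from a client machine (the puppet deploy later configures this automatically from bigtop::bigtop_repo_uri), a minimal repo file can be dropped in; the repo id 'bigtop-local' is an arbitrary choice:

# cat > /etc/yum.repos.d/bigtop-local.repo << 'EOF'
[bigtop-local]
name=Bigtop 1.3.0 local build
baseurl=http://d05-001.bigtop.deploy/releases/1.3.0/centos/1/aarch64
enabled=1
gpgcheck=0
EOF

# yum clean all

# yum list available | grep hadoop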

5 Deploy Bigtop on Multiple Nodes

Ref: https://github.com/apache/bigtop/blob/master/bigtop-deploy/puppet/README.md

It also helps to learn Puppet and Hiera.

5.1 Deploy on master node

5.1.1 Set Hostname

Refer to Set Hostname(s).

5.1.2 Disable Firewall

Bigtop components (hdfs, yarn, etc.) use many ports for their services and for connections between nodes. To make them work smoothly, it is easiest to disable the firewall so that all ports are reachable.

To disable the firewall on CentOS 7, refer to: https://linuxize.com/post/how-to-stop-and-disable-firewalld-on-centos-7/

$ sudo systemctl stop firewalld

$ sudo systemctl disable firewalld

$ sudo systemctl mask --now firewalld

Note: (NOT RECOMMENDED) An alternative is to open each required port individually. The difficulty is that it is hard to enumerate all of the ports completely.

For example, the following ports need to be open on the master node:

8020: hadoop

8032: yarn

$ sudo firewall-cmd --zone=public --permanent --add-port=8020/tcp

$ sudo firewall-cmd --zone=public --permanent --add-port=8032/tcp

$ sudo firewall-cmd --reload

5.1.3 Preparation

$ sudo yum -y install java-1.8.0-openjdk

$ java -version

openjdk version "1.8.0_191"

  • Install unzip and curl (they are required by the Gradle installation)

$ sudo yum -y install unzip curl

  • Install puppet and puppetlabs-stdlib

Refer to: https://github.com/apache/bigtop/blob/master/bigtop_toolchain/bin/puppetize.sh

$ sudo rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

$ sudo yum updateinfo

# BIGTOP-3088: pin puppetlabs-stdlib to 4.12.0 as the one provided by
# distro (4.25.0) has conflict with puppet<4. Should be removed once
# puppet in distro is updated.

$ sudo yum -y install hostname curl sudo unzip wget puppet

$ sudo puppet module install puppetlabs-stdlib --version 4.12.0

  • Install puppet-modules (depends on puppet)

$ sudo ./gradlew toolchain-puppetmodules

BUILD SUCCESSFUL in 2m 18s

5.1.4 Copy hiera and hieradata to /etc for puppet

$ cd ~/bigtop/
$ sudo cp bigtop-deploy/puppet/hiera.yaml /etc/puppet
$ sudo mkdir -p /etc/puppet/hieradata
$ sudo rsync -a --delete bigtop-deploy/puppet/hieradata/site.yaml bigtop-deploy/puppet/hieradata/bigtop /etc/puppet/hieradata/

5.1.5 Edit /etc/puppet/hieradata/site.yaml


$ sudo vi /etc/puppet/hieradata/site.yaml

bigtop::hadoop_head_node: "d05-001.bigtop.deploy"

hadoop::hadoop_storage_dirs:

 - /mnt/sda2

 - /mnt/sdc1-1

 - /mnt/sdd1

 - /mnt/sde1

 - /mnt/sdf1

 - /mnt/sdg1

 - /mnt/sdh1

 - /mnt/sdi1

 - /mnt/sdc1

 - /mnt/sdk1

 - /mnt/sdl1

hadoop_cluster_node::cluster_components:

 - hdfs

 - yarn

 - mapreduce

 - zookeeper

 - kafka

 - hbase

 - hive

 - spark

 - flink

bigtop::bigtop_repo_uri: "http://d05-001.bigtop.deploy/releases/1.3.0/centos/2/aarch64"

Major parameters are explained below.

5.1.5.1 hadoop::hadoop_storage_dirs:

These are the directories (physical drives) allocated to HDFS. Give HDFS as many physical drives as possible to increase I/O parallelism. Two things need to be done (a sketch of step 2 follows this list):

  1. Release the physical disks from LVM; refer to here.

  2. Format the newly added physical drives and mount them as XFS filesystems; here is a script to help.
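The script itself is not reproduced here; a minimal sketch of the format-and-mount step for one drive could look like this (WARNING: mkfs is destructive; /dev/sdd1 and /mnt/sdd1 are placeholders for your own device and mount point):

$ sudo mkfs.xfs -f /dev/sdd1

$ sudo mkdir -p /mnt/sdd1

$ sudo mount /dev/sdd1 /mnt/sdd1

$ echo '/dev/sdd1 /mnt/sdd1 xfs defaults,noatime 0 0' | sudo tee -a /etc/fstab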

5.1.5.2 bigtop::bigtop_repo_uri:

This is the URL from which Bigtop build artifacts are retrieved. There are three main ways to obtain Bigtop artifacts:

  1. Use the Bigtop official release URL. Ref: https://www.apache.org/dyn/closer.lua/bigtop/bigtop-1.3.0/repos/

    1. E.g. for CentOS, download bigtop.repo and find 'baseurl=...'.

  2. Create an offline Bigtop release repository by mirroring the Bigtop repositories locally using the 'reposync' command (see the sketch after this list).

    1. This is useful when you need to install Bigtop without internet access and cannot build from source either.

    2. For details, please check the Section here.

  3. Build from source, then publish via Nginx. See the sections above: this and this.
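A sketch of the reposync approach (way 2), assuming the official bigtop.repo has been placed under /etc/yum.repos.d and that the repo id inside it is 'bigtop' (check the actual id in the file):

$ sudo yum -y install yum-utils createrepo

$ sudo reposync --repoid=bigtop --download_path=/var/www/repos

$ sudo createrepo /var/www/repos/bigtop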

5.1.5.3 hadoop_cluster_node::cluster_components:

This is the list of components to install on this node. Only component names are required; there is no need to specify roles.

The 'puppet apply' script will figure out the proper roles.

5.1.5.4 bigtop::hadoop_head_node:

The head node (aka master node) is specified here. 'puppet apply' uses this to decide which roles should be launched on which nodes.

5.1.6 Deploy using puppet


$ cd ~/bigtop/
$ sudo puppet apply -d --parser future --modulepath="bigtop-deploy/puppet/modules:/etc/puppet/modules" bigtop-deploy/puppet/manifests

Notice: Roles to deploy: [resourcemanager, nodemanager, mapred-app, hadoop-client, zookeeper-server, zookeeper-client, kafka-server, hbase-master, hbase-server, hbase-client, hive-server2, hive-metastore, hive-client, spark-on-yarn, spark-yarn-slave, spark-client, flink-jobmanager, flink-taskmanager, namenode, datanode]

... ...

Notice: Finished catalog run in 663.71 seconds

To confirm the installation is correct:

$ sudo jps

21184 Jps

2560 NodeManager

2673 JobHistoryServer

11461 RunJar

11733 RunJar

2645 ThriftServer

2678 NameNode

2567 ResourceManager

2568 WebAppProxyServer

2680 DataNode

2458 QuorumPeerMain

2767 HRegionServer
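Beyond jps, a quick functional check of HDFS and YARN can be run; the examples jar path below is the usual Bigtop packaging location, so adjust it if your layout differs:

$ sudo -u hdfs hdfs dfs -ls /

$ yarn jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 2 100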


5.1.7 Issues met during a second deploy


Log:

Debug: Executing '/usr/bin/sudo -u hive /usr/lib/hive/bin/schematool -dbType derby -initSchema'

This step fails on a re-deploy because the Derby metastore database created by the first deploy already exists, so schematool cannot initialize the schema again.

Fix:

$ sudo rm -rf /var/lib/hive/metastore/metastore_db
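After removing the stale Derby database, re-run 'puppet apply', or execute the same command from the log by hand to re-initialize the schema:

$ sudo -u hive /usr/lib/hive/bin/schematool -dbType derby -initSchema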

5.2 Deploy on slave nodes

5.2.1 Clone source code

Note: this is only for obtaining the [bigtop]/bigtop-deploy related code, not for building.

$ git clone https://github.com/apache/bigtop.git

$ cd bigtop/

$ git checkout -b working-rel/1.3.0 rel/1.3.0

5.2.2 Disable Firewall

Refer to Disable Firewall.

5.2.3 Set Hostname and Update /etc/hosts

Please refer to Set Hostname(s).

Note: /etc/hosts needs to be updated so that the slave nodes can reach the master node and vice versa. Remember to update the master node's /etc/hosts too.

5.2.4 Repeat Same Steps as master node

Repeat same steps as in Deploy on master node.

Logs when running `puppet apply`:

Notice: Roles to deploy: [nodemanager, mapred-app, zookeeper-server, hbase-server, spark-on-yarn, spark-yarn-slave, datanode]

...

Notice: Finished catalog run in 363.87 seconds

Output like the above usually means success.

To confirm the deployment:

$ sudo jps

12306 Jps

11860 NodeManager

1949 QuorumPeerMain

1981 DataNode

5.2.5 Confirm Node(s) identified by Master

Run these commands on the master to confirm that the newly deployed slave nodes are registered correctly:

$ hdfs dfsadmin -printTopology

or,

$ hdfs dfsadmin -report

Live datanodes (3):

Note: check the number of datanodes and their details such as IP and Hostname:Port.

$ yarn node -list

Total Nodes:3

Note: check the number of nodes and their details such as Hostname:Port.