Bigtop (v1.3.0) Build, Smoketest, and Deploy on Multiple Physical Machines
This report describes how to build, smoke-test, and deploy Bigtop. Although the work was done mainly on Arm servers running CentOS 7.5, the steps apply to x86 servers as well.
Bigtop 1.3.0 is used in this report.
1 CentOS 7.5 Install
Note: This section is machine specific. The rest of this document uses CentOS 7.5 as the reference environment for all commands.
2 Install docker & git & docker-compose & ruby
2.1 Install docker for CentOS
Follow the official steps at: https://docs.docker.com/install/linux/docker-ce/centos/#set-up-the-repository
$ sudo systemctl start docker
$ sudo systemctl enable docker
$ sudo docker run hello-world
$ sudo docker ps
Note: If you would like to use Docker as a non-root user, you should now consider adding your user to the “docker” group with something like:
$ sudo usermod -aG docker your-user
Note: Remember to log out and back in for this to take effect!
2.2 Install docker-compose & ruby
$ sudo yum install git docker-compose ruby
Note: docker-compose and ruby are required by docker-hadoop.sh
3 Build & Smoke Testing of Bigtop
3.1 Source Code Downloading
3.1.1 Bigtop Official Release 1.3.0
Refer:
https://github.com/apache/bigtop
$ git clone https://github.com/apache/bigtop.git
$ cd bigtop/
$ git checkout -b working-rel/1.3.0 rel/1.3.0
3.2 Start a build container
Before starting the container, give other users write (`w`) access to the `bigtop` source directory. This is required because gradle is installed as user 'jenkins' inside the container. Otherwise, running 'gradlew tasks' fails with this error:
FAILED: Could not create service of type CrossBuildFileHashCache using BuildSessionScopeServices.createCrossBuildFileHashCache().
$ cd ..
$ chmod a+w bigtop/
$ cd bigtop
Now, you can start the container based on Image: "bigtop/slaves:1.3.0-centos-7-aarch64".
$ docker run -it --rm -u jenkins --workdir /ws -v `pwd`:/ws bigtop/slaves:1.3.0-centos-7-aarch64 bash -l
Note:
The container runs as user 'jenkins', which exists by default in the root docker image of centos 7.0.
Building Bigtop as 'root' is not allowed: some components refuse to build as root.
Image "bigtop/slaves:1.3.0-centos-7-aarch64" is pulled from Docker Hub on demand.
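The docker command above hard-codes the source directory (`pwd`) and the aarch64 image. As a minimal sketch, the helper below prints the same command for a given source directory and image so it can be reviewed before running. The function name bigtop_build_cmd is hypothetical and not part of Bigtop; the default image is the one used in this report.

```shell
# Print (do not run) the docker command that starts a Bigtop build container.
# $1: absolute path of the bigtop source directory on the host
# $2: build image (optional; defaults to the aarch64 image used in this report)
bigtop_build_cmd() {
    src_dir=$1
    image=${2:-bigtop/slaves:1.3.0-centos-7-aarch64}
    printf 'docker run -it --rm -u jenkins --workdir /ws -v %s:/ws %s bash -l\n' \
        "$src_dir" "$image"
}
```

On an x86 server, a different image tag would be passed as the second argument; that tag is not given in this report.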
3.3 Env. setup
[jenkins@eb7597605841 ws] $ . /etc/profile.d/bigtop.sh
Note: bigtop.sh sets environment variables such as JAVA_HOME, MAVEN_HOME, ANT_HOME, GRADLE_HOME, etc.
$ cd /ws
$ ./gradlew tasks
Note:
This triggers the gradle installation in the current docker container.
Use $ ./gradlew tasks | grep "\-rpm" to see all supported components:
alluxio-rpm - Building RPM for alluxio artifacts
ambari-rpm - Building RPM for ambari artifacts
apex-rpm - Building RPM for apex artifacts
bigtop-groovy-rpm - Building RPM for bigtop-groovy artifacts
bigtop-jsvc-rpm - Building RPM for bigtop-jsvc artifacts
bigtop-tomcat-rpm - Building RPM for bigtop-tomcat artifacts
bigtop-utils-rpm - Building RPM for bigtop-utils artifacts
crunch-rpm - Building RPM for crunch artifacts
datafu-rpm - Building RPM for datafu artifacts
flink-rpm - Building RPM for flink artifacts
flume-rpm - Building RPM for flume artifacts
giraph-rpm - Building RPM for giraph artifacts
gpdb-rpm - Building RPM for gpdb artifacts
hadoop-rpm - Building RPM for hadoop artifacts
hama-rpm - Building RPM for hama artifacts
hbase-rpm - Building RPM for hbase artifacts
hive-rpm - Building RPM for hive artifacts
ignite-hadoop-rpm - Building RPM for ignite-hadoop artifacts
kafka-rpm - Building RPM for kafka artifacts
mahout-rpm - Building RPM for mahout artifacts
phoenix-rpm - Building RPM for phoenix artifacts
qfs-rpm - Building RPM for qfs artifacts
solr-rpm - Building RPM for solr artifacts
spark-rpm - Building RPM for spark artifacts
spark1-rpm - Building RPM for spark1 artifacts
sqoop-rpm - Building RPM for sqoop artifacts
sqoop2-rpm - Building RPM for sqoop2 artifacts
tajo-rpm - Building RPM for tajo artifacts
tez-rpm - Building RPM for tez artifacts
ycsb-rpm - Building RPM for ycsb artifacts
zeppelin-rpm - Building RPM for zeppelin artifacts
zookeeper-rpm - Building RPM for zookeeper artifacts
32 components in total.
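To work with the component list in a script (for example to count the components or to build them one at a time), the task listing can be reduced to bare component names. This is a small sketch that assumes the "<name>-rpm - Building RPM for ..." line format shown above; the function name list_rpm_components is hypothetical.

```shell
# Read `./gradlew tasks` output on stdin and print one component name per line.
# Only lines of the form "<name>-rpm - Building RPM for ..." are kept, so the
# aggregate "rpm - Build all RPM packages for the stack" task is skipped.
list_rpm_components() {
    sed -n 's/^\([A-Za-z0-9-]*\)-rpm - Building RPM for .*/\1/p'
}
```

Usage: ./gradlew tasks | list_rpm_components | wc -l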
3.4 Build rpm, yum, and repo
$ ./gradlew allclean
$ ./gradlew rpm
Note: rpm - Build all RPM packages for the stack
$ ./gradlew yum
Note: yum - Creating YUM repository
$ ./gradlew repo
Note:
repo - Invoking a native repository target yum
It is equivalent to 'createrepo ...'.
This command creates the ./repodata folder under [bigtop]/output. The 'repodata' directory holds the metadata for the newly created repository.
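The four gradle steps above depend on each other, so when scripting them it helps to stop at the first failure. The following is a sketch only: the function name build_bigtop_repo and the GRADLEW override variable are hypothetical, while the target names are the ones used in this section.

```shell
# Path of the gradle wrapper; override for a dry run, e.g. GRADLEW=echo.
GRADLEW=${GRADLEW:-./gradlew}

# Run the build targets in order and stop at the first failing step,
# because each later step consumes the output of the earlier ones.
build_bigtop_repo() {
    for target in allclean rpm yum repo; do
        "$GRADLEW" "$target" || {
            echo "step '$target' failed" >&2
            return 1
        }
    done
}
```

Run build_bigtop_repo from /ws inside the build container, as user 'jenkins'.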
3.5 Deploy & Smoke Test w/ Docker
Bigtop uses docker as an easy way to deploy a multi-node cluster and to run smoke tests. To start with, you need a .yaml config file.
3.5.1 Yaml config file contents:
In [bigtop]/provisioner/docker/working-erp-18.06_centos-7.yaml
docker:
        memory_limit: "16g"
        image: "bigtop/puppet:1.3.0-centos-7-aarch64"
repo: "file:///bigtop-home/output"
distro: centos
components: [hdfs, yarn, mapreduce, zookeeper, hbase, hive, spark]
enable_local_repo: false
smoke_test_components: [hdfs,spark]
3.5.2 Deploy w/ docker containers
$ cd provisioner/docker/
$ ./docker-hadoop.sh -C working-erp-18.06_centos-7.yaml -c 5
- passes in about 10 minutes.
This step is fairly quick because it does not download anything from the external network.
3.5.3 Smoke tests
Edit the config file (working-erp-18.06_centos-7.yaml) to set which components to smoke test:
E.g.
smoke_test_components: [hdfs,spark]
$ ./docker-hadoop.sh -C working-erp-18.06_centos-7.yaml -s
During smoke test, it downloads ....
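Editing the smoke_test_components line by hand before every run is easy to get wrong. As a sketch, the helper below rewrites only that line and leaves the rest of the config file untouched. The function name set_smoke_components is hypothetical; it assumes the key starts at the beginning of a line, as in the config shown above.

```shell
# Replace the smoke_test_components line in a provisioner config file.
# $1: config file
# $2: comma-separated component list, e.g. "hdfs,spark"
set_smoke_components() {
    cfg=$1
    comps=$2
    tmp=$(mktemp) || return 1
    # Write to a temp file first, then copy back, so the original file keeps
    # its permissions and is only modified if sed succeeded.
    sed "s/^smoke_test_components:.*/smoke_test_components: [$comps]/" "$cfg" > "$tmp" \
        && cat "$tmp" > "$cfg"
    status=$?
    rm -f "$tmp"
    return $status
}
```

Usage: set_smoke_components working-erp-18.06_centos-7.yaml "hdfs,spark", followed by ./docker-hadoop.sh -C working-erp-18.06_centos-7.yaml -s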
3.5.3.1 Analysis why smoke test downloads:
The test framework first evaluates which tasks need to be performed to fulfill the user's smoke test request. Then it starts './gradlew ' to execute these tasks.
Among the tasks for the hdfs smoke test, these two depend on downloads from the external network:
task ':bigtop-tests:smoke-tests:hdfs:compileJava'
- download ... pom, ..jar
task ':bigtop-tests:smoke-tests:hdfs:compileTestGroovy'
- download ... pom, ..jar
These downloads are needed only once. When the hdfs smoke test is run a second time, they no longer happen.
4 Local Yum Repository Setup
This section sets up an HTTP file server to publish the Bigtop build results. In the later deployment steps, it is specified as the repo URI from which puppet downloads the Bigtop binaries.
4.1 Set Hostname
Bigtop configuration requires an FQDN for each machine in the cluster. Name the D05 servers according to the following rule:
d05-<%03d>.bigtop.deploy
E.g.
d05-001.bigtop.deploy
d05-002.bigtop.deploy
…
In the following paragraphs, d05-001 is used as the master node. The others are used as slave nodes.
To set the FQDN on each machine, do the following:
Set FQDN for the machine
$ sudo hostnamectl set-hostname d05-001.bigtop.deploy
$ sudo hostname
Update /etc/hosts
$ sudo vi /etc/hosts
- append these lines:
192.168.10.141 d05-001.bigtop.deploy d05-001
192.168.10.177 d05-002.bigtop.deploy d05-002
192.168.10.162 d05-003.bigtop.deploy d05-003
192.168.10.221 d05-004.bigtop.deploy d05-004
192.168.10.248 d05-005.bigtop.deploy d05-005
192.168.10.155 d05-006.bigtop.deploy d05-006
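Because the same entries have to be added on every node, typing them by hand invites copy-and-paste mistakes. The sketch below derives both names from the node index using the naming rule of this section, and appends an entry only if its FQDN is not already present. The function name add_host_entry is hypothetical.

```shell
# Append "IP FQDN SHORTNAME" to a hosts file unless the FQDN is already listed.
# $1: hosts file (e.g. /etc/hosts)
# $2: IP address
# $3: node index as a plain integer (1 -> d05-001); avoid leading zeros
add_host_entry() {
    hosts_file=$1
    ip=$2
    index=$3
    short=$(printf 'd05-%03d' "$index")
    fqdn="$short.bigtop.deploy"
    # -F: match the name literally, -w: whole words only, so an existing
    # entry is detected and the function stays safe to re-run.
    if grep -qwF -- "$fqdn" "$hosts_file" 2>/dev/null; then
        return 0
    fi
    printf '%s %s %s\n' "$ip" "$fqdn" "$short" >> "$hosts_file"
}
```

Writing to /etc/hosts requires root, so the function would be run from a root shell, e.g. add_host_entry /etc/hosts 192.168.10.141 1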
4.2 Setup Local Nginx HTTP Server
Refer to Step 1 in: https://www.tecmint.com/setup-local-http-yum-repository-on-centos-7/
# yum install epel-release
# yum install nginx
# systemctl start nginx
# systemctl enable nginx
# systemctl status nginx
# firewall-cmd --zone=public --permanent --add-service=http
# firewall-cmd --zone=public --permanent --add-service=https
# firewall-cmd --reload
Note: you can now verify this from another machine with "wget http://d05-001.bigtop.deploy"
4.3 Publish Bigtop /Output through Nginx
# mkdir -p /usr/share/nginx/html/releases/1.3.0/centos/1/aarch64
Note: 1 is the build number. Change it accordingly.
# cd /usr/share/nginx/html/releases/1.3.0/centos/1/aarch64
# rsync -a --delete /home/guodong/bigtop/output/ .
# vi /etc/nginx/nginx.conf
(Note: insert the "location /releases/" block into the "http -> server" section)
http {
    ...
    server {
        …
        root /usr/share/nginx/html;
        …
        location /releases/ {
            autoindex on; #enable listing of directory index
        }
        …
        location / {
        }
        …
    }
}
# systemctl restart nginx
Note: you can now verify this from another machine with "wget http://d05-001.bigtop.deploy/releases/1.3.0/centos/1/aarch64"
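The publish directory and the repo URI used later for deployment share the same version/distro/build/arch layout, and the build number must match in both places. As a sketch, the two helpers below derive both strings from the same arguments. The function names are hypothetical; the layout is the one used in this section.

```shell
# Directory under the Nginx document root where a build is published.
# $1: Bigtop version, $2: distro, $3: build number, $4: architecture
bigtop_publish_dir() {
    printf '/usr/share/nginx/html/releases/%s/%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Matching repo URI served by Nginx.
# $1: FQDN of the Nginx host, then the same four arguments as above
bigtop_repo_uri() {
    printf 'http://%s/releases/%s/%s/%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}
```

For example, bigtop_repo_uri d05-001.bigtop.deploy 1.3.0 centos 1 aarch64 prints the URL checked with wget above.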
5 Deploy Bigtop on Multiple Nodes
Ref: https://github.com/apache/bigtop/blob/master/bigtop-deploy/puppet/README.md
Learn puppet and hiera as well.
5.1 Deploy on master node
5.1.1 Set Hostname
Refer to section 4.1, Set Hostname.
5.1.2 Disable Firewall
Bigtop components (hdfs, yarn, etc.) use many ports for serving requests and for connections between nodes. To make them work reliably, it is simplest to disable the firewall so that all ports are reachable.
To disable the firewall on CentOS 7, refer to: https://linuxize.com/post/how-to-stop-and-disable-firewalld-on-centos-7/
$ sudo systemctl stop firewalld
$ sudo systemctl disable firewalld
$ sudo systemctl mask --now firewalld
Note: (NOT RECOMMENDED) An alternative is to open each required port individually. The difficulty is that it is hard to list all required ports completely.
To open specific ports, e.g.:
The following ports need to be open on the master node.
8020: hadoop
8032: yarn
$ sudo firewall-cmd --zone=public --permanent --add-port=8020/tcp
$ sudo firewall-cmd --zone=public --permanent --add-port=8032/tcp
$ sudo firewall-cmd --reload
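If individual ports are opened instead of disabling the firewall, the list tends to grow, so a loop is easier to maintain than one command per port. This is a sketch using the same firewall-cmd options as above; the function name open_tcp_ports and the FIREWALL_CMD override variable are hypothetical.

```shell
# Command used to talk to firewalld; override for a dry run,
# e.g. FIREWALL_CMD=echo.
FIREWALL_CMD=${FIREWALL_CMD:-firewall-cmd}

# Permanently open each TCP port given as an argument in the public zone,
# then reload firewalld once so the new rules take effect.
open_tcp_ports() {
    for port in "$@"; do
        "$FIREWALL_CMD" --zone=public --permanent --add-port="$port/tcp" || return 1
    done
    "$FIREWALL_CMD" --reload
}
```

Run as root, e.g. open_tcp_ports 8020 8032. Only the two ports named in this section are known here; the full list depends on the deployed components.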
5.1.3 Preparation
Install openJDK 8 JAVA: https://openjdk.java.net/install/
$ sudo yum -y install java-1.8.0-openjdk
$ java -version
openjdk version "1.8.0_191"
Install unzip and curl (required by the gradle installation)
$ sudo yum -y install unzip curl
Install puppet and puppetlabs-stdlib
Refer to: https://github.com/apache/bigtop/blob/master/bigtop_toolchain/bin/puppetize.sh
$ sudo rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
$ sudo yum updateinfo
# BIGTOP-3088: pin puppetlabs-stdlib to 4.12.0 as the one provided by
# distro (4.25.0) has conflict with puppet<4. Should be removed once
# puppet in distro is updated.
$ sudo yum -y install hostname curl sudo unzip wget puppet
$ sudo puppet module install puppetlabs-stdlib --version 4.12.0
Install puppet-modules (depends on puppet; run from the bigtop source directory)
$ sudo ./gradlew toolchain-puppetmodules
BUILD SUCCESSFUL in 2m 18s
5.1.4 Copy hiera and hieradata to /etc for puppet
$ cd ~/bigtop/
$ sudo cp bigtop-deploy/puppet/hiera.yaml /etc/puppet
$ sudo mkdir -p /etc/puppet/hieradata
$ sudo rsync -a --delete bigtop-deploy/puppet/hieradata/site.yaml bigtop-deploy/puppet/hieradata/bigtop /etc/puppet/hieradata/
5.1.5 Edit /etc/puppet/hieradata/site.yaml
$ sudo vi /etc/puppet/hieradata/site.yaml
bigtop::hadoop_head_node: "d05-001.bigtop.deploy"
hadoop::hadoop_storage_dirs:
- /mnt/sda2
- /mnt/sdc1-1
- /mnt/sdd1
- /mnt/sde1
- /mnt/sdf1
- /mnt/sdg1
- /mnt/sdh1
- /mnt/sdi1
- /mnt/sdc1
- /mnt/sdk1
- /mnt/sdl1
hadoop_cluster_node::cluster_components:
- hdfs
- yarn
- mapreduce
- zookeeper
- kafka
- hbase
- hive
- spark
- flink
bigtop::bigtop_repo_uri: "http://d05-001.bigtop.deploy/releases/1.3.0/centos/2/aarch64"
Major parameters are explained below.
5.1.5.1 hadoop::hadoop_storage_dirs:
These are the folders (physical drives) allocated to HDFS. Give HDFS as many physical drives as possible to increase I/O parallelism. Two things need to be done:
To release physical disks from LVM, refer to here.
To format newly added physical drives and mount them as XFS filesystem, here is a script to help.
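Once the drives are mounted, the hadoop::hadoop_storage_dirs block for site.yaml can be generated from the list of mount points rather than typed by hand. A minimal sketch follows; the function name storage_dirs_yaml is hypothetical, and it does not check that each directory is actually a mount point.

```shell
# Print a hadoop::hadoop_storage_dirs YAML block for the given directories.
# Each argument becomes one list item, in the order given.
storage_dirs_yaml() {
    printf '%s\n' 'hadoop::hadoop_storage_dirs:'
    for dir in "$@"; do
        printf '%s\n' "- $dir"
    done
}
```

For example, storage_dirs_yaml /mnt/sda2 /mnt/sdd1 prints a block in the same format as the site.yaml listing above.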
5.1.5.2 bigtop::bigtop_repo_uri:
This is the URL from which Bigtop build artifacts are retrieved. There are three main options:
(1) Use the official Bigtop release URL. Ref: https://www.apache.org/dyn/closer.lua/bigtop/bigtop-1.3.0/repos/
E.g. for CentOS, download bigtop.repo and find 'baseurl=...'.
(2) Create an offline Bigtop release repository by downloading all the Bigtop repositories locally with the 'reposync' command.
This is useful when you want to install Bigtop without internet access and cannot build from source either.
For details, please check Section here.
(3) Build from source, then publish via Nginx. See sections 3.4 and 4.3 above.
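For the official release option, the value for bigtop::bigtop_repo_uri is the 'baseurl=' entry of the downloaded bigtop.repo file. As a sketch, it can be extracted like this; the function name repo_baseurl is hypothetical, and it assumes the standard yum .repo format with one 'baseurl=' line.

```shell
# Print the first baseurl value found in a yum .repo file.
# $1: path to the .repo file (e.g. a downloaded bigtop.repo)
repo_baseurl() {
    sed -n 's/^baseurl=//p' "$1" | head -n 1
}
```

Usage: repo_baseurl bigtop.repo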
5.1.5.3 hadoop_cluster_node::cluster_components:
This is the list of components to install on this node. Only component names are required; there is no need to specify roles.
The 'puppet apply' script works out the proper roles.
5.1.5.4 bigtop::hadoop_head_node:
The head node (a.k.a. master node) is specified here. 'puppet apply' uses this to decide which roles are launched on which nodes.
5.1.6 Deploy using puppet
$ cd ~/bigtop/
$ sudo puppet apply -d --parser future --modulepath="bigtop-deploy/puppet/modules:/etc/puppet/modules" bigtop-deploy/puppet/manifests
Notice: Roles to deploy: [resourcemanager, nodemanager, mapred-app, hadoop-client, zookeeper-server, zookeeper-client, kafka-server, hbase-master, hbase-server, hbase-client, hive-server2, hive-metastore, hive-client, spark-on-yarn, spark-yarn-slave, spark-client, flink-jobmanager, flink-taskmanager, namenode, datanode]
... ...
Notice: Finished catalog run in 663.71 seconds
To confirm the installation is correct:
$ sudo jps
21184 Jps
2560 NodeManager
2673 JobHistoryServer
11461 RunJar
11733 RunJar
2645 ThriftServer
2678 NameNode
2567 ResourceManager
2568 WebAppProxyServer
2680 DataNode
2458 QuorumPeerMain
2767 HRegionServer
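Comparing the jps listing against the expected daemons by eye becomes tedious with several nodes. The sketch below reads jps output and prints any expected daemon name that is missing. The function name missing_daemons is hypothetical; it assumes the "PID NAME" line format shown above.

```shell
# Read `jps` output on stdin and print each expected daemon that is absent.
# Arguments: the daemon names expected on this node.
missing_daemons() {
    jps_output=$(cat)
    for name in "$@"; do
        # awk exits 0 only if some line has the name as its second field.
        printf '%s\n' "$jps_output" \
            | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
            || printf '%s\n' "$name"
    done
}
```

Usage on the master node: sudo jps | missing_daemons NameNode DataNode ResourceManager NodeManager. Empty output means all listed daemons are running.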
5.1.7 Issues met during second deploy
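Log (from the second 'puppet apply' run):
Debug: Executing '/usr/bin/sudo -u hive /usr/lib/hive/bin/schematool -dbType derby -initSchema'
Fix (remove the Derby metastore database left by the first deploy, then deploy again):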
$ sudo rm -rf /var/lib/hive/metastore/metastore_db
5.2 Deploy on slave nodes
5.2.1 Clone source code
Note: this is for downloading the deploy-related code in [bigtop], not for building.
$ git clone https://github.com/apache/bigtop.git
$ cd bigtop/
$ git checkout -b working-rel/1.3.0 rel/1.3.0
5.2.2 Disable Firewall
Refer to section 5.1.2, Disable Firewall.
5.2.3 Set Hostname and Update /etc/hosts
Please refer to section 4.1, Set Hostname.
Note: /etc/hosts needs to be updated so that the slave node can reach the master node and vice versa. Remember to update the master node's /etc/hosts too.
5.2.4 Repeat Same Steps as master node
Repeat the steps in section 5.1, Deploy on master node.
The log from running `puppet apply` looks like this:
Notice: Roles to deploy: [nodemanager, mapred-app, zookeeper-server, hbase-server, spark-on-yarn, spark-yarn-slave, datanode]
...
Notice: Finished catalog run in 363.87 seconds
This usually means success.
To confirm the deployment:
$ sudo jps
12306 Jps
11860 NodeManager
1949 QuorumPeerMain
1981 DataNode
5.2.5 Confirm Node(s) identified by Master
Run these commands to confirm that the newly deployed slave nodes are registered on the master.
$ hdfs dfsadmin -printTopology
or,
$ hdfs dfsadmin -report
Live datanodes (3):
Note: the number of datanodes, and their details such as IP, Hostname:Port.
$ yarn node -list
Total Nodes:3
Note: the number of nodes, and their details such as Hostname:Port.
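To check the node counts from a script instead of reading the reports, the two summary lines shown above can be parsed. This is a sketch that assumes the "Live datanodes (N):" and "Total Nodes:N" line formats from this section; the function names are hypothetical.

```shell
# Read `hdfs dfsadmin -report` output on stdin; print the live datanode count.
live_datanodes() {
    sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p' | head -n 1
}

# Read `yarn node -list` output on stdin; print the total node count.
yarn_total_nodes() {
    sed -n 's/^Total Nodes:\([0-9][0-9]*\).*/\1/p' | head -n 1
}
```

Usage: hdfs dfsadmin -report | live_datanodes and yarn node -list | yarn_total_nodes; both numbers should match the number of deployed worker nodes.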