This is a report about how Bigtop should be built, smoke-tested, and deployed. Although the work described here was mainly done on an Arm server running CentOS 7.5, it applies equally to x86 servers.

Bigtop 1.3.0 is used in this report.

1 CentOS 7.5 Install

Note: this section is machine-specific. For the rest of this document, the reference environment for all commands is CentOS 7.5.


2 Install docker & git & docker-compose & ruby

2.1 Install docker for CentOS

Refer to the official steps in: https://docs.docker.com/install/linux/docker-ce/centos/#set-up-the-repository

$ sudo systemctl start docker

...

Note: Remember to log out and back in for this to take effect!
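The log-out note above usually refers to docker group membership, so that your login user can run docker without sudo. A minimal sketch (the group name 'docker' is the Docker CE default; verify against the official post-install docs):

$ sudo usermod -aG docker $USER
$ # log out and back in, then verify:
$ docker run hello-world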

2.2 Install docker-compose & ruby

$ sudo yum install git docker-compose ruby

Note: docker-compose and ruby are required by docker-hadoop.sh

3 Build & Smoke Testing of Bigtop

3.1 Source Code Downloading


3.1.1 Bigtop Official Release 1.3.0

Refer to:

https://github.com/apache/bigtop

$ git clone https://github.com/apache/bigtop.git

...

$ git checkout -b working-rel/1.3.0 rel/1.3.0

3.2 Start a build container

Before starting the container, give other users `w` access to the `bigtop` home directory. This is required for the Gradle installation performed as the 'jenkins' user; otherwise you will see the following error when running 'gradlew tasks': FAILED: Could not create service of type CrossBuildFileHashCache using BuildSessionScopeServices.createCrossBuildFileHashCache().
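As a sketch, the write access can be granted like this (assuming the bigtop checkout lives in the current user's home directory):

$ chmod -R o+w ~/bigtop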

...

  1. The 'jenkins' user is employed. It exists by default in the base CentOS 7.0 docker image.

  2. Building Bigtop as 'root' is not allowed; some components refuse to build as root.

  3. Image "bigtop/slaves:1.3.0-centos-7-aarch64" will be pulled from Docker Hub on demand.

3.3 Env. setup

[jenkins@eb7597605841 ws] $ . /etc/profile.d/bigtop.sh

...

zookeeper-rpm - Building RPM for zookeeper artifacts

There are 32 components in total.

3.4 Build rpm, yum, and repo

# ./gradlew allclean

# ./gradlew rpm

...

  • repo - invokes the native yum repository target.

  • It is equivalent to running 'createrepo ...'.

  • This command creates a ./repodata folder under [bigtop]/output; the 'repodata' directory holds the metadata for the newly created repository.


3.5 Deploy & Smoke Test w/ Docker

Bigtop uses docker as an easy way to deploy a multi-node cluster and run smoke tests. Here is how. To start, you need a .yaml config file.

3.5.1 Yaml config file contents:

In [bigtop]/provisioner/docker/working-erp-18.06_centos-7.yaml

...

smoke_test_components: [hdfs,spark]
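For illustration, a minimal provisioner config might look like the following. The field names follow the Bigtop 1.3.0 provisioner's sample configs; the image name, repo URL, and component list here are assumptions to be adjusted to your build:

docker:
        memory_limit: "8g"
        image: "bigtop/puppet:1.3.0-centos-7-aarch64"
repo: "file:///bigtop-home/output"
distro: centos
components: [hdfs, yarn, mapreduce, spark]
enable_local_repo: true
smoke_test_components: [hdfs, spark]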

3.5.2 Deploy w/ docker containers

$ cd provisioner/docker/

$ ./docker-hadoop.sh -C working-erp-18.06_centos-7.yaml -c 5

...

This step is quick; it does not download anything from the external network.
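docker-hadoop.sh supports a few more options worth knowing. A hedged summary (the option letters below are from the Bigtop 1.3.0 provisioner; verify against the script's own help output):

$ ./docker-hadoop.sh -h                                       # list all options
$ ./docker-hadoop.sh -C working-erp-18.06_centos-7.yaml -s    # run smoke tests on the running cluster
$ ./docker-hadoop.sh -d                                       # destroy the cluster containers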

3.5.3 Smoke tests

Edit config file (working-erp-18.06_centos-7.yaml) to set which components to smoke test:

...

During smoke test, it downloads ....

3.5.3.1 Analysis why smoke test downloads:

The test framework first evaluates which tasks must be performed to fulfill the user's smoke-test request. Then it starts './gradlew' to execute those tasks.

...

These tasks are needed only once. When the hdfs smoke test is run a second time, the downloads above no longer happen.

4 Local Yum Repository Setup

This is to set up an HTTP file server to publish Bigtop build results. In later deployment steps, it will be specified as the repo URI from which puppet can download Bigtop binaries.

4.1 Set Hostname

Bigtop configuration requires an FQDN for each machine in the cluster. Name the D05 servers according to the following rules:

...

192.168.10.155 d05-003.bigtop.deploy d05-006
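As a sketch, the hostname and hosts entries can be set like this (the IP and names reuse the example line above; repeat per machine with its own values):

$ sudo hostnamectl set-hostname d05-003.bigtop.deploy
$ echo "192.168.10.155 d05-003.bigtop.deploy d05-006" | sudo tee -a /etc/hosts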


4.2 Setup Local Nginx HTTP Server

Refer to Step 1 in: https://www.tecmint.com/setup-local-http-yum-repository-on-centos-7/

# yum install epel-release

...

Note: you can now verify this from another machine with "wget http://d05-001.bigtop.deploy".

4.3 Publish Bigtop /Output through Nginx


# mkdir -p /usr/share/nginx/html/releases/1.3.0/centos/1/aarch64

...

Note: you can now verify this from another machine with "wget http://d05-001.bigtop.deploy/releases/1.3.0/centos/1/aarch64".
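The elided steps amount to copying the build output into the Nginx directory created above and ensuring repo metadata exists. A sketch (paths taken from the mkdir command above; if './gradlew repo' already produced a repodata folder in the output, the createrepo step is redundant):

# cp -r ~/bigtop/output/* /usr/share/nginx/html/releases/1.3.0/centos/1/aarch64/
# createrepo /usr/share/nginx/html/releases/1.3.0/centos/1/aarch64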

5 Deploy Bigtop on Multiple Nodes

Ref: https://github.com/apache/bigtop/blob/master/bigtop-deploy/puppet/README.md

It is worth learning Puppet and Hiera as well.

5.1 Deploy on master node

5.1.1 Set Hostname

Refer to Set Hostname(s).

5.1.2 Disable Firewall

Bigtop components (hdfs, yarn, etc.) use many ports for their services and for connections between nodes. To keep them working, it is best to disable the firewall so that all ports are reachable.

For how to stop and disable the firewall on CentOS 7, refer to: https://linuxize.com/post/how-to-stop-and-disable-firewalld-on-centos-7/

$ sudo systemctl stop firewalld

...

$ sudo firewall-cmd --reload

5.1.3 Preparation

$ sudo yum -y install java-1.8.0-openjdk

...

  • Install puppet and puppetlabs-stdlib

Refer to: https://github.com/apache/bigtop/blob/master/bigtop_toolchain/bin/puppetize.sh

$ sudo rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm

$ sudo yum updateinfo

      # BIGTOP-3088: pin puppetlabs-stdlib to 4.12.0 as the one provided by

...

BUILD SUCCESSFUL in 2m 18s
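The BIGTOP-3088 pin mentioned in the comment above can be applied with Puppet's module tool. A sketch (this mirrors what bigtop_toolchain's puppetize.sh does; the version number comes from the BIGTOP-3088 comment):

$ sudo puppet module install puppetlabs-stdlib --version 4.12.0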

5.1.4 Copy hiera and hieradata to /etc for puppet

$ cd ~/bigtop/
$ sudo cp bigtop-deploy/puppet/hiera.yaml /etc/puppet
$ sudo mkdir -p /etc/puppet/hieradata
$ sudo rsync -a --delete bigtop-deploy/puppet/hieradata/site.yaml bigtop-deploy/puppet/hieradata/bigtop /etc/puppet/hieradata/

5.1.5 Edit /etc/puppet/hieradata/site.yaml


$ sudo vi /etc/puppet/hieradata/site.yaml

...

Major parameters are explained below.

5.1.5.1 hadoop::hadoop_storage_dirs:

These are the folders (physical drives) allocated to HDFS. Give HDFS as many physical drives as possible to increase I/O parallelism. Two things need to be done:

  1. To release physical disks from LVM, refer to here.

  2. To format newly added physical drives and mount them as XFS filesystem, here is a script to help.
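The format-and-mount step can be sketched as follows (the device name /dev/sdb and mount point /data/1 are assumptions; repeat per drive):

$ sudo mkfs.xfs -f /dev/sdb
$ sudo mkdir -p /data/1
$ sudo mount -t xfs /dev/sdb /data/1
$ echo "/dev/sdb /data/1 xfs defaults,noatime 0 0" | sudo tee -a /etc/fstab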

5.1.5.2 bigtop::bigtop_repo_uri:

This is the URL from which Bigtop build artifacts are retrieved. There are three main ways to obtain them:

  1. Use Bigtop official release URL. Ref: https://www.apache.org/dyn/closer.lua/bigtop/bigtop-1.3.0/repos/

    1. Eg. for CentOS, download bigtop.repo and find 'baseurl=...'.

  2. Create an offline Bigtop release repository by downloading all the Bigtop repositories locally with the 'reposync' command.

    1. This is useful when you want to install Bigtop without internet access and cannot build from source either.

    2. For details, please check the section here.

  3. Build from source, then publish via Nginx. See sections above: this and this.

5.1.5.3 hadoop_cluster_node::cluster_components:

This is the list of components to install on this node. Only component names are required; there is no need to specify roles.

The 'puppet apply' script figures out the proper roles.

5.1.5.4 bigtop::hadoop_head_node:

The head node (aka master node) is specified here. 'puppet apply' uses it to decide which roles should be launched on which nodes.
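Putting the four parameters together, a minimal site.yaml might look like this. The values are illustrative: the hostname follows the naming scheme from Section 4.1, the repo URL assumes the Nginx setup from Section 4, and the storage dirs assume the mounts discussed above:

bigtop::hadoop_head_node: "d05-001.bigtop.deploy"
hadoop::hadoop_storage_dirs:
  - /data/1
  - /data/2
bigtop::bigtop_repo_uri: "http://d05-001.bigtop.deploy/releases/1.3.0/centos/1/aarch64"
hadoop_cluster_node::cluster_components:
  - hdfs
  - yarn
  - spark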

5.1.6 Deploy using puppet


$ cd ~/bigtop/
$ sudo puppet apply -d --parser future --modulepath="bigtop-deploy/puppet/modules:/etc/puppet/modules" bigtop-deploy/puppet/manifests

...

2458 QuorumPeerMain

2767 HRegionServer


5.1.7 Issues met during second deploy


Log:

Debug: Executing '/usr/bin/sudo -u hive /usr/lib/hive/bin/schematool -dbType derby -initSchema'

Fix:

$ sudo rm -rf /var/lib/hive/metastore/metastore_db

5.2 Deploy on slave nodes

5.2.1 Clone source code

Note: this is only for downloading the [bigtop]/deploy related code, not for building.

...

$ git checkout -b working-rel/1.3.0 rel/1.3.0

5.2.2 Disable Firewall

Refer to Disable Firewall.

5.2.3 Set Hostname and Update /etc/hosts

Please refer to Set Hostname(s).

Note: /etc/hosts needs to be updated so that slave nodes can reach the master node and vice versa. Remember to update the master node's /etc/hosts too.

5.2.4 Repeat Same Steps as master node

Repeat the same steps as in Deploy on master node.

Logs when running `puppet apply`,

...

1949 QuorumPeerMain

1981 DataNode

5.2.5 Confirm Node(s) identified by Master

Run these commands to confirm that the newly deployed slave nodes are registered with the master.

...
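The commands are typically along these lines (a sketch, assuming hdfs and yarn were among the deployed components):

$ sudo -u hdfs hdfs dfsadmin -report    # lists live datanodes
$ yarn node -list                       # lists registered nodemanagers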