How to 'Consume' a Bigtop release - to get a repository URL
1. Overview
Every Bigtop release comes in two formats:
- Source code.
- Binaries.
For example, the v1.3.0 release page lists:
- The latest release of Apache Bigtop software framework: Bigtop 1.3.0 (pgp sha256 sha512)
- Repositories of installable binary packages built with the latest release of the Apache Bigtop project: Installable binary artifacts
This document explains how to consume these two types of release artifacts to get a Big Data binary repository URL. With that repository URL, you can create your own Big Data cluster.
Generally, there are three methods to consume a Bigtop release:
1. Directly use Bigtop's binary artifacts.
2. Create a local mirror for your organization.
3. Build from source code, then use the result.
Method 3 looks straightforward. However, one question remains: what if you want to deploy your own build onto multiple physical machines? So far, I have not seen any clear documentation on Bigtop's official website that explains this.
Methods 1 and 2 are a recap of the trivial details. They may seem obvious to professionals, but they are still valuable to people who are new to this art.
2. Bigtop Deployment Needs a Repo URL
Last things first. We do all of this for one purpose: deploying Bigtop onto your cluster requires a repo URL, because the Bigtop deployment scripts need to know where to fetch packages from.
So, let's take a closer look at the deployment scripts. There are two scenarios:
- Deploy through Docker containers
  - Edit the "repo" field in the YAML files in <BIGTOP_ROOT>/provisioner/docker/
  - Eg. repo: "file:///bigtop-home/output/apt"
  - Eg. repo: "http://your.choice.of.fqdn/path.to.release"
- Deploy onto multiple machines
  - Edit the field "bigtop::bigtop_repo_uri" in /etc/puppet/hieradata/site.yaml
  - Eg. bigtop::bigtop_repo_uri: "http://your.choice.of.fqdn/path.to.release"
In both cases, you need a repo URL.
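As a minimal sketch of the second scenario, the helper below rewrites the "bigtop::bigtop_repo_uri" line in a hiera file, or appends it when it is missing. The function name set_bigtop_repo_uri is my own illustration and not part of Bigtop. It assumes the key sits on a single line and that the URL contains no '|' or '&' characters, which would confuse sed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bigtop): point a hiera site.yaml at a repo URL.
# Usage: set_bigtop_repo_uri <site.yaml> <repo-url>
set_bigtop_repo_uri() {
    file=$1
    url=$2
    if grep -q '^bigtop::bigtop_repo_uri:' "$file" 2>/dev/null; then
        # Replace the existing value. '|' is the sed delimiter because URLs contain '/'.
        sed "s|^bigtop::bigtop_repo_uri:.*|bigtop::bigtop_repo_uri: \"$url\"|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        # Key not present (or file missing): append it.
        printf 'bigtop::bigtop_repo_uri: "%s"\n' "$url" >> "$file"
    fi
}
```

For example, `set_bigtop_repo_uri /etc/puppet/hieradata/site.yaml "http://your.choice.of.fqdn/path.to.release"` would be run as root on the machine that holds the hiera data.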
Note: for details on 'how to deploy Bigtop', where you will find the usage of deployment scripts, see my other blog post [1].
3. Directly use Bigtop's binary artifacts.
On Bigtop's release page, follow the link under 'Installable binary artifacts':
http://apache.cs.utah.edu/bigtop/bigtop-1.3.0/repos/
As of the v1.3.0 release, Bigtop provides installable packages for the following OS distributions:
- CentOS 7
- Debian 9
- Fedora 26
- OpenSUSE 42.3
- Ubuntu 16.04
Take CentOS 7 as an example. Under the 'centos7' folder, there is a 'bigtop.repo' file. Download it to your local machine, open it, and find this key/value pair:
baseurl=http://repos.bigtop.apache.org/releases/1.3.0/centos/7/$basearch
OK, that's the URL you need when writing the deployment script.
Debian 9 is similar: follow the link above, enter the 'debian9' folder, and download 'bigtop.list'. Open it and take the URL from this line:
deb http://repos.bigtop.apache.org/releases/1.3.0/debian/9/$(ARCH) bigtop contrib
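The lookup above can be sketched as two small shell helpers. The names repo_baseurl and list_url are hypothetical, and the sketch assumes a single "baseurl" line in the .repo file and a "deb" line without bracketed options in the .list file.

```shell
#!/bin/sh
# Hypothetical helper: print the baseurl value from a yum .repo file.
repo_baseurl() {
    # Strip the "baseurl=" prefix from the first matching line and print the rest.
    sed -n 's/^baseurl[[:space:]]*=[[:space:]]*//p' "$1" | head -n 1
}

# Hypothetical helper: print the URL (second field) of the first "deb" line
# in an apt .list file. Assumes no "[arch=...]" option precedes the URL.
list_url() {
    awk '$1 == "deb" { print $2; exit }' "$1"
}
```

Note that $basearch is a yum variable: yum expands it on the client to the machine's architecture, so the printed value still contains the literal text "$basearch".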
4. Create a local mirror for your organization.
This section describes how to create an offline Bigtop release repository by downloading the Bigtop repository to a local folder with the 'reposync' command.
This is useful when you want to install Bigtop with limited internet access, or when you want to share the artifacts within an organization.
Steps:
1. Get 'bigtop.repo' from the official release website.
2. Create a local folder, usually under a path that your web server can serve.
Eg.
# mkdir /usr/share/nginx/html/releases/bigtop-releases-1.3.0-centos-7
3. Sync the packages of the 'bigtop.repo' repository to the local folder.
Eg.
# reposync -r bigtop -p /usr/share/nginx/html/releases/bigtop-releases-1.3.0-centos-7
Note: '-r bigtop' is the repo id, taken from the section name [bigtop] in bigtop.repo.
Note: '-p /usr/share/...' is the folder created in the previous step.
That's all. Which URL to use depends on your web server configuration. Remember to add the folder you created in step 2 to the locations your web server publishes.
There are plenty of resources on the Internet about setting up a web server. Below I give my own example, using Nginx.
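Once the mirror is served over HTTP, client machines need a repo definition that points at it. The sketch below prints such a stanza. The function name mirror_repo_stanza and the repo id "bigtop-mirror" are hypothetical, and gpgcheck=0 is there only to keep the example short, so set it according to your own policy. The comments also record two assumptions about the yum-utils reposync on CentOS 7 that are worth verifying on your system.

```shell
#!/bin/sh
# Assumptions to verify on your system (yum-utils reposync on CentOS 7):
#   - bigtop.repo sits in /etc/yum.repos.d/ so that "-r bigtop" can be resolved;
#   - packages land in <folder>/bigtop, and repo metadata typically has to be
#     generated afterwards, e.g.:
#       createrepo /usr/share/nginx/html/releases/bigtop-releases-1.3.0-centos-7/bigtop

# Hypothetical helper: print a yum .repo stanza pointing at a local mirror.
# Usage: mirror_repo_stanza <base-url>
mirror_repo_stanza() {
    cat <<EOF
[bigtop-mirror]
name=Bigtop local mirror
baseurl=$1
enabled=1
gpgcheck=0
EOF
}
```

Redirecting the output to a file under /etc/yum.repos.d/ on a client machine would make the mirror visible to yum there.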
5. Use your own build from source code
This section sets up an HTTP file server to publish the Bigtop build results. Later, at deployment time, its address is given as the repo URI from which Puppet downloads the Bigtop binaries.
I use Nginx as an example.
In general, this involves the following steps:
1. Build from source; refer to my other blog post [1].
2. Set up a web server.
3. Publish your build result through this web server.
Here are the details.
5.1. Set Hostname
Bigtop configuration requires an FQDN for each machine in the cluster. Name your cluster servers according to a predefined rule. Eg. you can name them:
node-<%03d>.bigtop.deploy
Eg.
node-001.bigtop.deploy
node-002.bigtop.deploy
…
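The naming rule maps directly onto printf's %03d format. A minimal sketch, with node_names as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print node-001.bigtop.deploy ... for the first N nodes.
# Usage: node_names <N>
node_names() {
    i=1
    while [ "$i" -le "$1" ]; do
        # %03d zero-pads the counter to three digits.
        printf 'node-%03d.bigtop.deploy\n' "$i"
        i=$((i + 1))
    done
}
```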
To set the FQDN on CentOS, do the following (other Linux distributions may use different commands):
Set the FQDN for the machine
$ sudo hostnamectl set-hostname node-001.bigtop.deploy
$ sudo hostname
Update /etc/hosts on each machine, so they can 'see' each other.
$ sudo vi /etc/hosts
-append these lines:
<IP.of.node-001> node-001.bigtop.deploy node-001
<IP.of.node-002> node-002.bigtop.deploy node-002
... ...
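With many nodes, typing the /etc/hosts lines by hand is error-prone. The sketch below formats one entry from an IP address and an FQDN. The helper name hosts_entry is hypothetical, and the address in the usage note comes from the 192.0.2.0/24 documentation range, so replace it with your real IP.

```shell
#!/bin/sh
# Hypothetical helper: print one /etc/hosts line as "<ip> <fqdn> <short-name>".
# Usage: hosts_entry <ip> <fqdn>
hosts_entry() {
    # ${2%%.*} removes everything from the first dot, leaving the short host name.
    printf '%s %s %s\n' "$1" "$2" "${2%%.*}"
}
```

For example, `hosts_entry 192.0.2.11 node-001.bigtop.deploy | sudo tee -a /etc/hosts` would append the entry for node-001.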
5.2. Set up an Nginx Web Server
Refer to Step 1 in: https://www.tecmint.com/setup-local-http-yum-repository-on-centos-7/
Here is a recap of the commands I used on node-001:
# yum install epel-release
# yum install nginx
# systemctl start nginx
# systemctl enable nginx
# systemctl status nginx
# firewall-cmd --zone=public --permanent --add-service=http
# firewall-cmd --zone=public --permanent --add-service=https
# firewall-cmd --reload
Note: you can now verify this from another machine with:
$ wget http://node-001.bigtop.deploy
5.3. Publish Bigtop /Output through Nginx
Suppose you have done a complete bigtop build, and the result is in folder: "[your.path.of.bigtop.source.code]/output".
"/usr/share/nginx/html/" is Nginx's default root dir. So I created a sub-folder under it.
# mkdir -p /usr/share/nginx/html/releases/1.3.0/centos/1/aarch64
Note: In this example, 'releases/1.3.0/centos/1/aarch64' will become part of the final URL. Change it to whatever you want.
# cd /usr/share/nginx/html/releases/1.3.0/centos/1/aarch64
# rsync -a --delete [your.path.of.bigtop.source.code]/output/ .
# vi /etc/nginx/nginx.conf
(Note: insert the "location /releases/" block into the "http -> server" section)
http {
    ...
    server {
        …
        root /usr/share/nginx/html;
        …
        location /releases/ {
            autoindex on; # enable directory listing
        }
        …
        location / {
        }
    }
}
# systemctl restart nginx
Note: you can now verify your setup from another machine with:
$ wget http://node-001.bigtop.deploy/releases/1.3.0/centos/1/aarch64
It should download a file which lists the contents of the folder. As you can see, the URL you need when writing the deployment script is "http://node-001.bigtop.deploy/releases/1.3.0/centos/1/aarch64".
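Before pasting the URL into a deployment script, it can help to confirm that it really serves yum repository metadata. The sketch below is hedged: repo_url and check_repo are hypothetical helper names, and check_repo assumes that your Bigtop build produced yum metadata, so that repodata/repomd.xml exists directly under the published folder. If your build output is laid out differently, adjust the path.

```shell
#!/bin/sh
# Hypothetical helper: join a host name and a path into the repo URL.
# Usage: repo_url <host> <path>
repo_url() {
    # ${2#/} drops one leading slash so the result never contains "//" after the host.
    printf 'http://%s/%s\n' "$1" "${2#/}"
}

# Hypothetical helper: succeed only if the URL serves yum metadata.
# Assumption: repodata/repomd.xml exists under the published folder.
# Usage: check_repo <repo-url>
check_repo() {
    # -f: fail on HTTP errors, -sS: quiet but still print errors, -o: discard the body.
    curl -fsS -o /dev/null "${1%/}/repodata/repomd.xml"
}
```

For example, `check_repo "$(repo_url node-001.bigtop.deploy /releases/1.3.0/centos/1/aarch64)" && echo reachable` could be run from any other node in the cluster.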