<In-Progress>

Pre-requisites

  • OpenJDK8
  • Zookeeper
  • git
  • maven@v3.3.9

Install OpenJDK

$ sudo apt-get install openjdk-8-jdk


Make sure you have the right OpenJDK version 

$ java -version

It should display a 1.8.0 release, for example 1.8.0_111

Set JAVA_HOME

$ export JAVA_HOME=$(readlink -f /usr/bin/java | sed "s:jre/bin/java::")
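
As a small sketch of what the sed expression in the JAVA_HOME step is meant to do, the helper below strips the trailing bin/java or jre/bin/java from an already resolved java path. The name java_home_from is hypothetical, and the example path assumes the usual OpenJDK 8 layout on arm64; adjust it to your machine.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from a resolved java path.
# Assumes the usual layouts <home>/jre/bin/java or <home>/bin/java.
java_home_from() {
  printf '%s\n' "$1" | sed -e 's:/jre/bin/java$::' -e 's:/bin/java$::'
}

# Example with an assumed OpenJDK 8 install path.
java_home_from /usr/lib/jvm/java-8-openjdk-arm64/jre/bin/java
```

It could be combined with readlink, e.g. export JAVA_HOME=$(java_home_from "$(readlink -f /usr/bin/java)").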


Building Apache Zookeeper

...

Some distributions, such as Ubuntu and Debian, ship a recent Zookeeper package, so you can install it with the apt-get command "sudo apt-get install zookeeper". If your distribution does not ship Zookeeper, download the latest Zookeeper package from the official Apache archive and extract it on all machines that will be used for the Zookeeper quorum.


Edit the /etc/hosts file on all the nodes and add the IP address and hostname (node name) of each node. If the hostnames are not right, correct them in the /etc/hosts file.



192.168.1.102 node1
192.168.1.103 node2
192.168.1.105 node3
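
To double-check the entries on each node, a lookup along these lines may help. It is only a sketch: host_ip is a hypothetical helper that reads a hosts-format file and prints the IP address mapped to a name, skipping commented lines.

```shell
# Hypothetical helper: print the IP address mapped to a hostname
# in a hosts-format file ($1: file, $2: hostname).
host_ip() {
  awk -v name="$2" '
    $1 ~ /^#/ { next }    # skip commented entries
    { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
  ' "$1"
}
```

For example, host_ip /etc/hosts node2 should print the address you entered for node2, and print nothing if the entry is missing.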


Create zookeeper user

$ sudo adduser zookeeper


Configure zookeeper

To make an ensemble with a master-slave architecture, you need an odd number of Zookeeper servers, i.e. 1, 3, 5, 7, and so on. An odd count is used because the ensemble stays available only while a majority of its servers are up.
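
The arithmetic behind the odd-number advice can be sketched as follows, assuming the usual majority-quorum rule. The function names are made up for this illustration.

```shell
# Majority quorum arithmetic for an ensemble of n servers.
quorum_size() { echo $(( $1 / 2 + 1 )); }
tolerated_failures() { echo $(( ($1 - 1) / 2 )); }

for n in 3 4 5; do
  echo "servers=$n quorum=$(quorum_size "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Under this rule a fourth server does not let the ensemble survive more failures than three servers do, which is why even sizes are usually avoided.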

Now create a zookeeper directory under /var/lib, which will serve as the Zookeeper data directory, and another zookeeper directory under /var/log, where all the Zookeeper logs will be captured. The ownership of both directories needs to be changed to the zookeeper user.

$ sudo mkdir /var/lib/zookeeper

$ cd /var/lib

$ sudo chown zookeeper:zookeeper zookeeper/

$ sudo mkdir /var/log/zookeeper

$ cd /var/log

$ sudo chown zookeeper:zookeeper zookeeper/


Note: If you get a message like the one below while starting Zookeeper, check and, if needed, change the permissions of the files under /var/lib/zookeeper and /var/log/zookeeper.

Since I was logged in as linaro when running Zookeeper, I changed the ownership to the linaro user.


linaro@node1:~/drill-setup/zookeeper-3.4.12$ ./bin/zkServer.sh start

ZooKeeper JMX enabled by default

Using config: /home/linaro/drill-setup/zookeeper-3.4.12/bin/../conf/zoo.cfg

Starting zookeeper ... ./bin/zkServer.sh: line 149: /var/lib/zookeeper/zookeeper_server.pid: Permission denied

FAILED TO WRITE PID
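
A quick pre-flight check like the one below can catch this permission problem before starting the server. It is a sketch only: check_writable is a hypothetical helper, to be run as the user that will start Zookeeper.

```shell
# Hypothetical helper: report whether each directory exists and is
# writable by the current user.
check_writable() {
  for d in "$@"; do
    if [ -d "$d" ] && [ -w "$d" ]; then
      echo "ok: $d"
    else
      echo "not writable: $d"
    fi
  done
}
```

For example: check_writable /var/lib/zookeeper /var/log/zookeeper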


Edit the .bashrc of the zookeeper user and add the following Zookeeper environment variable.

$ export ZOO_LOG_DIR=/var/log/zookeeper


Source the .bashrc in current login session:

$ source ~/.bashrc


Create the server id for the ensemble. Each Zookeeper server must have a number in its myid file that is unique within the ensemble and has a value between 1 and 255.

In Node1

$ sudo sh -c "echo '1' > /var/lib/zookeeper/myid"


In Node2

$ sudo sh -c "echo '2' > /var/lib/zookeeper/myid"


In Node3

$ sudo sh -c "echo '3' > /var/lib/zookeeper/myid"
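
The same step can be wrapped in a small function that rejects ids outside the 1 to 255 range before writing the file. This is a sketch; write_myid is a hypothetical name, and it must run as a user that can write to the data directory.

```shell
# Hypothetical helper: validate a server id and write it to <dir>/myid.
# $1: data directory, $2: server id (1..255)
write_myid() {
  case "$2" in
    ''|*[!0-9]*) echo "id must be a number" >&2; return 1 ;;
  esac
  if [ "$2" -lt 1 ] || [ "$2" -gt 255 ]; then
    echo "id must be between 1 and 255" >&2
    return 1
  fi
  echo "$2" > "$1/myid"
}
```

On node2 this would be called as write_myid /var/lib/zookeeper 2.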


Now go to the conf folder under the Zookeeper home directory (the location of the Zookeeper directory after the archive has been extracted).

$ cd /home/zookeeper/zookeeper-3.4.12/conf/


By default, a sample configuration file named zoo_sample.cfg is present in the conf directory. Make a copy of it named zoo.cfg as shown below, and edit the new zoo.cfg as described on all the nodes.



$ cp zoo_sample.cfg zoo.cfg


Edit zoo.cfg and add the lines below


$ vi zoo.cfg


dataDir=/var/lib/zookeeper
server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
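
For larger ensembles, the server lines can be generated instead of typed by hand. The sketch below assumes the node order matches the ids written to each myid file and reuses the 2888 and 3888 ports from the lines above; zk_server_lines is a hypothetical name.

```shell
# Hypothetical helper: print one server.N line per node name,
# numbering the nodes from 1 in the order given.
zk_server_lines() {
  i=1
  for node in "$@"; do
    echo "server.$i=$node:2888:3888"
    i=$((i + 1))
  done
}

zk_server_lines node1 node2 node3
```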


Now make the following changes in the log4j.properties file.

$ vi log4j.properties


zookeeper.log.dir=/var/log/zookeeper 
zookeeper.tracelog.dir=/var/log/zookeeper 
log4j.rootLogger=INFO, CONSOLE, ROLLINGFILE


After zoo.cfg has been configured on all three nodes, start Zookeeper on the nodes one by one, using the following command:

$ /home/zookeeper/zookeeper-3.4.12/bin/zkServer.sh start


Zookeeper Service Start on all the Nodes.

ZooKeeper JMX enabled by default
Using config: /home/ganesh/zookeeper-3.4.12/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED


The log file, named zookeeper.log, is created in /var/log/zookeeper. Tail the file to check the logs for any errors.

$ tail -f /var/log/zookeeper/zookeeper.log


Verify the Zookeeper Cluster and Ensemble

In a Zookeeper ensemble of three servers, one server runs in leader mode and the other two run in follower mode. You can check the status by running the following command.

$ /home/zookeeper/zookeeper-3.4.12/bin/zkServer.sh status


Zookeeper Service Status Check.

The output on each node shows its mode: leader or follower in an ensemble, or standalone if you run just one server.

With three nodes:

node1

ZooKeeper JMX enabled by default
Using config: /home/zookeeper/zookeeper-3.4.12/bin/../conf/zoo.cfg
Mode: leader

node2

ZooKeeper JMX enabled by default
Using config: /home/zookeeper/zookeeper-3.4.12/bin/../conf/zoo.cfg
Mode: follower

node3

ZooKeeper JMX enabled by default
Using config: /home/zookeeper/zookeeper-3.4.12/bin/../conf/zoo.cfg
Mode: follower


standalone

ZooKeeper JMX enabled by default
Using config: /home/zookeeper/zookeeper-3.4.12/bin/../conf/zoo.cfg
Mode: standalone
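
When checking several nodes, it can be handy to pull just the mode out of the status output. A minimal sketch, based only on the "Mode:" line shown above (zk_mode is a hypothetical name):

```shell
# Hypothetical helper: read zkServer.sh status output on stdin and
# print only the value of the "Mode:" line.
zk_mode() {
  sed -n 's/^Mode: //p'
}
```

For example: /home/zookeeper/zookeeper-3.4.12/bin/zkServer.sh status 2>/dev/null | zk_mode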


$ echo stat | nc node1 2181


Lists brief details for the server and connected clients.


$ echo mntr | nc node1 2181


Zookeeper list of variables for cluster health monitoring.


$ echo srvr | nc localhost 2181


Lists full details for the Zookeeper server.
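
The mntr output can be filtered for a single variable. The sketch below assumes mntr prints one tab-separated key and value per line (for example a zk_server_state key); both that format detail and the helper name mntr_value are assumptions to verify against your Zookeeper version.

```shell
# Hypothetical helper: read mntr output on stdin and print the value
# of one key. Assumes "key<TAB>value" lines.
mntr_value() {
  awk -F '\t' -v key="$1" '$1 == key { print $2 }'
}
```

For example: echo mntr | nc node1 2181 | mntr_value zk_server_state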

...

To inspect the znodes, connect with the command below on any of the Zookeeper nodes:

$ /home/zookeeper/zookeeper-3.4.12/bin/zkCli.sh -server `hostname -f`:2181


Connects to the Zookeeper data node and lists its contents.

...

WatchedEvent state:SyncConnected type:None path:null
[zk: :2181(CONNECTED) 0]


Install Pre-requisites for Build

$ sudo apt-get install git

...


Setup environment

Add the following environment variables to your ~/.bashrc file

# setup environments
export LANG="en_US.UTF-8"
export PATH=${HOME}/gradle/bin:$PATH
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-arm64
export JAVA_TOOL_OPTIONS="-Dfile.encoding=UTF8"
 


$ source ~/.bashrc


Hooking up upstream Maven 3.6.0 (for Debian Jessie only)

$ wget http://mirrors.gigenet.com/apache/maven/maven-3/3.6.0/binaries/apache-maven-3.6.0-bin.tar.gz

$ tar xvf apache-maven-3.6.0-bin.tar.gz

$ cd apache-maven-3.6.0/bin

$ export PATH=$PWD:$PATH

$ mvn --version # should list the version as 3.6.0
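
To check the version in a script rather than by eye, something like the following may work. It assumes the first line of the mvn --version output starts with "Apache Maven" followed by the version number; maven_version is a hypothetical helper.

```shell
# Hypothetical helper: read `mvn --version` output on stdin and print
# the version from the "Apache Maven <version> ..." line.
maven_version() {
  awk '/^Apache Maven/ { print $3; exit }'
}
```

For example: [ "$(mvn --version | maven_version)" = "3.6.0" ] && echo "Maven 3.6.0 is on the PATH"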

Clone and Build Apache Drill

$ git clone https://gitbox.apache.org/repos/asf/drill.git

$ cd drill

$ git branch v1.15.0 origin/1.15.0

$ git checkout v1.15.0


To build .deb package 

$ mvn clean -X package -Pdeb -DskipTests


To build .rpm package 

$ mvn clean -X package -Prpm -DskipTests


After successful compilation, change to the build output directory.

$ cd distribution/target/apache-drill-1.15.0/apache-drill-1.15.0

Edit the /etc/hosts file on your computer and make sure that the loopback entries are commented out, e.g.


#127.0.0.1 localhost

#127.0.1.1 ubuntu

Then add entries with your host IP address:


<IP-address> ubuntu
<IP-address> localhost
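
The commenting step can be previewed with sed before touching the real file. This is a sketch: comment_loopback is a hypothetical helper that only prints the result and covers just the 127.0.0.1 and 127.0.1.1 entries shown above.

```shell
# Hypothetical helper: print a hosts file (or stdin) with the
# 127.0.0.1 and 127.0.1.1 entries commented out. Nothing is modified.
comment_loopback() {
  sed -E 's/^(127\.0\.[01]\.1[[:space:]])/#\1/' "$@"
}
```

For example, comment_loopback /etc/hosts prints the edited content so it can be reviewed before you change /etc/hosts by hand.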


This is needed because in distributed mode Drill cannot bind to the loopback IP 127.0.1.1. Reference: https://stackoverflow.com/questions/40506221/how-to-start-drillbit-locally-in-distributed-mode

Next, edit conf/drill-override.conf and change the cluster ID and the Zookeeper address, e.g. as below

drill.exec: {
  cluster-id: "1",
  zk.connect: "<IP-address>:2181"
}
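
With a three-node Zookeeper ensemble, zk.connect can list every node as a comma-separated string. The sketch below builds such a string, assuming all nodes use client port 2181 as in this guide; zk_connect is a hypothetical helper.

```shell
# Hypothetical helper: join node names into a comma-separated
# host:port list, assuming client port 2181 on every node.
zk_connect() {
  out=""
  for node in "$@"; do
    out="${out:+$out,}$node:2181"
  done
  echo "$out"
}

zk_connect node1 node2 node3
```

The printed string would go between the quotes of the zk.connect setting.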


Now you can run the drillbit and watch the log. For more drillbit options, refer to the drill-override-example.conf file.

apache-drill-1.15.0$ ./bin/drillbit.sh help
Usage: drillbit.sh [--config|--site <site-dir>] (start|stop|status|restart|run|graceful_stop) [args]


In one terminal, follow the log with the tail command

apache-drill-1.15.0$ tail -f log/drillbit.log

In another terminal, start the drillbit

apache-drill-1.15.0$ ./bin/drillbit.sh start


Starting drillbit, logging to /mnt/nvme0n1p3/Projects/Apache-Components-Build/drill/distribution/target/apache-drill-1.15.0/apache-drill-1.15.0/log/drillbit.out

apache-drill-1.15.0$ ./bin/drillbit.sh status

drillbit is running.


apache-drill-1.15.0$ ./bin/drillbit.sh graceful_stop
Stopping drillbit
...


You can either stop or do a graceful stop. The same steps can be repeated on more than one machine (node).

I was able to run Drill, access the web UI at http://IP-Address:8047, and run a sample query in distributed mode. To run a distributed cluster, I just need to do a similar setup on multiple machines (nodes). Reference - https://drill.apache.org/docs/starting-the-web-ui/
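
A sample query can also be sent from the command line through Drill's REST interface on the same port. The sketch below only builds the JSON request body. It assumes the /query.json endpoint with queryType and query fields, and drill_query_payload is a hypothetical helper that handles single-line queries only.

```shell
# Hypothetical helper: wrap a single-line SQL string in the JSON body
# expected by Drill's REST query endpoint (assumed: /query.json).
drill_query_payload() {
  # Escape backslashes first, then double quotes, for a valid JSON string.
  escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"queryType":"SQL","query":"%s"}\n' "$escaped"
}
```

It could then be posted with curl, for example:

$ curl -s -X POST -H "Content-Type: application/json" -d "$(drill_query_payload 'SELECT * FROM cp.`employee.json` LIMIT 5')" http://<IP-address>:8047/query.json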


If you are using CentOS 7, be careful, because connection errors may be caused by the firewall. I used the commands below to stop the firewall and open the Zookeeper port.

[centos@centos ~]$ sudo systemctl stop firewalld

[centos@centos ~]$ sudo firewall-cmd --zone=public --add-port=2181/udp --add-port=2181/tcp --permanent
[sudo] password for centos:
success

[centos@centos ~]$ sudo firewall-cmd --reload
success

[centos@centos ~]$ zkServer.sh restart
ZooKeeper JMX enabled by default
Using config: /home/centos/zookeeper-3.4.12/bin/../conf/zoo.cfg
ZooKeeper JMX enabled by default
Using config: /home/centos/zookeeper-3.4.12/bin/../conf/zoo.cfg
Stopping zookeeper ... STOPPED
ZooKeeper JMX enabled by default
Using config: /home/centos/zookeeper-3.4.12/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
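
If more than one port has to be opened, the --add-port options can be generated rather than typed. This is a sketch: firewall_port_args is a hypothetical helper, it emits TCP rules only, and which ports your cluster needs is an assumption to verify (2181 and 8047 are the ones mentioned on this page).

```shell
# Hypothetical helper: print one --add-port=<port>/tcp option per port.
firewall_port_args() {
  args=""
  for port in "$@"; do
    args="${args:+$args }--add-port=$port/tcp"
  done
  echo "$args"
}

firewall_port_args 2181 8047
```

The result can be passed on, e.g. sudo firewall-cmd --zone=public $(firewall_port_args 2181 8047) --permanent, followed by sudo firewall-cmd --reload.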


[centos@centos ~]$


REFERENCE:

https://stackoverflow.com/questions/13316776/zookeeper-connection-error

https://www.tutorialspoint.com/zookeeper/index.htm

https://blog.redbranch.net/2018/04/19/zookeeper-install-on-centos-7/

https://drill.apache.org/docs/distributed-mode-prerequisites/