Introduction
...
- Drill distribution archive: The original .tar.gz file for your Drill distribution. Drill-on-YARN uploads this archive to your distributed file system (DFS). YARN downloads (localizes) it to each worker node.
- Drill site directory: A directory that contains your Drill configuration and custom jar files. Drill-on-YARN copies this directory to each worker node.
- Configuration: A configuration file that tells Drill-on-YARN how to manage your Drill cluster. This file is separate from your configuration files for Drill itself.
- Drill-on-YARN client: A command-line program to start, stop, and monitor your YARN-managed Drill cluster.
- Drill Application Master (AM): The software that works with YARN to request resources, launch Drillbits, and so on. The AM provides a web UI to manage your Drill cluster.
- Drillbit: The Drill daemon software that YARN runs on each node.
Steps of creating a
...
Drill-Yarn Cluster
...
Create a Master Directory
...
Code Block | ||
---|---|---|
| ||
export MASTER_DIR=/path/to/master/dir
mkdir $MASTER_DIR
cd $MASTER_DIR
- Unpack the archive to create $DRILL_HOME.
- Create the site directory with the required configuration files.
Install Drill
Follow the Drill Arm64 install directions to install Drill on your client host:
1. Select a Drill version. The name is used in multiple places below. For convenience, define an environment variable for the name:
...
Code Block | ||
---|---|---|
| ||
tar -xzf $DRILL_NAME.tar.gz
3. To make the remaining steps easier to follow, refer to your expanded Drill folder as $DRILL_HOME:
Code Block | ||
---|---|---|
| ||
export DRILL_HOME=$MASTER_DIR/$DRILL_NAME
Your master directory should now contain the original Drill archive along with an expanded copy of that archive.
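A quick way to confirm that layout before moving on could look like the sketch below. The function name `check_master_dir` is hypothetical and not part of Drill, and the check assumes the archive is named after the Drill version with a `.tar.gz` suffix, as in the steps above.

```shell
# Hypothetical helper (not part of Drill): verify the master directory layout.
# Usage: check_master_dir <master-dir> <drill-name>
check_master_dir() {
    master_dir=$1
    drill_name=$2
    status=0

    # The original archive must still be present; Drill-on-YARN uploads it to DFS.
    if [ ! -f "$master_dir/$drill_name.tar.gz" ]; then
        echo "missing archive: $master_dir/$drill_name.tar.gz" >&2
        status=1
    fi

    # The expanded copy of the archive is what $DRILL_HOME points to.
    if [ ! -d "$master_dir/$drill_name" ]; then
        echo "missing expanded directory: $master_dir/$drill_name" >&2
        status=1
    fi

    return "$status"
}
```

It would typically be called as `check_master_dir "$MASTER_DIR" "$DRILL_NAME"`; a non-zero exit status means one of the two items is missing.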
Create the Site Directory
The site directory contains your site-specific files for Drill. If you are converting an existing Drill install, see the “Site Directory” section.
...
Code Block | ||
---|---|---|
| ||
export DRILL_SITE=$MASTER_DIR/site
mkdir $DRILL_SITE
When you do a fresh install, Drill includes a conf directory under $DRILL_HOME. Use the files in that directory to create your site directory.
Code Block | ||
---|---|---|
| ||
cp $DRILL_HOME/conf/drill-override-example.conf $DRILL_SITE/drill-override.conf
cp $DRILL_HOME/conf/drill-on-yarn-example.conf $DRILL_SITE/drill-on-yarn.conf
cp $DRILL_HOME/conf/drill-env.sh $DRILL_SITE
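After copying, it may help to confirm that the site directory holds all three files. The following is a minimal sketch; `check_site_dir` is a hypothetical name, not a Drill tool, and the file list simply mirrors the copy commands above.

```shell
# Hypothetical helper (not part of Drill): confirm the site directory holds
# the three files copied in the step above.
# Usage: check_site_dir <site-dir>
check_site_dir() {
    site_dir=$1
    missing=0
    for f in drill-override.conf drill-on-yarn.conf drill-env.sh; do
        if [ ! -f "$site_dir/$f" ]; then
            echo "missing: $site_dir/$f" >&2
            missing=$((missing + 1))
        fi
    done
    # Succeed only when no file is missing.
    [ "$missing" -eq 0 ]
}
```

For example, `check_site_dir "$DRILL_SITE"` exits non-zero and names each absent file if a copy was skipped.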
Drill Resource Configuration
Drill-on-YARN uses a different mechanism to set the Drillbit resource values, such as heap and direct memory. You set the values in drill-on-yarn.conf,
then Drill-on-YARN copies them into the environment variables when launching each Drillbit.
drill-override.conf:
Code Block | ||
---|---|---|
| ||
drill.exec: {
  cluster-id: "drillbits1"
  zk: {
    connect: "node1:2181,node2:2181,node3:2181",
    root: "drill",
    refresh: 500,
    timeout: 5000,
    retry: {
      count: 7200,
      delay: 500
    }
  }
}
drill-on-yarn.conf:
Code Block | ||
---|---|---|
| ||
drill.yarn: {
  app-name: "Drill-on-YARN"
  dfs: {
    connection: "hdfs://node1:9000/"
    app-dir: "hdfs://node1:9000/users/drill"
  }
  yarn: {
    queue: "default"
  }
  drill-install: {
    client-path: "/home/admin/drill/apache-drill-1.15.0.tar.gz"
  }
  am: {
    heap: "450M"
    memory-mb: 512
  }
  http: {
    port: 12345
    auth-type: "simple"
    user-name: "admin"
    password: "admin"
    rest-key: ""
  }
  drillbit: {
    heap: "3G"
    max-direct-memory: "1G"
    code-cache: "1G"
    memory-mb: 4096
    vcores: 2
    # disks: 3
    classpath: ""
  }
  cluster: [
    {
      name: "drill-group1"
      type: "basic"
      count: 3
    }
  ]
}
...
- Label configuration is disabled in YARN, so we also set the 'type' field to basic, not label, in drill-on-yarn.conf.
- The default key name user_name in drill-on-yarn.conf is not correct. Change it to 'user-name'.
- 'app-dir' should be an absolute path such as 'hdfs://node1:9000/users/drill' in drill-on-yarn.conf. Note that the Apache Drill documentation is not correct on this point.
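The notes above can be turned into a rough pre-flight check. The sketch below is an illustration only: `lint_drill_on_yarn_conf` is a hypothetical name, the checks are plain text matching rather than real HOCON parsing, and it assumes the file is laid out one key per line. Treating "absolute path" as "starts with a URI scheme such as hdfs://" is an interpretation of the note, not a rule taken from Drill.

```shell
# Hypothetical helper (not part of Drill): flag the three pitfalls noted above.
# Usage: lint_drill_on_yarn_conf <path-to-drill-on-yarn.conf>
lint_drill_on_yarn_conf() {
    conf=$1
    problems=0
    # Drop '#' comments so commented-out keys are ignored.
    # Simplification: a '#' inside a quoted value would be cut as well.
    body=$(sed 's/#.*$//' "$conf")

    # The key must be spelled 'user-name', not 'user_name'.
    if printf '%s\n' "$body" | grep -q 'user_name'; then
        echo "found 'user_name'; the key should be 'user-name'" >&2
        problems=$((problems + 1))
    fi

    # 'app-dir' should start with a scheme, e.g. hdfs://node1:9000/users/drill.
    if printf '%s\n' "$body" | grep 'app-dir:' |
        grep -Eqv 'app-dir:[[:space:]]*"[a-z][a-z0-9+.-]*://'; then
        echo "'app-dir' does not look like a full URI" >&2
        problems=$((problems + 1))
    fi

    # Every 'type:' key (but not 'auth-type:') should be "basic".
    if printf '%s\n' "$body" | grep -E '(^|[[:space:]])type:' |
        grep -qv '"basic"'; then
        echo "found a 'type' other than \"basic\"" >&2
        problems=$((problems + 1))
    fi

    [ "$problems" -eq 0 ]
}
```

Running `lint_drill_on_yarn_conf "$DRILL_SITE/drill-on-yarn.conf"` prints one message per suspected problem and exits non-zero if any were found.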
We should disable the 'vmem-check' in yarn-site.xml:
Code Block | ||
---|---|---|
| ||
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>false</value>
</property>
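To confirm the setting is in place, a text-level check along these lines may be enough. It is a sketch under assumptions: `vmem_check_disabled` is a hypothetical name, and the check expects the `<value>` element on the same line as the `<name>` element or on the line directly after it. A different layout would need a real XML parser.

```shell
# Hypothetical helper: succeed when yarn-site.xml sets
# yarn.nodemanager.vmem-check-enabled to false.
# Usage: vmem_check_disabled <path-to-yarn-site.xml>
vmem_check_disabled() {
    yarn_site=$1
    # Take the <name> line plus the line after it, then look for a false value.
    grep -F -A1 '<name>yarn.nodemanager.vmem-check-enabled</name>' "$yarn_site" |
        grep -q '<value>[[:space:]]*false[[:space:]]*</value>'
}
```

For example: `vmem_check_disabled "$HADOOP_CONF_DIR/yarn-site.xml" || echo "vmem check is still enabled"`.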
...
Code Block | ||
---|---|---|
| ||
$HADOOP_HOME/etc/hadoop
Hadoop and Drill environment variables list
Code Block | ||
---|---|---|
| ||
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-arm64
export HADOOP_HOME=/usr/lib/hadoop-2.8.4
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_YARN_HOME=/usr/lib/hadoop-2.8.4
export YARN_CONF_DIR=$HADOOP_YARN_HOME/etc/hadoop
export PATH=$HADOOP_HOME/bin:$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$HADOOP_HOME/bin
export MASTER_DIR=/home/linaro/drill-setup/drill-master
export DRILL_HOME=$MASTER_DIR/apache-drill-1.15.0
export DRILL_SITE=$MASTER_DIR/site
export PROD_DRILL_HOME=/home/linaro/drill-setup/drill/distribution/target/apache-drill-1.15.0/apache-drill-1.15.0
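Because a mistyped path in this list tends to surface only later as a launch failure, a small sanity check can be useful. The sketch below assumes the variables have been exported as shown; `check_env_dirs` is a hypothetical helper name, and it only tests that each variable is set and names an existing directory.

```shell
# Hypothetical helper: check that each named variable is exported and
# points to an existing directory.
# Usage: check_env_dirs VAR...
check_env_dirs() {
    bad=0
    for name in "$@"; do
        # printenv prints the value of an exported variable and fails if it is unset.
        if ! value=$(printenv "$name"); then
            echo "$name is not set" >&2
            bad=$((bad + 1))
        elif [ ! -d "$value" ]; then
            echo "$name points to a missing directory: $value" >&2
            bad=$((bad + 1))
        fi
    done
    # Succeed only when every variable passed both checks.
    [ "$bad" -eq 0 ]
}
```

A typical call would be `check_env_dirs JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR YARN_CONF_DIR DRILL_HOME DRILL_SITE`.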
...
Code Block | ||
---|---|---|
| ||
$DRILL_HOME/bin/drill-on-yarn.sh --site $DRILL_SITE start
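Since every client invocation repeats the same script path and `--site` argument, a thin wrapper can reduce typing mistakes. This is a sketch, not part of Drill: `drill_yarn` is a hypothetical function name, and it only checks that the script and site directory exist before passing its arguments through unchanged.

```shell
# Hypothetical wrapper (not part of Drill): run the Drill-on-YARN client with
# the site directory filled in, after checking that both paths exist.
# Usage: drill_yarn <command>    e.g. drill_yarn start
drill_yarn() {
    script="$DRILL_HOME/bin/drill-on-yarn.sh"
    if [ ! -x "$script" ]; then
        echo "client script not found or not executable: $script" >&2
        return 1
    fi
    if [ ! -d "$DRILL_SITE" ]; then
        echo "site directory not found: $DRILL_SITE" >&2
        return 1
    fi
    # Pass the command (and any further arguments) straight to the client.
    "$script" --site "$DRILL_SITE" "$@"
}
```

With this in place, `drill_yarn start` is equivalent to the command above.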