
New Lab Setup


This document is specific to the London RedCentric lab, but should evolve into a more generic setup once we have more labs. For now, there is some hard-coded logic in the wiki as well as in the scripts, to make sure we can reproduce at least the one lab we have. Once we have more labs, we'll work to automate that using configuration files, command line options, etc.

London RedCentric

Our HPC Lab will be using the 10.40.0.0/16 network, using a VPN just for us. We will have no contact with any other lab, in or out.

The servers will receive static IP assignments in the 10.40.16.0/20 range, while the provisioner will work with IPs in the following ranges:

  • 10.40.16.0/22 for dynamic
  • 10.40.20.0/22 for dynamic-reserved
  • 10.40.24.0/22 for static
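
As a quick sanity check of the address plan above, the sketch below classifies an IPv4 address into one of the three provisioner pools. The helper names (ip_to_int, pool_for_ip) are hypothetical and not part of labconf, and the code assumes each pool is a /22 block starting at the base address listed above.

```shell
#!/bin/sh
# Sketch only: ip_to_int and pool_for_ip are illustrative helpers, not labconf code.
# Assumption: each pool is a /22 (1024 addresses) starting at the listed base.

# Convert a dotted-quad IPv4 address to an integer.
# Meant to be called inside $(...), so the IFS change stays in the subshell.
ip_to_int() {
    IFS=.
    set -- $1
    echo $(( $1 * 16777216 + $2 * 65536 + $3 * 256 + $4 ))
}

# Print the pool an address belongs to, or "none" if it is outside all three.
pool_for_ip() {
    addr=$(ip_to_int "$1")
    for entry in dynamic:10.40.16.0 dynamic-reserved:10.40.20.0 static:10.40.24.0; do
        name=${entry%%:*}
        base=$(ip_to_int "${entry##*:}")
        if [ "$addr" -ge "$base" ] && [ "$addr" -lt $((base + 1024)) ]; then
            echo "$name"
            return 0
        fi
    done
    echo "none"
    return 1
}
```

For example, `pool_for_ip 10.40.21.7` prints the pool name for that address, which can help when debugging a lease that landed in an unexpected range.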

The second interface in the provisioner will be in a different sub-net (via VLAN) with fixed IPs (because MrP still can't DHCP) in the range 10.41.0.0/16. This is overly restrictive considering the ranges above, but it's enough for the London data-centre (we won't have more than 250 machines in there).

The masters and benchmark machines will be provisioned by MrProvisioner, and the compute nodes will be provisioned by the master (e.g. Warewulf, xCAT).

There will be a VLAN for each cluster, to allow internal communications without flooding the rest of the lab (including other clusters). These will be Gigabit Ethernet, 10GbE or InfiniBand, in the ranges 192.168.2.0/24, 192.168.3.0/24 and 192.168.4.0/24 respectively, as each cluster can have more than one interconnect technology at the same time.

Here's a diagram of the network:

Setting up the hpc-admin node

The hpc-admin node will be the physical server hosting the MrProvisioner and Jenkins services for the HPC lab.
The bare-metal installation is Debian 9 (stretch), hosting the two services under KVM/QEMU for the moment (migration to Docker/containers will be possible once MrP support for containers is production-ready).

Required Packages and repos

Install Debian as you normally would for a server, taking care to install the SSH server and to plan enough space for the Jenkins logs (a bare minimum of 500 GB for the Jenkins VM is desirable).
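
To confirm the disk space requirement above before creating any VM, something like the following sketch could be used. The helper name has_space is hypothetical, and the path in the usage comment is only an assumption about where the VM images will live.

```shell
#!/bin/sh
# Sketch only: has_space is an illustrative helper, not part of labconf.
# Usage: has_space <path> <min_gb>
# Succeeds if the filesystem holding <path> has at least <min_gb> GiB available.
has_space() {
    # df -P gives one line per filesystem; column 4 is the available space in KiB (-k).
    avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}

# Example (the path is an assumption, adjust it to your image directory):
#   has_space /var/lib/libvirt/images 500 || echo "not enough space for the Jenkins VM"
```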

The first step is to checkout our private configuration repository (you'll need to be in the hpc-sig-admin LDAP group):

root@hpc-admin # apt update && apt install -y git
root@hpc-admin # git clone ssh://git@dev-private-git.linaro.org/hpc/labconf.git
root@hpc-admin # cd ~/labconf && git submodule update --init --recursive

Once the repos are checked out, update the system and install the required packages:

root@hpc-admin # cd ~/labconf/packages
root@hpc-admin # ./install_packages.sh

You now have a working bare-metal server running Debian 9 with all the appropriate utilities and tools.

Network Configuration

For the VMs to work on the two network interfaces of the host, we need to create a bridge on each interface and assign the required static IPs, enable IP forwarding, create the SSH keys and set up Ansible's hosts file.

This is all done by the network_setup.sh script in our labconf repository:

root@hpc-admin # cd ~/labconf/network
root@hpc-admin # ./network_setup.sh <IF0> <IF1>

Change IF0 to your primary interface (the one connected to the firewall / VPN) and IF1 to the one that will be connected to the BMCs (via the MrP VM).

Warning: This script will restart your network. It has been tested remotely (via SSH), but you may want to have a physical terminal nearby just in case.
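
Since the script restarts the network, it may be worth checking the two interface names before running it. The sketch below is one way to do that. check_ifaces and the SYSFS_NET override are hypothetical and are not part of network_setup.sh.

```shell
#!/bin/sh
# Sketch only: check_ifaces is an illustrative pre-flight check, not labconf code.
# Usage: check_ifaces <IF0> <IF1>
# SYSFS_NET is a hypothetical override, used only to test the function.
check_ifaces() {
    if [ "$#" -ne 2 ]; then
        echo "usage: check_ifaces <IF0> <IF1>" >&2
        return 2
    fi
    if [ "$1" = "$2" ]; then
        echo "IF0 and IF1 must be different interfaces" >&2
        return 1
    fi
    for iface in "$1" "$2"; do
        # Linux exposes every network interface as an entry under /sys/class/net.
        if [ ! -e "${SYSFS_NET:-/sys/class/net}/$iface" ]; then
            echo "interface not found: $iface" >&2
            return 1
        fi
    done
}

# Example:
#   check_ifaces <IF0> <IF1> && ./network_setup.sh <IF0> <IF1>
```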

Setting up the VMs

With the network in place, you can create the VMs.

root@hpc-admin # cd ~/labconf/kvm
root@hpc-admin # ./jenkins_virt_install.sh
root@hpc-admin # ./mrp_virt_install.sh
root@hpc-admin # ./fileserver_virt_install.sh

For all of them, the preseed will set up static IPs (10.40.0.11, 10.40.0.12 and 10.40.0.13 respectively), and they should be visible from the wider network, including the host.

This is done to simplify VM migration and a potential new installation on a different server.

The network setup step above assumes the same IPs, so everything is fixed. In time we'll use configuration files so you don't have to change too many scripts.
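
Once the VMs are installed, the three static addresses listed above can be probed from the host. This is a minimal sketch: check_vms and the PING_CMD override are hypothetical, and the ping options assume the iputils ping shipped with Debian.

```shell
#!/bin/sh
# Sketch only: check_vms is an illustrative helper, not part of labconf.
# PING_CMD is a hypothetical override so the probe command can be swapped out.
check_vms() {
    failed=0
    for ip in 10.40.0.11 10.40.0.12 10.40.0.13; do
        # One echo request, two second timeout (iputils ping options).
        if ${PING_CMD:-ping -c 1 -W 2} "$ip" >/dev/null 2>&1; then
            echo "$ip reachable"
        else
            echo "$ip UNREACHABLE"
            failed=1
        fi
    done
    return "$failed"
}

# Example:
#   check_vms || echo "at least one VM did not answer"
```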

Installing the MrP service

You need to run both KEA and MrP roles to install a fully working provisioner. This can be done via the infra-server playbook:

root@hpc-admin # cd ~/labconf/ans_setup_mrp/
root@hpc-admin # ./pre-setup.sh
root@hpc-admin # ansible-playbook mrp_setup.yml -vvv -u root

Ansible will start MrProvisioner automatically, so you should be able to just open the URL on your browser (assuming you have a route to the machine's IP):

The default authentication is admin:linaro. Please change it as soon as possible.

ISSUES:

  • Network setup for BMC network has conflicting static and dynamic ranges. This needs fixing.
  • KEA is built on the machine every time. This is ridiculously slow, but we need KEA 1.2 and Debian only has 1.1. We need to find/create a package.

Installing the Jenkins service

Run Ansible and wait until it exits with no errors:

root@hpc-admin # cd ~/labconf/ans_setup_jenkins
root@hpc-admin # ansible-playbook configure-jenkins.yml -vvv -u root

Ansible will start Jenkins automatically, so you should be able to just open the URL on your browser (assuming you have a route to the machine's IP):

If your Linaro login belongs to the hpc-sig-admin group, you can log in directly with your email and Linaro password, as Jenkins is connected to LDAP.

BE CAREFUL: Jenkins is not yet using SSL, so your password will be sent in plain text. Only use this if you are inside a VPN or on an isolated network.

You may get up to three warnings when you log in to Jenkins. The first needs a file edit; the other two can be corrected on the Global Security screen:

  • ERROR in config.xml: Jenkins may complain "version 1.1" is not supported, only 1.0. Editing /var/lib/jenkins/config.xml and changing that on the first line seems to work.
  • Agent to master security subsystem is currently off: Go to Security Settings and check the box saying "Enable Agent → Master Access Control"
  • Jenkins instance uses deprecated protocols: JNLP3-connect: Go to Security Settings > Agents and clear the box "Java Web Start Agent Protocol/3" in "Agent Protocols"
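
The config.xml edit from the first item above can be scripted, for instance as in the sketch below. fix_xml_version is a hypothetical helper; it assumes the version appears in the XML declaration on line 1 and that GNU sed is available (as on Debian). Jenkins presumably needs a restart to pick up the change.

```shell
#!/bin/sh
# Sketch only: fix_xml_version is an illustrative helper, not part of labconf.
# Usage: fix_xml_version [path]   (defaults to /var/lib/jenkins/config.xml)
# Rewrites version 1.1 to 1.0 in the XML declaration on line 1 only,
# keeping the original file as <path>.bak.
fix_xml_version() {
    cfg=${1:-/var/lib/jenkins/config.xml}
    # Accept either quote style around the version number.
    sed -i.bak "1s/version=\(['\"]\)1\.1\(['\"]\)/version=\11.0\2/" "$cfg"
}
```

Restricting the substitution to line 1 avoids touching any other "1.1" string further down in the file.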

Themes: Install the Simple Theme Plugin and choose one from the list by updating the theme URL in the general settings.

Save the configuration and you should be all set.

Installing the Jenkins Jobs

Clone the repository

root@hpc-admin # git clone https://github.com/Linaro/hpc_lab_jenkins.git
root@hpc-admin # cd hpc_lab_jenkins

Create the authorisation files

You need to find your API token in Jenkins. That's done by clicking on your username (top right corner) > Configure > API Token > Show API Token.

This will show your user ID and token.

hpc_lab_jenkins/vars/jenkins_cred.yml.secret:
user: user@linaro.org
password: {TOKEN}
url: http://10.40.0.12:8080

NOTE: The API TOKEN is the one from the admin user, not regular users.

Also, you need a token for Mr-Provisioner, to upload the preseeds. If you haven't got one yet, create it on the UI by clicking on your username's link (top right) > Tokens > "+". This will create a token.

Create a new file and copy the token hash into it.

hpc_lab_jenkins/vars/mrp_creds.yml.secret:
mr_provisioner_auth_token: {TOKEN}
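
Because both secret files hold credentials, it seems sensible to create them readable by their owner only. A minimal sketch follows; write_secret is a hypothetical helper and the {TOKEN} placeholder is kept exactly as above.

```shell
#!/bin/sh
# Sketch only: write_secret is an illustrative helper, not part of hpc_lab_jenkins.
# Usage: write_secret <path>   (file content is read from stdin)
write_secret() {
    (
        umask 077          # files created in this subshell get mode 0600
        cat > "$1"
    )
    chmod 600 "$1"         # also tighten a file that already existed
}

# Example:
#   write_secret vars/mrp_creds.yml.secret <<EOF
#   mr_provisioner_auth_token: {TOKEN}
#   EOF
```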

Run the playbook

Since we're pushing changes to Mr-Provisioner, we've added its Ansible as a submodule. So first, you need to update it:

root@hpc-admin # git submodule update --init --recursive

Then run the playbook:

root@hpc-admin # ansible-playbook -vvv -u root hpc_jobs_deploy.yml

Create the user accounts

Mr-Provisioner

You need to create the users' accounts by hand in MrP, add their SSH keys and generate their MrP tokens. This requirement will be dropped when bug 102 is fixed.

Jenkins

Then add those tokens to the list in hpc_lab_job_deploy/vars/jslave_tokens.yml.secret in the following format:

jslave_tokens:
  - jslave: jslave-d03-benchmark
    token: APITOKEN
  - jslave: jslave-d03-openhpc
    token: APITOKEN
etc...

and then run the playbook:

root@hpc-admin # ansible-playbook -vvv -u root put_mrp_tokens.yml
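
With many slaves, the token list in the format shown above could be generated instead of typed by hand. The sketch below assumes a plain text input with one "name token" pair per line. Both that input format and the emit_jslave_tokens helper are hypothetical.

```shell
#!/bin/sh
# Sketch only: emit_jslave_tokens is an illustrative helper, not part of the repo.
# Reads "name token" pairs from stdin (one per line) and prints the YAML list.
emit_jslave_tokens() {
    echo "jslave_tokens:"
    while read -r name token; do
        # Skip blank lines.
        [ -n "$name" ] || continue
        printf '  - jslave: %s\n    token: %s\n' "$name" "$token"
    done
}

# Example (tokens.txt is a hypothetical file):
#   emit_jslave_tokens < tokens.txt > vars/jslave_tokens.yml.secret
```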

Now that the SSH Keys, Tokens and accounts are in place, all you have to do is assign the slaves to the right machines and ensure that you use the jinja templating in the preseeds in MrP.

File Server

The file server VM setup job handles the user creation + SSH keys copy, but this will move to the hpc_lab_jenkins repo.

Updating Jenkins Jobs

Once the jobs are installed and working, on every change pertaining to the Jenkins configuration, you just need to update the repo and run the same playbook again:

root@hpc-admin # cd hpc_lab_jenkins
root@hpc-admin # git fetch -a && git pull
root@hpc-admin # ansible-playbook -vvv -u root hpc_jobs_deploy.yml
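
If this update is run often, the steps above can be wrapped so that a failing step stops the sequence. This is only a sketch: update_jobs, run and the DRY_RUN variable are hypothetical names. The separate fetch is left out because git pull already fetches.

```shell
#!/bin/sh
# Sketch only: run, update_jobs and DRY_RUN are illustrative, not part of the repo.

# Run a command, or just print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Usage: update_jobs [repo_dir]   (defaults to hpc_lab_jenkins)
# Each step only runs if the previous one succeeded.
update_jobs() {
    repo=${1:-hpc_lab_jenkins}
    run cd "$repo" &&
    run git pull &&
    run ansible-playbook -vvv -u root hpc_jobs_deploy.yml
}

# Example:
#   DRY_RUN=1 update_jobs    # show the commands without running them
```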


