This document is specific to the London RedCentric lab, but should evolve into a more generic setup once we have more labs. For now, there is some hard-coded logic in the wiki as well as in the scripts, to make sure we can reproduce at least the one lab we have. Once we have more labs, we'll work to automate that using configuration files, command-line options, etc.
London RedCentric
Our HPC Lab will use the 10.50.0.0/16 network, over a VPN dedicated to us. We will have no contact with any other lab, in or out.
The servers will receive static IP assignments in the 10.50.0.*/24 range, while the provisioner will work with IPs in the ranges:
- 10.50.10.*/20 to dynamic
- 10.50.20.*/20 to dynamic-reserved
- 10.50.30.*/20 to static
The second interface in the provisioner will be in a different subnet (via VLAN) with fixed IPs (because MrP still can't do DHCP) in the range 192.168.2.0/24. This is overly restrictive considering the ranges above, but it's enough for the London data-centre (we won't have more than 250 machines in there).
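To check which of these networks a given address belongs to, a small helper can be handy. The sketch below is plain POSIX shell; the function names `ip_to_int` and `ip_in_cidr` are ours and are not part of the lab scripts. It assumes well-formed dotted-quad addresses without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo "$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))"
)

# Return 0 if address $1 lies inside CIDR block $2 (e.g. 10.50.0.0/16).
ip_in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # Build the netmask from the prefix length.
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    [ "$(( $addr & $mask ))" -eq "$(( $net & $mask ))" ]
}
```

For example, `ip_in_cidr 10.50.0.3 10.50.0.0/16 && echo "inside the lab network"` should print the message, while the same test with an address outside the lab network should print nothing.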
The masters and benchmark machines will be provisioned by MrProvisioner, and the compute nodes will be provisioned by the master (e.g. with Warewulf or xCAT).
There will be a VLAN for each cluster, to allow internal communication without flooding the rest of the lab (including other clusters). These will be GbE, 10GbE or InfiniBand, in the ranges 172.16.0.0/15, 172.18.0.0/15 and 172.20.0.0/15 respectively, as each cluster can have more than one interconnect technology at the same time.
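The three /15 blocks are adjacent and do not overlap, which can be confirmed by printing the first and last address of each. This is a sketch in POSIX shell; the helper names (`ip_to_int`, `int_to_ip`, `cidr_range`) are ours and purely illustrative.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() (
    IFS=.
    set -- $1
    echo "$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))"
)

# Convert a 32-bit integer back to dotted-quad notation.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print "first last" for a CIDR block such as 172.16.0.0/15.
cidr_range() {
    bits=${1#*/}
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
    first=$(( $(ip_to_int "${1%/*}") & $mask ))
    # Set all host bits to get the last address of the block.
    last=$(( $first | (~$mask & 0xFFFFFFFF) ))
    echo "$(int_to_ip "$first") $(int_to_ip "$last")"
}
```

Running `cidr_range 172.16.0.0/15` prints `172.16.0.0 172.17.255.255`, so the next block can start at 172.18.0.0 without overlapping.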
Here's a diagram of the network:
Setting up the hpc-admin node
The hpc-admin node will be the physical server hosting the MrProvisioner and Jenkins services for the HPC lab.
The bare-metal installation is Debian 9 (stretch), hosting the two services in KVM/QEMU virtual machines for the moment (migration to Docker/containers will be possible once MrP support for containers is production-ready).
Install Debian as you normally would for a server, taking care to install the SSH server and to plan enough space for the Jenkins logs (a bare minimum of 500 GB for the Jenkins VM is desirable).
root@hpc-admin # apt update && apt upgrade
root@hpc-admin # apt install sudo git net-tools vim bridge-utils qemu-kvm libvirt-clients libvirt-daemon-system virtinst dirmngr build-essential
root@hpc-admin # echo "deb http://ppa.launchpad.net/ansible/ansible/ubuntu trusty main" >> /etc/apt/sources.list
root@hpc-admin # apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 93C4A3FD7BB9C367
root@hpc-admin # apt update && apt install ansible git qemu-kvm libvirt-clients libvirt-daemon-system
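A quick sanity check after the package installation could look like the sketch below. The helper name `missing_commands` is ours, and the command names in the usage line are the binaries we expect those packages to provide; adjust the list if your system differs.

```shell
# Print every command from the argument list that is not found in PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}
```

Calling `missing_commands git virsh virt-install brctl ansible-playbook` as root should print nothing when everything is in place; any name it prints points to a package that still needs installing.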
You now have a working bare-metal server running Debian 9 with all the appropriate utilities and tools.
To ssh directly into the server as root, you have to add your public SSH key to /root/.ssh/authorized_keys:
root@hpc-admin # ssh-keygen -t rsa -b 2048 -N "" -f /root/.ssh/id_rsa
root@hpc-admin # cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
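Note that `cp` replaces any authorized_keys file that already exists, which is fine on a fresh install. On a server that already has keys, an append-only variant may be safer. The following is a sketch; the function name `authorize_key` is hypothetical.

```shell
# Append the public key in file $1 to the authorized_keys file $2,
# unless that exact line is already present.
authorize_key() {
    pubkey_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    chmod 600 "$auth_file"
    # -x: whole-line match, -F: fixed string, -f: read pattern from file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
}
```

It would be called as `authorize_key /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys`, and running it twice leaves a single copy of the key in the file.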
Setting up the VMs
First you need to set up libvirt's bridge adaptor, so we can bind the VMs' network interfaces to it:
root@hpc-admin # vim /etc/network/interfaces
# auto enp0s25
# iface enp0s25 inet dhcp
auto br0
iface br0 inet static
    address 10.50.0.2
    netmask 255.255.0.0
    gateway 10.50.0.1
    dns-nameservers 10.50.0.1
    bridge_ports enp0s25
    bridge_stp off
    bridge_maxwait 0
    bridge_fd 0
root@hpc-admin # systemctl restart networking
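Before restarting the network it can help to read back what the file says for the bridge. The sketch below pulls a single option out of an interface stanza; `iface_option` is a hypothetical helper, and it assumes the simple one-option-per-line layout used above.

```shell
# Print the value of option $2 in the "iface $1" stanza of file $3.
iface_option() {
    awk -v iface="$1" -v key="$2" '
        # A new "iface" line starts a stanza; remember if it is ours.
        $1 == "iface" { in_stanza = ($2 == iface) }
        in_stanza && $1 == key { print $2; exit }
    ' "$3"
}
```

With the configuration above, `iface_option br0 address /etc/network/interfaces` should print 10.50.0.2, which can then be compared with the output of `ip addr show br0` once the network has been restarted.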
With the network in place, you can clone the HPC Lab Conf repository (you need access rights and your key registered in the private repo):
root@hpc-admin # git clone ssh://git@dev-private-git.linaro.org/hpc/labconf.git
root@hpc-admin # cd labconf/kvm
Then create the Jenkins VM:
root@hpc-admin # ./jenkins_virt_install.sh
TODO: Write a mrp_virt_install.sh and a preseed_mrp.cfg as above, using the virt-install command below
root@hpc-admin # virt-install --virt-type kvm \
    --name mrp-hpc \
    --memory 4096 \
    --vcpus 2 \
    --disk size=20 \
    --os-variant debian9 \
    --network bridge=br0 \
    --graphics none \
    --initrd-inject=preseed.cfg \
    --location http://cdn-fastly.deb.debian.org/debian/dists/stretch/main/installer-amd64/ \
    --extra-args "console=ttyS0,115200n8 serial"
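Until the real script exists, one possible shape for mrp_virt_install.sh is sketched below. Everything here is an assumption: the variable names, the `run` and `install_mrp_vm` helpers and the dry-run switch are ours, and only the virt-install options are taken from the command above. It is not modelled on jenkins_virt_install.sh, whose contents are not shown on this page.

```shell
#!/bin/sh
# Hypothetical mrp_virt_install.sh sketch. Defaults mirror the
# virt-install command on this page; override them from the environment.
VM_NAME=${VM_NAME:-mrp-hpc}
VM_MEMORY=${VM_MEMORY:-4096}
VM_VCPUS=${VM_VCPUS:-2}
VM_DISK_GB=${VM_DISK_GB:-20}
VM_BRIDGE=${VM_BRIDGE:-br0}
PRESEED=${PRESEED:-preseed.cfg}
MIRROR=${MIRROR:-http://cdn-fastly.deb.debian.org/debian/dists/stretch/main/installer-amd64/}

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Create the MrProvisioner VM with the settings above.
install_mrp_vm() {
    run virt-install --virt-type kvm \
        --name "$VM_NAME" \
        --memory "$VM_MEMORY" \
        --vcpus "$VM_VCPUS" \
        --disk "size=$VM_DISK_GB" \
        --os-variant debian9 \
        --network "bridge=$VM_BRIDGE" \
        --graphics none \
        --initrd-inject="$PRESEED" \
        --location "$MIRROR" \
        --extra-args "console=ttyS0,115200n8 serial"
}
```

A real script would end with a call to `install_mrp_vm`. Setting `DRY_RUN=1` first prints the command instead of running it, which is useful for reviewing the options before creating the VM.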
For both MrProvisioner and Jenkins, the preseed will set up static IPs (10.50.0.3 and 10.50.0.4 respectively), and they should be visible from the wider network, including the host. This is done to simplify VM migration and a potential new installation on a different server.
Update the Ansible host configuration:
root@hpc-admin # vim /etc/ansible/hosts
[jenkins]
10.50.0.4
[infra_servers]
10.50.0.3
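To double-check which hosts ended up in which group, the inventory can be read back with a few lines of awk. This is a sketch for the simple INI layout above; `inventory_hosts` is a hypothetical helper and does not handle `:vars` or `:children` sections.

```shell
# Print the hosts listed under group $1 in the INI inventory file $2.
inventory_hosts() {
    awk -v group="[$1]" '
        # A section header switches the current group.
        /^\[/ { in_group = ($1 == group); next }
        # Skip blank lines and comments inside the group.
        in_group && NF && $1 !~ /^[#;]/ { print $1 }
    ' "$2"
}
```

Here `inventory_hosts jenkins /etc/ansible/hosts` should print 10.50.0.4. Once the VMs are up, `ansible all -m ping -u root` is a quick way to confirm that Ansible can actually reach them.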
Installing the Jenkins service
Clone the Ansible repository to set up Jenkins:
root@hpc-admin # git clone https://github.com/BaptisteGerondeau/ans_setup_jenkins.git
Copy the secret files from our private repo there:
root@hpc-admin # git clone ssh://git@dev-private-git.linaro.org/hpc/labconf.git
root@hpc-admin # cp -r labconf/roles/ ans_setup_jenkins/
Then run Ansible and wait until it exits with no errors:
root@hpc-admin # cd ans_setup_jenkins && ansible-playbook configure-jenkins.yml -vvv -u root
Ansible will start Jenkins automatically, so you should be able to just open the URL in your browser (assuming you have a route to the machine's IP):
If your Linaro login belongs to the hpc-sig-admin group, you can log in directly with your email and Linaro password, as Jenkins is connected to LDAP.
BE CAREFUL: Jenkins is not yet using SSL, so your password will be sent in plain text. Only use this if you are inside a VPN or on an isolated network.
You may get two warnings when you log in to Jenkins, which can be corrected on the Global Security screen:
- Agent to master security subsystem is currently off: Check the box saying "Enable Agent → Master Access Control"
- Jenkins instance uses deprecated protocols: JNLP3-connect: Uncheck the box "Java Web Start Agent Protocol/3" in "Agent Protocols"
Save the configuration and you should be all set.
Installing the MrP service
Add your mrp-hpc IP to /etc/ansible/hosts under the tag "[infra_servers]"
Then:
root@hpc-admin # git clone https://github.com/niedbalski/infra-automation.git && cd infra-automation/ansible
root@hpc-admin # ansible-playbook playbooks/infra-server.yml -vvv -u root
Then you can log in to MrP on port 5000 (or whatever you set it to in the Ansible playbook) with the usual first-install credentials (admin:linaro).