This document is specific to the London RedCentric lab, but should evolve into a more generic setup once we have more labs. For now, some logic is hard-coded in the wiki as well as in the scripts, to make sure we can reproduce at least the one lab we have. Once we have more labs, we'll work to automate this using configuration files, command-line options, etc.
London RedCentric
Our HPC Lab will use the 10.50.0.0/16 network, over a VPN dedicated to us. We will have no contact with any other lab, inbound or outbound.
The servers will receive static IP assignments in the 10.50.0.*/24 range, while the provisioner will work with IPs in the ranges:
- 10.50.10.*/20 for dynamic
- 10.50.20.*/20 for dynamic-reserved
- 10.50.30.*/20 for static
The second interface in the provisioner will be in a different subnet (via VLAN) with fixed IPs (because MrP still can't do DHCP) in the range 192.168.2.0/24. This is overly restrictive considering the ranges above, but it's enough for the London data-centre (we won't have more than 250 machines in there).
The masters and benchmark machines will be provisioned by MrProvisioner and the compute nodes will be provisioned by the master (e.g. Warewulf, xCAT).
There will be a VLAN for each cluster, to allow internal communications without flooding the rest of the lab (including other clusters). These will be Gigabit Ethernet (GbE), 10GbE or InfiniBand, in the ranges 172.16.0.0/15, 172.18.0.0/15 and 172.20.0.0/15 respectively, as each cluster can have more than one interconnect technology at the same time.
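To sanity-check that an address falls inside one of the ranges described above, a small POSIX shell helper along these lines may be useful. This is only a sketch: the function names `ip_to_int` and `in_cidr` are hypothetical and not part of the lab scripts, and it assumes plain dotted-quad IPv4 addresses without leading zeros.

```shell
# Hypothetical helpers, not part of the lab scripts.

# Convert a dotted-quad IPv4 address (e.g. 10.50.0.1) to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    # Split the address on dots into the function's positional parameters.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeed (exit status 0) if address $1 lies inside CIDR block $2.
in_cidr() {
    addr=$(ip_to_int "$1")
    net=$(ip_to_int "${2%/*}")
    bits=${2#*/}
    # Build the netmask: the top $bits bits set, the rest cleared.
    mask=$(( (4294967295 << (32 - bits)) & 4294967295 ))
    [ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Example usage:
#   in_cidr 10.50.0.12 10.50.0.0/16 && echo "inside the lab network"
#   in_cidr 172.19.4.2 172.18.0.0/15 && echo "on the 10GbE range"
```

Both functions only do integer arithmetic, so they can be run on any machine without touching the network configuration.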
Here's a diagram of the network:
Setting up the hpc-admin node
...
root@hpc-admin # apt update && apt upgrade
root@hpc-admin # apt install sudo git net-tools vim bridge-utils qemu-kvm libvirt-clients libvirt-daemon-system virtinst dirmngr build-essential
root@hpc-admin # echo "deb http://ppa.launchpad.net/ansible/ansible/ubuntu trusty main" >> /etc/apt/sources.list
root@hpc-admin # apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 93C4A3FD7BB9C367
root@hpc-admin # apt update && apt install ansible git qemu-kvm libvirt-clients libvirt-daemon-system
You now have a working bare-metal server running Debian 9 with all the appropriate utilities and tools.
NOTE: To SSH into the server directly as root, you have to add your public SSH key to /root/.ssh/authorized_keys:
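As a quick check that the packages above landed correctly, something like the following could be run on the node. It is a minimal sketch: `missing_tools` is a hypothetical helper name, and the list of commands in the usage comment is an assumption about which binaries matter most (`virsh` comes from libvirt-clients, `virt-install` from virtinst, `brctl` from bridge-utils).

```shell
# Hypothetical helper: print the name of every command in the argument
# list that cannot be found in PATH. Prints nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example usage:
#   missing_tools git ansible virsh virt-install brctl
```

An empty output suggests the tools are all available. Any name printed points to a package that may need reinstalling.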
root@hpc-admin # ssh-keygen -t rsa -b 2048 -N "" -f /root/.ssh/id_rsa
root@hpc-admin # cp /root/.ssh/id_rsa.pub /root/.ssh/authorized_keys
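If several people need root access, appending keys rather than overwriting the file may be preferable. The sketch below shows one way to do that idempotently. The function name `add_authorized_key` and the key path in the usage comment are hypothetical and not part of the lab scripts.

```shell
# Hypothetical helper: append the public key stored in file $1 to the
# authorized_keys file $2, unless an identical line is already there.
add_authorized_key() {
    pub_key_file=$1
    auth_file=$2
    # sshd typically expects restrictive permissions on these paths.
    mkdir -p "$(dirname "$auth_file")"
    chmod 700 "$(dirname "$auth_file")"
    touch "$auth_file"
    chmod 600 "$auth_file"
    key=$(cat "$pub_key_file")
    # -x: match whole line, -F: fixed string, -q: quiet.
    grep -qxF -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}

# Example usage (the key path is an assumption):
#   add_authorized_key /tmp/alice_id_rsa.pub /root/.ssh/authorized_keys
```

Running it twice with the same key leaves a single entry in the file, so it can safely be repeated.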
Setting up the VMs
First you need to set up libvirt's bridge adaptor, so we can bind the VMs' network interfaces to it:
...
You may get two warnings when you log in to Jenkins, which can be corrected on the Global Security screen:
- Agent to master security subsystem is currently off: Check the box saying "Enable Agent → Master Access Control"
- Jenkins instance uses deprecated protocols: JNLP3-connect: Uncheck the box "Java Web Start Agent Protocol/3" in "Agent Protocols"
Save the configuration and you should be all set.
Installing the MrP service
...