HPC Developer Cloud

We're using the OpenStack developer clouds to test cluster deployments, but since that's a new product, it has its own teething issues. Here are a few tricks to make it work for HPC.

Creating a "hopbox" Instance

Since we only have one public IP per lab, we need a machine we can SSH into and then hop to the others. These are the "hopboxes", and they can run any flavour of Linux.

A stable Debian is recommended: it rarely needs package changes or re-installs, and the hopbox has to stay stable and secure.

The hopbox should be a tiny instance (one core, 512MB RAM) running sshd and nothing else. The default Debian sshd configuration forbids password login, so that part is already solved.
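If you want to double-check, sshd can dump its effective configuration (this assumes key-based login already works, since you need a shell on the hopbox):

$ sudo sshd -T | grep -i passwordauthentication

It should print "passwordauthentication no".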

To SSH into other machines, you'll need two things:

  1. Add the machine names and their IPs to the hopbox's /etc/hosts, so that once you SSH into the hopbox, you can reach the machines by name.
  2. Add an SSH-hop configuration to your local .ssh/config, so that you can SSH directly into the machines via the hopbox.

The SSH config should look something like:

Host hopbox
    User <your-linaro-username>
    HostName <the-public-IP>

Host ohpc-* *.cloud
    User <your-linaro-username>
    # Only disable host key checking here, not for the hopbox
    StrictHostKeyChecking no
    ProxyCommand ssh hopbox nc -q0 %h %p 2>/dev/null
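With that in place, a single command from your local machine should land you straight on an internal node. For example, for a hypothetical node named ohpc-node1 that is listed in the hopbox's /etc/hosts:

$ ssh ohpc-node1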

The hopbox should be pretty stable and its SSHD host keys should never change, so if you get a warning that they changed, something is wrong. The safest response is to log in to the cloud interface and restore the instance to a known snapshot.

The internal instances, on the other hand, can change all the time, so it's simpler to disable host key checking for them.

The local machines can be named whatever you want, but make sure the names match the wildcard (*) patterns. The two common ways are a prefix (ex. ohpc-*), a suffix (ex. *.cloud), or both. These are the names you have to put in the hopbox's /etc/hosts file.
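As a sketch, the hopbox's /etc/hosts might gain entries like these (names and addresses are purely illustrative; use whatever matches your own pattern and network):

192.168.1.10    ohpc-master-yourname
192.168.1.20    node1.cloud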

Associating an External IP

You can associate the external IP with the hopbox by clicking on "Associate Floating IP". You must choose the public IP and the "port" (which is just the local IP of the hopbox).

The UI is not very stable, so the "port" may not show up right away. Wait a few minutes and reload the page. It may show up later, or the operation may time out.

If it does time out, the association can also be done with the command-line client, but that requires having your Linaro password in an environment variable, which is not recommended. Contact the admins and they can help you.
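For reference only, the client-side version is roughly the following (a sketch; it assumes your OpenStack credentials, password included, are exported as OS_* environment variables, which is exactly the part that makes it unrecommended):

$ openstack floating ip list
$ openstack server add floating ip hopbox <the-public-IP>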

After the IP is associated, you should make sure you can SSH into the hopbox via that IP. If you have changed the config as recommended above, this should "just work (tm)":

$ ssh debian@hopbox

You'll then have to create your Linaro user using your LDAP ID:

$ sudo useradd -m -u <LDAP-ID> <Linaro Username>

And copy the SSH key:

$ sudo mkdir -m 700 /home/<Linaro User>/.ssh
$ sudo cp ~/.ssh/authorized_keys /home/<Linaro User>/.ssh
$ sudo chmod 600 /home/<Linaro User>/.ssh/authorized_keys
$ sudo chown -R <Linaro User>: /home/<Linaro User>/.ssh
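From your local machine, you should now be able to log in as your own user; the "Host hopbox" entry above already fills in the user name:

$ ssh hopbox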

You can add any number of users, for all the people that will be able to access the lab and create masters.

TODO: Make that setup LDAP aware via admin/user groups.

Creating an Internal Network

For OpenHPC, the usual setup is to have the master with two NICs, one on the external network and one on the internal, and all slaves (compute nodes) with a single NIC on the internal network.

The master is then set up with DHCP, TFTP and NAT, and the slaves PXE boot on the internal network, where the master provisions them.
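For reference, the NAT half of that, on the master, boils down to a few standard commands (a sketch, assuming eth0 faces the external network and eth1 the internal one; the OpenHPC recipes may handle this differently):

$ sudo sysctl -w net.ipv4.ip_forward=1
$ sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
$ sudo iptables -A FORWARD -i eth1 -o eth0 -j ACCEPT
$ sudo iptables -A FORWARD -i eth0 -o eth1 -m state --state ESTABLISHED,RELATED -j ACCEPT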

To have that setup, we need an internal network, without DHCP, that all the slaves' NICs will be attached to, as well as the master's second NIC.

On the "Network Topology" tab, you can see the external network, your main gateway and your main network. Your hopbox will be connected to the main network.

Before you create masters and slaves, you need to create the internal network. You should have one internal network for every master, so be sure to name it accordingly.

Click on "+ Create Network" and fill in the fields (a command-line equivalent is sketched after the list). They should be:

  • Network
    • Name: Similar to your master's name, so that it's easy to match them.
    • Create Subnet: checked
  • Subnet
    • Subnet Name: can be the same name as the network, doesn't matter
    • Network address: can be anything different from your main network, ex. 172.22.16.0/24. It's good to keep it different from the other internal networks.
    • Gateway IP: the master's IP on the internal network, usually x.x.x.1
    • Disable Gateway: checked. This should stop the network from handing the master's internal NIC a non-.1 IP
  • Subnet details
    • Enable DHCP: disabled
    • Everything else empty
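If the UI refuses to cooperate, roughly the same network can be created with the OpenStack client (a sketch using the example names and range above; it assumes your credentials are already loaded):

$ openstack network create ohpc-master-yourname-net
$ openstack subnet create --network ohpc-master-yourname-net \
    --subnet-range 172.22.16.0/24 --gateway none --no-dhcp \
    ohpc-master-yourname-subnet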

Creating a Master Instance

On the "Instances" tab, click on "Launch Instance" and fill in the fields. They should be:

  • Details
    • Name: a personalised name, with the SSH pattern above (ex. ohpc-master-yourname)
  • Source
    • Choose the CentOS 7 image (centos7-cloud) [1]
  • Flavour
    • You'll need at least 10GB of disk and 2 cores
  • Networks
    • Be careful, here's the trick: First move the internal network, then the main one. [2]

[1] The master instance needs to be either CentOS 7 or SLES 12. For now, we only have CentOS images, so choose that one.

[2] The order matters because QEMU has trouble adding the interfaces: the first NIC ends up connected to the second network and vice-versa, and only the main network has DHCP. Listing the internal network first therefore puts the instance's first NIC on the main (DHCP) network.

Everything else can be left at its defaults; just click "Launch Instance".
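The CLI equivalent looks roughly like this (illustrative names; the repeated --network options are passed in the same order as in the UI, internal network first):

$ openstack server create --image centos7-cloud \
    --flavor <a-flavour-with-2-cores-and-10GB-disk> \
    --network ohpc-master-yourname-net \
    --network <your-main-network> \
    ohpc-master-yourname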

Creating Slave Instances
