We're using the OpenStack developer clouds to test cluster deployments, but since it's a new product, it has its own teething issues. Here are a few tricks to make it work for HPC.

Preparing the Environment

Creating a "hopbox" Instance

...

The local machines can be called whatever you want, but make sure they match a pattern with wildcards (*). The two common ways are to have a prefix (ex. ohpc-*), a suffix (ex. *.cloud), or both. These are the names that you have to put in your /etc/hosts file.
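
As an illustration of the pattern, a minimal ~/.ssh/config sketch (the external IP and username are placeholders; ProxyJump needs OpenSSH 7.3+, older versions can use ProxyCommand with -W instead):

Host hopbox
    HostName <hopbox external IP>
    User <Linaro Username>

Host ohpc-* *.cloud
    User <Linaro Username>
    ProxyJump hopbox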

Associating an external IP

You can associate the external IP to the hopbox by clicking on "Associate Floating IP". You must choose the public IP and the "port" (which is just the local IP of the hopbox).

The UI is not very stable, so the "port" may not show up for a while. Wait a few minutes and reload the page; it may show up later, or the request may time out.

If it does time out, there is a way to do that with the client, but that requires having your Linaro password as an environment variable and it's not recommended. Contact the admins and they can help you.
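
For reference, the client command is roughly the following, after sourcing an RC file with your credentials (which is exactly the part that's discouraged):

$ openstack server add floating ip <hopbox instance> <floating IP>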

After the IP is associated, you should make sure you can SSH into the hopbox via that IP. If you have changed the config as recommended above, this should "just work (tm)":

$ ssh debian@hopbox

You'll then have to create your Linaro user using your LDAP ID:

$ sudo useradd -m -u <LDAP-ID> <Linaro Username>

And copy the SSH key (as root, since the new home directory belongs to the new user, and sshd requires the files to be owned by them):

$ sudo mkdir -m 700 /home/<Linaro Username>/.ssh
$ sudo cp ~/.ssh/authorized_keys /home/<Linaro Username>/.ssh
$ sudo chmod 600 /home/<Linaro Username>/.ssh/authorized_keys
$ sudo chown -R <Linaro Username>: /home/<Linaro Username>/.ssh

You can add any number of users, for all the people that will be able to access the lab and create masters.

If the user you are creating is an administrator, make sure to allow sudo by running "vigr" and adding them to the sudo group.
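
Equivalently, assuming the Debian default admin group is "sudo":

$ sudo usermod -aG sudo <Linaro Username>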

If all worked correctly, you should now be able to log off, log back in, and just:

$ ssh hopbox

That should log you in as your Linaro user.

TODO: Make that setup LDAP aware via admin/user groups.

Creating an Internal Network

For OpenHPC, the usual setup is to have the master with two NICs, one on the external interface and one on the internal, and all slaves (compute nodes) with a single NIC on the internal network.

The master is then set up with DHCP, TFTP and NAT, and the slaves PXE boot on the internal network, where the master will provision them correctly.

To have that setup, we need an internal network, without DHCP, to which all the slaves' NICs will be attached, as well as the second NIC on the master.

On the "Network Topology" tab, you can see the external network, your main gateway and your main network. Your hopbox will be connected to the main network.

Before you create masters and slaves, you need to create the internal network. You should have one internal network for every master, so be sure to name it accordingly.

Click on "+ Create Network" and fill in the fields. They should be:

  • Network
    • Name: Similar to your master's name, so that it's easy to match them.
    • Create Subnet: checked
  • Subnet
    • Subnet Name: can be the same name as the network, doesn't matter
    • Network address: can be anything different from your main network, ex. 172.22.16.0/24. It's good to keep it different from the other internal networks.
    • Gateway IP: the IP of the master on the internal network, usually x.x.x.1
    • Disable Gateway: checked. This should stop the network from assigning a non-.1 IP to the master's internal NIC
  • Subnet details
    • Enable DHCP: disabled
    • Everything else empty
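
For reference, a rough equivalent with the openstack client (the names and the range are placeholders; --gateway none corresponds to "Disable Gateway"):

$ openstack network create ohpc-internal-yourname
$ openstack subnet create --network ohpc-internal-yourname \
      --subnet-range 172.22.16.0/24 --no-dhcp --gateway none \
      ohpc-internal-yourname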

Creating a Master Instance

On the "Instances" tab, click on "Launch Instance" and fill in the fields. They should be:

  • Details
    • Name: a personalised name, with the SSH pattern above (ex. ohpc-master-yourname)
  • Source
    • Choose the CentOS 7 image (centos7-cloud) [1]
  • Flavour
    • You'll need at least 10GB of disk and 2 cores
  • Networks
    • Be careful, here's the trick: First move the internal network, then the main one. [2]

[1] The master instance needs to be either CentOS 7 or SLES 12. For now, we only have CentOS images, so choose that one.

[2] The order matters, because QEMU has trouble adding the interfaces. The first NIC will connect to the second network and vice-versa, and only the first network has DHCP set.

Everything else is irrelevant; just click "Launch Instance".

The instance will be created but not started. In the "Instances" list you need to click on "Start Instance".
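
For reference, a rough equivalent with a recent openstack client (the flavour and network names are placeholders; the --network order implements the trick above):

$ openstack server create \
      --image centos7-cloud --flavor <flavour> \
      --network <your internal network> --network <main network> \
      ohpc-master-yourname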

If all went according to plan, you should be able to SSH to it via the hopbox. But since the only user is still the image default (centos or debian, depending on the image), you'll need additional setup.

Repeat the user setup steps above (the same ones as for the hopbox) on this machine, and once that's done, the redirect should just work:

$ ssh ohpc-master-yourname

And you should have a master with two NICs.

You need to identify which NIC got the DHCP address (either eth0 or eth1) and set up the other one with a fixed address: the gateway IP of the network you created.
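
A minimal sketch for CentOS 7, assuming eth1 turned out to be the internal NIC and using the 172.22.16.0/24 example subnet from above:

$ cat /etc/sysconfig/network-scripts/ifcfg-eth1
# Internal NIC: fixed address, matching the subnet's gateway IP
DEVICE=eth1
BOOTPROTO=static
ONBOOT=yes
IPADDR=172.22.16.1
NETMASK=255.255.255.0

$ sudo ifup eth1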

Creating Slave Instances

Creating slaves is much simpler. You should still choose CentOS, 10+GB of disk and at least two cores, but you only need one NIC, directly on the internal network you created.
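
The client equivalent is the same sketch as for the master, just with a single --network pointing at the internal network (names are again placeholders):

$ openstack server create \
      --image centos7-cloud --flavor <flavour> \
      --network <your internal network> \
      ohpc-slave-1-yourname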

TODO:

  • Find out the MAC address of the instances before boot
  • Configure those addresses in the OpenHPC master
  • Set up slaves so they can PXE boot
  • Set up an instance without a pre-defined image, but just an empty disk