Last Updated: 2020-03-27

In this codelab, we assume that you have already completed the "Create a HPC Cluster on Google Cloud Platform" codelab and that you have an existing fluid-slurm-gcp cluster.

Use case for multi-zonal HPC cluster configurations on Google Cloud Platform

When designing cloud compute solutions for HPC, it is important to consider fault tolerance in addition to cost and performance. Doing so requires an understanding of your cloud provider's infrastructure and of the tools available for distributing resources so that the impact of potential service failures is minimized.

In this introductory section of the codelab, we will cover the basics of GCP regions and zones. We will then use this information to design a compute partition configuration that is less susceptible to service failures in a single GCP zone. In the following sections, we will walk you through implementing this high availability design with fluid-slurm-gcp.

Regions and Zones

Google Cloud Platform (GCP) offers on-demand access to compute, storage, and networking resources worldwide. A region is a specific geographical location where you can host your resources. Each region has one or more zones that correspond to distinct data-centers in that geographical location. As an example, europe-west1-a and europe-west1-b are two different zones in the europe-west1 region. Geographically, europe-west1 is located in St. Ghislain, Belgium.
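
If you have the Cloud SDK (gcloud) installed and authenticated, you can list the zones a region offers before deciding where to place resources. For example, the following lists the zones in us-west1 (which, at the time of writing, are us-west1-a, us-west1-b, and us-west1-c, the three zones used later in this codelab):
$ gcloud compute zones list --filter="region:us-west1"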

From Google Cloud's Compute Engine documentation:

"Putting resources in different zones in a region provides isolation from most types of physical infrastructure and infrastructure software service failures. Putting resources in different regions provides an even higher degree of failure independence. This allows you to design robust systems with resources spread across different failure domains."

Infrastructure

Fluid-slurm-gcp provides a schema that you can use to describe what machines you want to use on GCP, which regions and zones to deploy them in, and which Slurm partitions to align them with. This allows for a single high availability compute partition with identical machines spread across multiple zones.

This schematic provides an example of a fluid-slurm-gcp deployment in the us-west1 region (The Dalles, Oregon, USA). The login and controller instances reside in a single zone of us-west1. The compute partition (called "high-availability") has three sets of machines: compute-a-*, compute-b-*, and compute-c-*. Each machine set is deployed in its own zone within us-west1. In this configuration, users can submit jobs to the high-availability partition and the Slurm job scheduler will schedule jobs to run in any of these zones.

What you will build

In this codelab, you are going to configure a high-availability (multi-zone) compute partition on an existing fluid-slurm-gcp HPC cluster on Google Cloud Platform.

What you will learn

  * How to use cluster-services to modify compute partitions on an existing fluid-slurm-gcp cluster
  * How to configure a single Slurm partition with machine sets in multiple GCP zones
  * How to verify that Slurm schedules jobs across all of the configured zones

What you will need

  * A Google Cloud Platform project
  * An existing fluid-slurm-gcp cluster, such as the one created in the "Create a HPC Cluster on Google Cloud Platform" codelab

Fluid-slurm-gcp comes with a command-line tool called cluster-services that is used to manage available compute nodes, compute partitions, Slurm user accounting, and network-attached storage. You can update your cluster configuration by providing cluster-services a .yaml file that defines a valid cluster configuration. By default, cluster-services looks for a cluster configuration file in /apps/cls/etc/cluster-config.yaml. Alternatively, you can use cluster-services to report your current cluster configuration, which you can then modify.
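
For example, assuming your cluster uses the default configuration path, you can view that file directly on the controller (as root), or ask cluster-services to report the live configuration to the terminal:
[root]# cat /apps/cls/etc/cluster-config.yaml   # default configuration file location
[root]# cluster-services list all               # report the current cluster configuration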

Modifying compute partitions must be done on the controller instance of your cluster with root privileges. This is required because the Slurm controller daemon must be restarted for the changes to take effect.
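
The next steps use the in-browser SSH button, but if you prefer working from a terminal with the Cloud SDK, something like the following also gets you a root shell on the controller. The instance name and zone are placeholders; substitute the values from your own deployment:
$ gcloud compute ssh <your-controller-instance> --zone=<your-controller-zone>
$ sudo su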

In this section, you will use cluster-services to create a valid cluster configuration yaml that you will modify in the next section.

  1. Navigate to Compute Engine > VM Instances in the Products and Services menu.

  1. Click "SSH" to the right of your controller instance and wait for the terminal on the controller to become active.

  3. Generate a cluster-configuration file that describes your current cluster configuration.
$ sudo su
[root]# cluster-services list all > config.yaml
  4. Open the config.yaml file in nano, emacs, or vim. The contents of this configuration file are described in detail in Fluid Numerics' cluster-config schema documentation. We will focus on the partitions configuration, which begins on line 21.
  1 compute_image: projects/fluid-cluster-ops/global/images/fluid-slurm-gcp-compute-centos-v2-3-0
  2 compute_service_account: default
  3 controller:
  4   project: fluid-slurm-gcp-codelabs
  5   region: us-west1
  6   vpc_subnet: https://www.googleapis.com/compute/v1/projects/fluid-slurm-gcp-codelabs/regions/us-west1/subnetworks/default
  7   zone: us-west1-b
  8 controller_image: projects/fluid-cluster-ops/global/images/fluid-slurm-gcp-controller-centos-v2-3-0
  9 controller_service_account: default
 10 default_partition: partition-1
 11 login:
 12 - project: fluid-slurm-gcp-codelabs
 13   region: us-west1
 14   vpc_subnet: https://www.googleapis.com/compute/v1/projects/fluid-slurm-gcp-codelabs/regions/us-west1/subnetworks/default
 15   zone: us-west1-b
 16 login_image: projects/fluid-cluster-ops/global/images/fluid-slurm-gcp-login-centos-v2-3-0
 17 login_service_account: default
 18 mounts: []
 19 munge_key: ''
 20 name: fluid-slurm-gcp-1
 21 partitions:
 22 - labels:
 23     goog-dm: fluid-slurm-gcp-1
 24   machines:
 25   - disable_hyperthreading: false
 26     disk_size_gb: 15
 27     disk_type: pd-standard
 28     external_ip: false
 29     gpu_count: 0
 30     gpu_type: nvidia-tesla-v100
 31     image: projects/fluid-cluster-ops/global/images/fluid-slurm-gcp-compute-centos-v2-3-0
 32     local_ssd_mount_directory: /scratch
 33     machine_type: n1-standard-2
 34     max_node_count: 10
 35     n_local_ssds: 0
 36     name: partition-1
 37     preemptible_bursting: false
 38     static_node_count: 0
 39     vpc_subnet: https://www.googleapis.com/compute/v1/projects/fluid-slurm-gcp-codelabs/regions/us-west1/subnetworks/default
 40     zone: us-west1-b
 41   max_time: INFINITE
 42   name: partition-1
 43   project: fluid-slurm-gcp-codelabs
 44 slurm_accounts: []
 45 slurm_db_host: {}
 46 suspend_time: 300
 47 tags:
 48 - default

The partitions definition in this cluster-configuration file spans lines 21-43. The partitions attribute is a list of objects; each partitions object has the attributes labels, machines, max_time, name, and project.

The partitions.machines attribute is also a list of objects. Each object in the partitions.machines list defines a set of machines that you want to place in the Slurm partition defined by the parent partitions object. Take some time to review the partitions object schema before moving on to the next section.
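
One quick way to review the partitions block is to print just those lines of the generated file. The line range below assumes your config.yaml matches the listing above:
[root]# sed -n '21,43p' config.yaml   # print the partitions block (lines 21-43)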

At this point you have a configuration file, config.yaml, in your home directory. Next, you will add two more machine sets to the first partitions object that define identical machine types in different zones.

By the end of this section, you will have a multi-zone compute partition. This gives you access to compute resources across multiple Google data-centers in the same GCP region, resulting in a high availability configuration.
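
Before editing, consider keeping a copy of the generated file so you can compare against it or roll back if needed:
[root]# cp config.yaml config.yaml.orig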

Rename the existing partition and machine set

  1. Open config.yaml (generated in the previous section) in your favorite text editor.
  2. Change the partition name on line 42 to high-availability
 42   name: high-availability
  3. Change the name of the machine set in this partition, on line 36, to compute-w1-b
 36     name: compute-w1-b
  4. Change default_partition, on line 10, to high-availability so that the cluster's default partition matches the renamed partition.
 10 default_partition: high-availability
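
If you prefer to make these substitutions from the command line, the sed commands below are one way to do it. The line numbers assume your config.yaml matches the listing in the previous section exactly, so double-check them before running:
[root]# sed -i '42s/partition-1/high-availability/' config.yaml   # partition name
[root]# sed -i '36s/partition-1/compute-w1-b/' config.yaml        # machine set name
[root]# sed -i '10s/partition-1/high-availability/' config.yaml   # default partition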

Add machine sets in two additional zones

  1. Copy the machines block from the first partition's specification (lines 25-40) and paste it below line 40.
  2. Modify the second machines block: change the zone to us-west1-c, the machine set name to compute-w1-c, and (as in the listing below) max_node_count to 2.
  3. Repeat steps 1-2 to create a third machines block, this time in us-west1-a and named compute-w1-a. After completing this step, your machines blocks should look like the following:
 25   - disable_hyperthreading: false
 26     disk_size_gb: 15
 27     disk_type: pd-standard
 28     external_ip: false
 29     gpu_count: 0
 30     gpu_type: nvidia-tesla-v100
 31     image: projects/fluid-cluster-ops/global/images/fluid-slurm-gcp-compute-centos-v2-3-0
 32     local_ssd_mount_directory: /scratch
 33     machine_type: n1-standard-2
 34     max_node_count: 10
 35     n_local_ssds: 0
 36     name: compute-w1-b
 37     preemptible_bursting: false
 38     static_node_count: 0
 39     vpc_subnet: https://www.googleapis.com/compute/v1/projects/fluid-slurm-gcp-codelabs/regions/us-west1/subnetworks/default
 40     zone: us-west1-b
 41   - disable_hyperthreading: false
 42     disk_size_gb: 15
 43     disk_type: pd-standard
 44     external_ip: false
 45     gpu_count: 0
 46     gpu_type: nvidia-tesla-v100
 47     image: projects/fluid-cluster-ops/global/images/fluid-slurm-gcp-compute-centos-v2-3-0
 48     local_ssd_mount_directory: /scratch
 49     machine_type: n1-standard-2
 50     max_node_count: 2
 51     n_local_ssds: 0
 52     name: compute-w1-c
 53     preemptible_bursting: false
 54     static_node_count: 0
 55     vpc_subnet: https://www.googleapis.com/compute/v1/projects/fluid-slurm-gcp-codelabs/regions/us-west1/subnetworks/default
 56     zone: us-west1-c
 57   - disable_hyperthreading: false
 58     disk_size_gb: 15
 59     disk_type: pd-standard
 60     external_ip: false
 61     gpu_count: 0
 62     gpu_type: nvidia-tesla-v100
 63     image: projects/fluid-cluster-ops/global/images/fluid-slurm-gcp-compute-centos-v2-3-0
 64     local_ssd_mount_directory: /scratch
 65     machine_type: n1-standard-2
 66     max_node_count: 2
 67     n_local_ssds: 0
 68     name: compute-w1-a
 69     preemptible_bursting: false
 70     static_node_count: 0
 71     vpc_subnet: https://www.googleapis.com/compute/v1/projects/fluid-slurm-gcp-codelabs/regions/us-west1/subnetworks/default
 72     zone: us-west1-a
  4. Save config.yaml and return to the terminal. An optional sanity check of the edited file is sketched below.
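
Before applying the changes, you can confirm that the edited file is still valid YAML. The one-liner below assumes python3 and the PyYAML module are available on the controller image; it only checks YAML syntax, not the cluster-config schema itself:
[root]# python3 -c "import yaml; yaml.safe_load(open('config.yaml')); print('config.yaml parses cleanly')"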

Update your partitions

  1. Use cluster-services to update your cluster configuration:
[root]# cluster-services update partitions --config=config.yaml --preview
 ~ default_partition = partition-1 -> high-availability
 ~ partitions[0].machines[0].name = partition-1 -> compute-w1-b
 + partitions[0].machines[1] = {'disable_hyperthreading': False, 'disk_size_gb': 15, 'disk_type': 'pd-standard', 'external_ip': False, 'gpu_count': 0, 'gpu_type': 'nvidia-tesla-v100', 'image': 'projects/fluid-cluster-ops/global/images/fluid-slurm-gcp-compute-centos-v2-3-0', 'local_ssd_mount_directory': '/scratch', 'machine_type': 'n1-standard-2', 'max_node_count': 2, 'n_local_ssds': 0, 'name': 'compute-w1-c', 'preemptible_bursting': False, 'static_node_count': 0, 'vpc_subnet': 'https://www.googleapis.com/compute/v1/projects/fluid-slurm-gcp-codelabs/regions/us-west1/subnetworks/default', 'zone': 'us-west1-c'}
 + partitions[0].machines[2] = {'disable_hyperthreading': False, 'disk_size_gb': 15, 'disk_type': 'pd-standard', 'external_ip': False, 'gpu_count': 0, 'gpu_type': 'nvidia-tesla-v100', 'image': 'projects/fluid-cluster-ops/global/images/fluid-slurm-gcp-compute-centos-v2-3-0', 'local_ssd_mount_directory': '/scratch', 'machine_type': 'n1-standard-2', 'max_node_count': 2, 'n_local_ssds': 0, 'name': 'compute-w1-a', 'preemptible_bursting': False, 'static_node_count': 0, 'vpc_subnet': 'https://www.googleapis.com/compute/v1/projects/fluid-slurm-gcp-codelabs/regions/us-west1/subnetworks/default', 'zone': 'us-west1-a'}
 ~ partitions[0].name = partition-1 -> high-availability
[root]# cluster-services update partitions --config=config.yaml
  2. Verify that the partition name is now set to high-availability and that all three sets of compute instances are available in Slurm. A closer look with scontrol is sketched after this step.
[root]# sinfo
PARTITION         AVAIL  TIMELIMIT  NODES  STATE NODELIST
high-availability    up   infinite      6  idle~ compute-w1-a-[0-1],compute-w1-b-[0-1],compute-w1-c-[0-1]
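
For more detail than sinfo provides, you can also inspect the partition definition and an individual node with scontrol (output omitted here; it will reflect your configuration):
[root]# scontrol show partition high-availability
[root]# scontrol show node compute-w1-b-0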

In the last section, you made machines in three us-west1 zones available through a single Slurm partition. You will now submit a test job that demonstrates all three zones are used in this configuration.

  1. Navigate back to your login node's terminal.
  2. Use srun to submit a job step across all 6 nodes in the high-availability partition.
$ srun -N6 --partition=high-availability hostname
compute-w1-b-1
compute-w1-c-0
compute-w1-a-0
compute-w1-c-1
compute-w1-b-0
compute-w1-a-1

It can take a minute or two for the nodes to respond with their hostnames. If you monitor the Compute Engine UI, you will see the compute nodes come online across all three zones in us-west1.
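
You can also watch the nodes appear from the command line. Assuming your compute node names start with compute-w1, as in this codelab, the following lists the matching instances and shows which zone each one was created in (run it from Cloud Shell or any machine with the Cloud SDK and access to the project):
$ gcloud compute instances list --filter="name~'^compute-w1'"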

Congratulations! You have just created and tested a high availability compute partition on Google Cloud Platform.

In this codelab, you configured a multi-zone (high availability) compute partition on an existing fluid-slurm-gcp cluster and verified that Slurm schedules jobs across all three zones of us-west1.

What's next?

Learn how to configure a globally scalable compute partition (multi-region)

Submit your feedback and request new codelabs using our feedback form

Further reading

Learn how to configure OS-Login to ssh to your cluster with 3rd party ssh tools

Learn how to manage POSIX user information with the directory API

Reference docs

https://help.fluidnumerics.com/slurm-gcp