Last Updated: 2020-11-05

What you will build

In this codelab, you will deploy an auto-scaling HPC cluster on Google Cloud that comes with the Slurm job scheduler. You will customize this system to deploy compute nodes with Paraview installed, and then use this infrastructure to connect your local Paraview client to a Paraview server running on ephemeral compute nodes of the Fluid-Slurm-GCP cluster.

This setup will allow you to leverage Google Cloud Platform as a Paraview render farm for visualization and post-processing of scientific data.

What you will learn

What you will need

Set IAM Policies

In HPC, there are clear distinctions between system administrators and system users. System administrators generally have "root access" enabling them to manage and operate compute resources. System users are generally researchers, scientists, and application engineers that only need to leverage the resources to execute jobs.

On Google Cloud Platform, the OS Login API provisions POSIX user information from GSuite, Cloud Identity, and Gmail accounts. Additionally, OS Login integrates with GCP's Identity and Access Management (IAM) system to determine if users should be allowed to escalate privileges on Linux systems.
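
As a reference, OS Login can be enabled at the project level through instance metadata. The command below is optional and only needed if OS Login is not already active on your project:

gcloud compute project-info add-metadata --metadata enable-oslogin=TRUE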

In this tutorial, we assume you are filling the system administrator and compute engine administrator roles. We will configure IAM policies that give you sufficient permissions to complete the tasks in this codelab.

To give yourself the necessary IAM roles to complete this tutorial (equivalent gcloud commands are shown after this list):

  1. Navigate to IAM & Admin > IAM in the Products and Services menu.
  2. Click "+Add" near the top of the page.
  3. Type in your GSuite account, Cloud Identity account, or Gmail account under "Members".
  4. Add the following roles: Compute Admin, Compute OS Admin Login, and Service Account User.
  5. Click "Save".
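
If you prefer to work from the command line, the same role bindings can be granted with gcloud. The project ID and account below are placeholders; substitute your own values:

gcloud projects add-iam-policy-binding YOUR_PROJECT_ID --member="user:you@example.com" --role="roles/compute.admin"
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID --member="user:you@example.com" --role="roles/compute.osAdminLogin"
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID --member="user:you@example.com" --role="roles/iam.serviceAccountUser"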

Firewall Configuration

In this section, you will configure your firewall rules in Google Cloud Platform to permit a reverse SSH connection from the Paraview server to your local Paraview client. An equivalent gcloud command is provided after the list.

  1. Open your VPC Network Firewall Settings in Google Cloud.
  2. Click on "Create Firewall Rule"
  3. Set the Firewall Rule Name to "allow-pvserver-tcp"
  4. Set the Targets to "All instances in the network"
  5. For the Source IP Ranges, add your external IPv4 Address
  6. For the Ports and Protocols, check the box next to "tcp" and set the port to 11000
  7. Click Create.
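
If you would rather script this step, the same rule can be created with gcloud. The command below assumes the cluster uses the default VPC network; replace YOUR_EXTERNAL_IP with your workstation's external IPv4 address:

gcloud compute firewall-rules create allow-pvserver-tcp --network=default --allow=tcp:11000 --source-ranges=YOUR_EXTERNAL_IP/32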

Deploy Fluid-Slurm-GCP

In this section, you will deploy the Fluid-Slurm-GCP solution, an auto-scaling HPC cluster with the Slurm job scheduler and software that supports computational fluid dynamics workflows, including Paraview.

  1. Open https://console.cloud.google.com/marketplace/details/fluid-cluster-ops/cfd-gcp.
  2. Click "Launch"
  3. Give the deployment a name (e.g. paraview-demo) and select the GCP zone where you want to deploy your cluster.

  4. Leave the Controller and Login settings at their defaults.
  5. In the Partition Configuration section, set the partition name to "paraview", the Machine Type to `n1-standard-8`, and the Disk Size to 50 GB.
  6. Click "Deploy" and wait for the cluster to be created.

Configure the Paraview Partition

In this section of the codelab, you will configure the paraview partition to use the paraview-gcp image. Note that this image is provided as part of Fluid-Slurm-GCP and is licensed to you under the Fluid-Slurm-GCP EULA.

  1. Log in to your cluster controller instance using ssh
  2. Go root.
sudo su
  3. Create a cluster-configuration file using the cluster-services CLI.
cluster-services list all > config.yaml
  4. Open config.yaml in a text editor and navigate to the partitions[0].machines[0] block. Insert an image definition for the machine block that points to projects/fluid-cluster-ops/global/images/paraview-gcp. Your machine block should look similar to the example below.
  machines:
  - disable_hyperthreading: false
    disk_size_gb: 50
    disk_type: pd-standard
    external_ip: false
    gpu_count: 0
    gpu_type: nvidia-tesla-p4
    image: projects/fluid-cluster-ops/global/images/paraview-gcp
    local_ssd_mount_directory: /scratch
    machine_type: n1-standard-8
    max_node_count: 10
    n_local_ssds: 0
    name: paraview
    preemptible_bursting: false
    static_node_count: 0
    vpc_subnet: https://www.googleapis.com/compute/v1/projects/cloud-hpc-demo/regions/us-west2/subnetworks/default
    zone: us-west2-c
  5. Save the config.yaml file and exit your text editor.
  6. Use cluster-services to preview the changes to your paraview partition.
cluster-services update partitions --config=config.yaml --preview
  7. Apply the changes.
cluster-services update partitions --config=config.yaml
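
After the update is applied, a quick way to confirm that Slurm sees the reconfigured partition is to query the scheduler from the controller; node counts and states will vary with your deployment:

sinfo --partition=paraview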

Configure the Login Node

In this section of the codelab, you will configure the login node to permit reverse TCP connections back to your local workstation.

  1. Use ssh to access your cluster login node
  2. Go root
sudo su
  3. Open the /etc/ssh/sshd_config file in a text editor.
  4. Add the following line to the bottom of the /etc/ssh/sshd_config file. Save your changes and exit the text editor.
GatewayPorts yes
  5. Restart the sshd service on the login node.
systemctl restart sshd
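
To double-check that the setting took effect, you can ask sshd to print its effective configuration (run as root on the login node):

sshd -T | grep -i gatewayports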

Set Up the Paraview Job Submission Script

In this section of the codelab, you will set up a bash script that your Paraview client will use to launch Slurm batch jobs that start the Paraview server on compute nodes.

  1. Use ssh to access your cluster login node
  2. Go root
sudo su
  3. Clone the fluid-slurm-gcp_custom-image-bakery repository.
git clone https://github.com/fluidnumerics/fluid-slurm-gcp_custom-image-bakery.git
  4. Make a directory called share under /apps.
mkdir /apps/share
  5. Copy the submit-paraview.sh script to /apps/share.
cp fluid-slurm-gcp_custom-image-bakery/examples/paraview/submit-paraview.sh /apps/share/
  6. Exit from root.

exit
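
The submit-paraview.sh script you just staged is maintained in the repository cloned above; its exact contents are not reproduced here. As a rough, hypothetical sketch of the shape such a job takes (assuming pvserver's standard --reverse-connection, --client-host, and --server-port options, and using LOGIN_NODE_IP as a placeholder), a Slurm batch script for this workflow looks something like:

#!/bin/bash
#SBATCH --partition=paraview
#SBATCH --ntasks=8
# Reverse-connect pvserver to the Paraview client, which is reachable through
# the login node's forwarded port 11000 (enabled by GatewayPorts yes).
srun pvserver --reverse-connection --client-host=LOGIN_NODE_IP --server-port=11000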

Configure Slurm Accounting

In this section, we will access the cluster's login node to configure Slurm accounting so that you can submit jobs using the Slurm job scheduler.

  1. SSH into the cluster's login node
  2. Go root
sudo su
  3. Append a sample slurm_accounts block to the end of the config.yaml file.
cluster-services sample slurm_accounts >> config.yaml
  4. Edit the cluster-configuration file so that you are allowed to submit to the paraview partition. Make sure you remove the empty slurm_accounts: [] that is pre-populated in the cluster-configuration file.
    The example slurm_account configuration below will create a Slurm account called cfd with the user joe added to it. Users in this account will be allowed to submit jobs to the paraview partition.
slurm_accounts:
- allowed_partitions:
  - paraview
  name: cfd
  users:
  - joe
  5. Preview the changes for updating the slurm_accounts. Verify that you have entered the Slurm accounting information correctly.
cluster-services update slurm_accounts --config=config.yaml --preview
  6. Apply the changes.
cluster-services update slurm_accounts --config=config.yaml 
  7. Exit from root.
exit
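
Once the accounting changes are applied, you can verify them from the login node (no root required) with Slurm's accounting tools, for example:

sacctmgr show associations format=account,user,partition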

Connect Your Paraview Client

In this section, you will use Paraview on your local workstation to connect to the Paraview server, deployed on compute nodes in your cluster.

  1. On your local workstation, make a directory called paraview-pvsc/
mkdir paraview-pvsc
  2. Copy the paraview-gcp.pvsc file from your login node to paraview-pvsc/.
scp USERNAME@LOGIN-IP:fluid-slurm-gcp_custom-image-bakery/examples/paraview/paraview-gcp.pvsc ./paraview-pvsc/
  3. Start paraview from a terminal on your workstation.
paraview &
  4. Click on the "Connect to Server" icon in the toolbar. This is the third icon from the left, near the open file icon.
  5. On the dialog that appears, click on Load Servers.
  6. Navigate to the paraview-gcp.pvsc file that you've copied from the cluster and click Open.
  7. Click Connect.
  8. Fill out the form that appears, using settings that are consistent with your cluster and your firewall rule settings. Specifically, make sure that the SSH username is your username on the cluster, the Login Node IP Address is the login node's external IP address, and the server port is set to 11000 (the same port we opened in the Firewall Configuration section of this tutorial).
  9. Click OK.

From here, your Paraview client will launch an Xterm window in which a series of commands are run automatically for you. You can also use this window to monitor the status of the node configuration.
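
While the compute nodes spin up, the standard Slurm commands are a convenient way to follow along from the cluster's login node as well:

squeue -u $USER
sinfo --partition=paraview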

Once the job starts and the Paraview server is connected, you will be able to open files in your Paraview client that are located on your Fluid-Slurm-GCP cluster.

In this codelab, you created a cloud-native HPC cluster and connected your local Paraview client to a Paraview server running on auto-scaling compute nodes on Google Cloud Platform!

Further reading

Tell us your feedback