Last Updated: 2020-11-05
In this codelab, you are going to deploy an auto-scaling HPC cluster on Google Cloud that comes with the Slurm job scheduler. You will customize this system to deploy compute nodes with OpenFOAM® installed and then use this infrastructure to simulate compressible flow past a NACA0012 aerofoil.
In HPC, there are clear distinctions between system administrators and system users. System administrators generally have "root access" enabling them to manage and operate compute resources. System users are generally researchers, scientists, and application engineers that only need to leverage the resources to execute jobs.
On Google Cloud Platform, the OS Login API provisions POSIX user information from GSuite, Cloud Identity, and Gmail accounts. Additionally, OS Login integrates with GCP's Identity and Access Management (IAM) system to determine if users should be allowed to escalate privileges on Linux systems.
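If you want to confirm what OS Login has provisioned for your account, one quick check from Cloud Shell or any machine with the Cloud SDK is to describe your OS Login profile; this is a general gcloud feature rather than a required step in this codelab.
# Show the POSIX account information (username, UID, GID, home directory) that OS Login manages for you
gcloud compute os-login describe-profile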
In this tutorial, we assume you are filling the system administrator and compute engine administrator roles. We will configure IAM policies to give you sufficient permissions to accomplish the tasks in this codelab.
Before proceeding, give yourself the necessary IAM roles to complete this tutorial.
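As a sketch of one way to grant roles from the command line, the gcloud commands below bind OS Login administrator and Compute Instance Admin roles to your account; the specific roles shown are assumptions based on the administrator tasks described above, so substitute your project ID, email address, and whichever roles your setup actually requires.
# Hypothetical example: grant administrator roles to your account (replace PROJECT_ID and USER_EMAIL)
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="user:USER_EMAIL" \
    --role="roles/compute.osAdminLogin"
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="user:USER_EMAIL" \
    --role="roles/compute.instanceAdmin.v1"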
In this section, you will deploy the Fluid-Slurm-GCP solution, an auto-scaling HPC cluster with the Slurm job scheduler and software that supports computational fluid dynamics workflows, including ParaView.
In this section of the codelab, you will configure the openfoam partition to use the openfoam-gcp image. Note that this image is provided as part of Fluid-Slurm-GCP and is licensed to you under the Fluid-Slurm-GCP EULA.
Use cluster-services to create a cluster-configuration file.
cluster-services list all > config.yaml
Open config.yaml in a text editor and navigate to the partitions.machines block. Insert an image definition for the machine block that points to projects/fluid-cluster-ops/global/images/openfoam-gcp. Your machine block should look similar to the example block below.
machines:
- disable_hyperthreading: false
  disk_size_gb: 50
  disk_type: pd-standard
  external_ip: false
  gpu_count: 0
  gpu_type: nvidia-tesla-p4
  image: projects/fluid-cluster-ops/global/images/openfoam-gcp
  local_ssd_mount_directory: /scratch
  machine_type: n1-standard-8
  max_node_count: 10
  n_local_ssds: 0
  name: openfoam
  preemptible_bursting: false
  static_node_count: 0
  vpc_subnet: https://www.googleapis.com/compute/v1/projects/cloud-hpc-demo/regions/us-east4/subnetworks/default
  zone: us-east4-a
Use cluster-services to preview the changes to your openfoam partition.
cluster-services update partitions --config=config.yaml --preview
Apply the changes to your cluster.
cluster-services update partitions --config=config.yaml
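If you want to confirm that the partition update took effect, one simple check (assuming you are on a node where Slurm client commands are available, such as the cluster's login node) is to list the partitions that Slurm knows about.
# List Slurm partitions; the openfoam partition should appear once the update completes
sinfo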
In this section, you will access the cluster's login node and configure Slurm accounting so that you can submit jobs using the Slurm job scheduler.
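If you are not already connected to the login node, one way to reach it is with gcloud compute ssh; the instance name and zone below are placeholders, so substitute the login node name and zone from your own deployment.
# Hypothetical example: connect to the cluster's login node (replace the instance name and zone)
gcloud compute ssh fluid-slurm-gcp-login-0 --zone=us-east4-a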
Use cluster-services to append a sample slurm_accounts block to your cluster-configuration file.
cluster-services sample slurm_accounts >> config.yaml
Open config.yaml in a text editor and edit the slurm_accounts block so that your user can submit jobs to the openfoam partition. Make sure you remove the empty slurm_accounts: entry that is pre-populated in the cluster-configuration file. The example slurm_accounts configuration below will create a Slurm account called cfd with the user joe added to it. Users in this account will be allowed to submit jobs to the meshing, openfoam, and paraview partitions.
slurm_accounts:
- allowed_partitions:
  - meshing
  - openfoam
  - paraview
  name: cfd
  users:
  - joe
Use cluster-services to preview the changes to the slurm_accounts. Verify that you have entered the Slurm accounting information correctly.
cluster-services update slurm_accounts --config=config.yaml --preview
Apply the changes.
cluster-services update slurm_accounts --config=config.yaml
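To double-check the accounting changes, you can list the account, user, and partition associations that Slurm now knows about; this uses the standard sacctmgr tool on the login node and is an optional verification step rather than part of the codelab's required flow.
# Show which users and partitions are associated with each Slurm account
sacctmgr show associations format=Account,User,Partition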
In this section, you will submit a Slurm batch job to run the NACA0012 tutorial included with OpenFOAM®. To help you with this, the Fluid-Slurm-GCP solution comes with an example Slurm batch script (/apps/share/openfoam.slurm). This example batch script can also be used as a starting point for other OpenFOAM® jobs on the cluster.
Clone the examples repository and copy the example batch script into your current directory.
git clone https://github.com/fluidnumerics/fluid-slurm-gcp_custom-image-bakery.git
cp fluid-slurm-gcp_custom-image-bakery/examples/openfoam/openfoam.slurm ./
Open openfoam.slurm in a text editor and set the --account parameter to the Slurm account name that you set in the previous section of this codelab. Save the file when you are done and exit the text editor.
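For orientation, the relevant directives near the top of the batch script might look like the sketch below; the exact contents of openfoam.slurm may differ, and the partition and task count shown here are assumptions based on the openfoam partition configured earlier.
#!/bin/bash
# Hypothetical excerpt of openfoam.slurm; the real script's directives may differ
#SBATCH --account=cfd          # Slurm account created in the previous section
#SBATCH --partition=openfoam   # Partition that uses the openfoam-gcp image
#SBATCH --ntasks=8             # MPI ranks (assumption; match your machine_type)
Once the account is set, submit the job from the login node with sbatch openfoam.slurm and monitor its progress with squeue.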
When the job completes, you will have the aerofoilNACA0012 OpenFOAM® simulation case directory in your home directory.
ls aerofoilNACA0012/
0     1050  1200  1350  150  300  450  550  700  850  Allclean  dynamicCode      log.transformPoints  processor1  processor4  processor7
100   1100  1250  1400  200  350  50   600  750  900  Allrun    log.blockMesh    postProcessing       processor2  processor5  system
1000  1150  1300  1410  250  400  500  650  800  950  constant  log.extrudeMesh  processor0           processor3  processor6
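If you want to explore the results in ParaView, a common OpenFOAM convention (not a step specific to this codelab) is to create an empty .foam marker file in the case directory and open that file with ParaView's built-in OpenFOAM reader.
# Create an empty marker file so ParaView's OpenFOAM reader can open the case
touch aerofoilNACA0012/case.foam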
In this codelab, you created an auto-scaling, cloud-native HPC cluster and ran a parallel OpenFOAM® simulation on Google Cloud Platform!