Welcome to the Google Codelab for running a Lustre parallel file system cluster on Google Cloud Platform!
Data is core to the practice of High Performance Computing, and accessing large amounts of data at extremely high speeds and low latencies has always been a key challenge in running HPC workloads. This requirement for high performance storage has not changed in the cloud, and in fact the ability to utilize vast amounts of storage quickly and easily has become paramount.
HPC centers have long met this need on-premises using technologies like the Lustre parallel file system. Lustre is one of the most popular open source high performance storage solutions today, and since June 2005 it has consistently been used by at least half of the top ten, and more than 60 of the top 100, fastest supercomputers in the world. Lustre can scale to hundreds of PB of capacity and deliver the highest possible performance for HPC jobs, with systems delivering TB/s of throughput in a single namespace.
In order to serve the demand for storage, Google Cloud has taken two approaches. First, GCP partnered with DDN to bring their supported, enterprise-class DDN EXAScaler Lustre software to the GCP Marketplace. Second, our engineers at Google Cloud have developed and open-sourced a set of scripts to easily configure and deploy a Lustre storage cluster on Google Compute Engine using the Google Cloud Deployment Manager.
Lustre on Google Cloud Platform is equally capable of delivering the maximum performance of the infrastructure it's running on. Its performance on GCP is so good that it placed 8th on the IO-500 storage system benchmark in 2019 with our partner DDN, representing the highest-ranking cloud-based file system on the IO-500. Today we will walk you through deploying the open source Deployment Manager scripts for Lustre. If you are interested in an enterprise, hardened Lustre experience, with Lustre-expert support for your Lustre cluster, as well as features like a management and monitoring GUI or Lustre tunings, we recommend investigating the DDN EXAScaler Marketplace offering.
What you'll learn
- How to use the GCP Deployment Manager Service
- How to configure and deploy a Lustre file system on GCP.
- How to configure striping and test simple I/O to the Lustre file system.
What you'll need
- Google Cloud Platform Account and a Project with Billing
- Basic Linux Experience
Self-paced environment setup
Create a Project
Click Create Project.
Enter a project name. Remember the project ID. The project ID must be a unique name across all Google Cloud projects. If your project name is not unique, Google Cloud will generate a random project ID based on the project name.
Next, you'll need to enable billing in the Developers Console in order to use Google Cloud resources.
Running through this codelab shouldn't cost you more than a few dollars, but it could be more if you decide to use more resources or if you leave them running (see "Conclusion" section at the end of this document). The Google Cloud Platform pricing calculator is available here.
New users of Google Cloud Platform are eligible for a $300 free trial.
Google Cloud Shell
While Google Cloud can be operated remotely from your laptop, in this codelab we will be using Google Cloud Shell, a command line environment running in the Cloud.
Launch Google Cloud Shell
From the GCP Console click the Cloud Shell icon on the top right toolbar:
Then click Start Cloud Shell:
It should only take a few moments to provision and connect to the environment:
This virtual machine is loaded with all the development tools you'll need. It offers a persistent 5GB home directory, and runs on the Google Cloud, greatly enhancing network performance and simplifying authentication. Much, if not all, of your work in this lab can be done with simply a web browser or a Google Chromebook.
Once connected to the cloud shell, you should see that you are already authenticated and that the project is already set to your PROJECT_ID:
$ gcloud auth list
Credentialed accounts:
 - <myaccount>@<mydomain>.com (active)
$ gcloud config list project
[core]
project = <PROJECT_ID>
If the project ID is not set correctly you can set it with this command:
$ gcloud config set project <PROJECT_ID>
Updated property [core/project].
Download the Lustre Deployment Manager Scripts
In the Cloud Shell session, execute the following command to clone (download) the Git repository that contains the Lustre for Google Cloud Platform deployment-manager files:
git clone https://github.com/GoogleCloudPlatform/deploymentmanager-samples.git
Switch to the Lustre deployment configuration directory by executing the following command:
Configure Lustre Deployment YAML
Deployment Manager uses a YAML file to provide deployment configuration. This YAML file details the configuration of the deployment, such as the Lustre version and the machine types to deploy. By default, the file is configured to deploy in a new project without any quota increases; however, you may change the machine type or capacity as desired for this codelab. This codelab is written to use these defaults, so if you do make any changes you must carry those changes throughout this codelab to avoid errors. In production, we recommend at least a 32-vCPU instance for the MDS node, and at least an 8- or 16-vCPU instance for the OSS nodes, depending on storage capacity and type.
To review or edit the YAML file in the Cloud Shell session, open the deployment configuration YAML file lustre.yaml (the file passed to --config when deploying). You can either use your preferred command line editor (vi, nano, emacs, etc.) or use the Cloud Console Code Editor to view the file contents:
The contents of the file will look like this:
# [START cluster_yaml]
imports:
- path: lustre.jinja

resources:
- name: lustre
  type: lustre.jinja
  properties:
    ## Cluster Configuration
    cluster_name            : lustre
    zone                    : us-central1-f
    cidr                    : 10.20.0.0/16
    external_ips            : True
    ### Use these fields to deploy Lustre in an existing VPC, Subnet, and/or Shared VPC
    #vpc_net                : < VPC Network Name >
    #vpc_subnet             : < VPC Subnet Name >
    #shared_vpc_host_proj   : < Shared VPC Host Project name >

    ## Filesystem Configuration
    fs_name                 : lustre
    ### Review https://downloads.whamcloud.com/public/ to determine version naming
    lustre_version          : latest-release
    e2fs_version            : latest

    ## Lustre MDS/MGS Node Configuration
    #mds_node_count         : 1
    mds_ip_address          : 10.20.0.2
    mds_machine_type        : n1-standard-8
    ### MDS/MGS Boot disk
    mds_boot_disk_type      : pd-standard
    mds_boot_disk_size_gb   : 10
    ### Lustre MetaData Target disk
    mdt_disk_type           : pd-ssd
    mdt_disk_size_gb        : 1000

    ## Lustre OSS Configuration
    oss_node_count          : 4
    oss_ip_range_start      : 10.20.0.5
    oss_machine_type        : n1-standard-4
    ### OSS Boot disk
    oss_boot_disk_type      : pd-standard
    oss_boot_disk_size_gb   : 10
    ### Lustre Object Storage Target disk
    ost_disk_type           : pd-standard
    ost_disk_size_gb        : 5000
# [END cluster_yaml]
Within this YAML file there are several fields. Fields marked below with an asterisk (*) are required. These fields include:
- cluster_name* - Name of the Lustre cluster, prepends all deployed resources
- zone* - Zone to deploy the cluster into
- cidr* - IP range in CIDR format
- external_ips* - True/False, whether Lustre nodes have external IP addresses. If False, a Cloud NAT is set up as a NAT gateway
- vpc_net - Define this field, and the vpc_subnet field, to deploy the Lustre cluster to an existing VPC
- vpc_subnet - Existing VPC subnet to deploy Lustre cluster to
- shared_vpc_host_proj - Define this field, as well as the vpc_net and vpc_subnet fields, to deploy the cluster to a Shared VPC
File system Configuration
- fs_name - Lustre file system name
- lustre_version - Lustre version to deploy, use "latest-release" to deploy the latest branch from https://downloads.whamcloud.com/public/lustre/ or lustre-X.X.X to deploy any other versions
- e2fs_version - E2fsprogs version to deploy, use "latest" to deploy the latest branch from https://downloads.whamcloud.com/public/e2fsprogs/ or X.XX.X.wcX to deploy any other versions
- mds_ip_address - Internal IP Address to specify for MDS/MGS node
- mds_machine_type - Machine type to use for MDS/MGS node (see https://cloud.google.com/compute/docs/machine-types)
- mds_boot_disk_type - Disk type to use for the MDS/MGS boot disk (pd-standard, pd-ssd)
- mds_boot_disk_size_gb - Size of MDS boot disk in GB
- mdt_disk_type* - Disk type to use for the Metadata Target (MDT) disk (pd-standard, pd-ssd, local-ssd)
- mdt_disk_size_gb* - Size of MDT disk in GB
- oss_node_count* - Number of Object Storage Server (OSS) nodes to create
- oss_ip_range_start - Start of the IP range for the OSS node(s). If not specified, use automatic IP assignment
- oss_machine_type - Machine type to use for OSS node(s)
- oss_boot_disk_type - Disk type to use for the OSS boot disk (pd-standard, pd-ssd)
- oss_boot_disk_size_gb - Size of OSS boot disk in GB
- ost_disk_type* - Disk type to use for the Object Storage Target (OST) disk (pd-standard, pd-ssd, local-ssd)
- ost_disk_size_gb* - Size of OST disk in GB
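As a quick sanity check before deploying, the fields above can be turned into an estimate of raw OST capacity. The following is a minimal sketch, not part of the deployment scripts: `yaml_value` is a hypothetical helper that only understands flat `key : value` lines, and the calculation assumes one OST disk per OSS node, which matches the default deployment shown in this codelab.

```shell
#!/usr/bin/env bash
# yaml_value (hypothetical helper): print the value of a flat "key : value" line.
# $1 = key, $2 = file. Commented-out lines are skipped.
yaml_value() {
  awk -F':' -v key="$1" '
    /^[ \t]*#/ { next }
    {
      k = $1; v = $2
      gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim whitespace around the key
      gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim whitespace around the value
      if (k == key) { print v; exit }
    }' "$2"
}

# Stand-in for lustre.yaml, using the default values from this codelab.
config="$(mktemp)"
cat > "$config" <<'EOF'
    oss_node_count          : 4
    ost_disk_type           : pd-standard
    ost_disk_size_gb        : 5000
EOF

oss_count="$(yaml_value oss_node_count "$config")"
ost_size_gb="$(yaml_value ost_disk_size_gb "$config")"

# Assumption: one OST disk per OSS node.
total_gb=$(( oss_count * ost_size_gb ))
echo "Raw OST capacity: ${total_gb} GB across ${oss_count} OSS nodes"

rm -f "$config"
```

In Cloud Shell, you could point `yaml_value` at the real configuration file instead of the temporary stand-in.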
Deploy the Configuration
In the Cloud Shell session, execute the following command from the Lustre deployment configuration directory:
gcloud deployment-manager deployments create lustre --config lustre.yaml
This command creates a deployment named lustre. The operation can take a few minutes to complete, so please be patient.
Once the deployment has completed you will see output similar to:
Create operation operation-1572410719018-5961966591cad-e25384f6-d4c905f8 completed successfully.
NAME                               TYPE                   STATE      ERRORS  INTENT
lustre-all-internal-firewall-rule  compute.v1.firewall    COMPLETED
lustre-lustre-network              compute.v1.network     COMPLETED
lustre-lustre-subnet               compute.v1.subnetwork  COMPLETED
lustre-mds1                        compute.v1.instance    COMPLETED
lustre-oss1                        compute.v1.instance    COMPLETED
lustre-oss2                        compute.v1.instance    COMPLETED
lustre-oss3                        compute.v1.instance    COMPLETED
lustre-oss4                        compute.v1.instance    COMPLETED
lustre-ssh-firewall-rule           compute.v1.firewall    COMPLETED
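If you would rather check the result from a script than by eye, a small filter over the table above is enough. This is a sketch under assumptions: `count_incomplete` is a hypothetical helper, and it assumes the NAME / TYPE / STATE column order shown in the sample output.

```shell
#!/usr/bin/env bash
# count_incomplete (hypothetical helper): read a NAME/TYPE/STATE table on stdin and
# print how many resources are not in the COMPLETED state.
# Rows are recognized by a dotted TYPE column such as compute.v1.instance, so the
# header and the "Create operation ..." line are ignored.
count_incomplete() {
  awk '$2 ~ /\./ && $3 != "COMPLETED" { n++ } END { print n + 0 }'
}

# Sample rows copied from the output above. In Cloud Shell you could instead pipe in:
#   gcloud deployment-manager resources list --deployment lustre
count_incomplete <<'EOF'
NAME                               TYPE                   STATE      ERRORS  INTENT
lustre-mds1                        compute.v1.instance    COMPLETED
lustre-oss1                        compute.v1.instance    COMPLETED
lustre-ssh-firewall-rule           compute.v1.firewall    COMPLETED
EOF
```

A result of 0 means every listed resource reached the COMPLETED state.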
Verify the Deployment
Follow these steps to view the deployment in Google Cloud Platform Console:
- In the Cloud Platform Console, open the Products & Services menu in the top left corner of the console (three horizontal lines).
- Click Deployment Manager.
- Click lustre (the deployment name) to view the details of the deployment.
- Click Overview - Lustre. The Deployment properties pane displays the overall deployment configuration.
- Click "View" on the Config property. The Config pane displays the contents of the deployment configuration YAML file modified earlier. Verify the contents are correct before proceeding. If you need to change a deployment configuration simply delete the deployment according to steps in "Clean Up the Deployment", and restart the deployment according to the steps in "Configure Lustre Deployment YAML".
- (Optional) Under the Lustre-cluster section, click each of the resources created by the lustre.jinja template and review the details.
With the deployment's configuration verified let's confirm the cluster's instances are started. In the Cloud Platform Console, in the Products & Services menu, click Compute Engine > VM Instances.
On the VM Instances page, review the five virtual machine instances that have been created by the deployment manager. This includes lustre-mds1, lustre-oss1, lustre-oss2, lustre-oss3, and lustre-oss4.
Monitor the Installation
On the VM Instances page, click lustre-mds1 to open the Instance details page.
Click on Serial port 1 (console) to open the serial console output page. We will use this serial output to monitor the installation process of the MDS instance, and wait until the startup-script has completed. The node will reboot once to boot into the Lustre kernel, and display messages similar to below:
Startup finished in 838ms (kernel) + 6.964s (initrd) + 49.302s (userspace) = 57.105s.
Lustre: lustre-MDT0000: Connection restored to 374e2d80-0b31-0cd7-b2bf-de35b8119534 (at 0@lo)
This means Lustre is installed on the Lustre cluster, and the file system is ready to be utilized!
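The same check can be done from Cloud Shell without opening the serial console page. The sketch below assumes that the "Connection restored" line in the sample output above is a reasonable readiness marker; `mdt_ready` is a hypothetical helper, not part of the deployment scripts.

```shell
#!/usr/bin/env bash
# mdt_ready (hypothetical helper): succeed if stdin contains a
# "Lustre: <fsname>-MDTxxxx: Connection restored" line.
mdt_ready() {
  grep -Eq 'Lustre: [^ ]+-MDT[0-9a-fA-F]+: Connection restored'
}

# Sample lines copied from the serial output above. In Cloud Shell you could instead pipe in:
#   gcloud compute instances get-serial-port-output lustre-mds1 --zone=<ZONE>
if mdt_ready <<'EOF'
Startup finished in 838ms (kernel) + 6.964s (initrd) + 49.302s (userspace) = 57.105s.
Lustre: lustre-MDT0000: Connection restored to 374e2d80-0b31-0cd7-b2bf-de35b8119534 (at 0@lo)
EOF
then
  echo "MDS startup looks complete"
else
  echo "MDS is still installing"
fi
```

Because the installation takes several minutes, you may need to rerun the check until the marker appears.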
Access the Lustre Cluster
On the VM Instances page in the Google Cloud Console, click the SSH button next to the lustre-mds1 instance. Alternatively, execute the following command in Cloud Shell, replacing <ZONE> with the lustre-mds1 node's zone:
gcloud compute ssh lustre-mds1 --zone=<ZONE>
This command logs into the lustre-mds1 virtual machine. This is the Lustre Metadata Server (MDS) instance, which also acts as the Lustre Management Server (MGS) instance. This instance handles all authentication and metadata requests for the file system.
Let's mount the file system on our lustre-mds1 instance in order to be able to test it later. Execute the following commands:
sudo mkdir /mnt/lustre
sudo mount -t lustre lustre-mds1:/lustre /mnt/lustre
cd /mnt/lustre
You've now mounted the Lustre file system at /mnt/lustre. Let's take a look at what we can do with this file system.
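To confirm that the mount succeeded, you can look for the mount point in the kernel's mount table. This is a minimal sketch: `is_lustre_mount` is a hypothetical helper, and the sample mount entry below is illustrative rather than captured from a real node.

```shell
#!/usr/bin/env bash
# is_lustre_mount (hypothetical helper): succeed if the given mount point appears
# with file system type "lustre" in a mounts table.
# $1 = mount point, $2 = mounts table (defaults to /proc/mounts).
is_lustre_mount() {
  awk -v mp="$1" '$2 == mp && $3 == "lustre" { found = 1 } END { exit !found }' \
    "${2:-/proc/mounts}"
}

# Illustrative stand-in for /proc/mounts on the MDS node.
mounts="$(mktemp)"
printf '10.20.0.2@tcp:/lustre /mnt/lustre lustre rw 0 0\n' > "$mounts"

if is_lustre_mount /mnt/lustre "$mounts"; then
  echo "/mnt/lustre is a Lustre mount"
else
  echo "/mnt/lustre is not mounted"
fi

rm -f "$mounts"
```

On the lustre-mds1 node itself, calling `is_lustre_mount /mnt/lustre` with no second argument reads the live /proc/mounts.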
If you are not familiar with Lustre and its tools, we will walk through a few important commands here.
Lustre's low-level cluster management tool is "lctl". We can use lctl to configure and manage the Lustre cluster, and to view the Lustre cluster's services. To view the services and instances in our new Lustre cluster, execute:
sudo lctl dl
You will see output similar to below, depending on what changes you made to the Lustre YAML configuration file:
  0 UP osd-ldiskfs lustre-MDT0000-osd lustre-MDT0000-osd_UUID 11
  1 UP mgs MGS MGS 12
  2 UP mgc MGC10.128.15.2@tcp 374e2d80-0b31-0cd7-b2bf-de35b8119534 4
  3 UP mds MDS MDS_uuid 2
  4 UP lod lustre-MDT0000-mdtlov lustre-MDT0000-mdtlov_UUID 3
  5 UP mdt lustre-MDT0000 lustre-MDT0000_UUID 12
  6 UP mdd lustre-MDD0000 lustre-MDD0000_UUID 3
  7 UP qmt lustre-QMT0000 lustre-QMT0000_UUID 3
  8 UP lwp lustre-MDT0000-lwp-MDT0000 lustre-MDT0000-lwp-MDT0000_UUID 4
  9 UP osp lustre-OST0000-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 4
 10 UP osp lustre-OST0002-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 4
 11 UP osp lustre-OST0001-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 4
 12 UP osp lustre-OST0003-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 4
We can see our Lustre Management Server (MGS) as item 1, our Lustre Metadata Server (MDS) as item 3, our Lustre Metadata Target (MDT) as item 5, and the connections to our four Object Storage Targets (OSTs), one per Object Storage Server (OSS), as items 9 through 12. To understand what the other services are, please review the Lustre Manual.
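For larger clusters, counting device types is easier than reading the list line by line. The sketch below assumes the column layout shown in the sample output above, where the third column is the device type; `summarize_devices` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# summarize_devices (hypothetical helper): count `lctl dl` devices by type.
# Assumes column 3 is the device type (mgs, mds, mdt, osp, ...).
summarize_devices() {
  awk 'NF >= 4 { count[$3]++ } END { for (t in count) print t, count[t] }' | sort
}

# Sample rows copied from the output above. On the MDS node you could instead pipe in:
#   sudo lctl dl
summarize_devices <<'EOF'
  1 UP mgs MGS MGS 12
  3 UP mds MDS MDS_uuid 2
  5 UP mdt lustre-MDT0000 lustre-MDT0000_UUID 12
  9 UP osp lustre-OST0000-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 4
 10 UP osp lustre-OST0002-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 4
 11 UP osp lustre-OST0001-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 4
 12 UP osp lustre-OST0003-osc-MDT0000 lustre-MDT0000-mdtlov_UUID 4
EOF
```

With the default configuration, the osp count should match oss_node_count from the YAML file.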
Lustre's file system configuration tool is "lfs". We can use lfs to manage striping of files across our Lustre Object Storage Servers (OSS) and their respective Object Storage Targets (OST), as well as running common file system operations like find, df, and quota management.
Striping allows us to configure how a file is distributed across our Lustre cluster to deliver the best performance possible. While striping a large file across as many OSSs as possible often delivers the best performance by parallelizing the IO, striping a small file may lead to worse performance than if that file were only written to a single instance.
To test this, let's set up two directories: one with a stripe count of 1 (each file lives on a single OST, and in this deployment each OSS serves one OST), and one with a stripe count of "-1", indicating that files written in that directory should be striped across all available OSTs. Directories can hold striping configurations that are inherited by files created within them, but sub-directories and individual files within that directory can then be configured to be striped differently if desired. To make these two directories, execute the following commands while in the "/mnt/lustre" directory:
sudo mkdir stripe_one
sudo mkdir stripe_all
sudo lfs setstripe -c 1 stripe_one/
sudo lfs setstripe -c -1 stripe_all/
You can view the stripe settings of a file or directory using lfs getstripe:
sudo lfs getstripe stripe_all/
You will see output showing the stripe count set as -1:
stripe_all/
stripe_count: -1 stripe_size: 1048576 pattern: raid0 stripe_offset: -1
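To make the raid0 pattern and stripe_size values concrete, the sketch below works out which stripe of a file holds a given byte offset. It assumes a stripe count of 4, which is what -1 resolves to with the four OSTs in this codelab's default deployment; `ost_slot` is a hypothetical helper, and the result is a position in the file's own OST list rather than a global OST index.

```shell
#!/usr/bin/env bash
stripe_size=1048576   # bytes, from the getstripe output above
stripe_count=4        # assumption: -1 resolves to all four OSTs in the default deployment

# ost_slot (hypothetical helper): 0-based position in the file's OST list that
# holds the given byte offset under a round-robin (raid0) layout.
ost_slot() {
  echo $(( ($1 / stripe_size) % stripe_count ))
}

# Each 1 MiB chunk goes to the next OST, wrapping around after the fourth.
for offset in 0 1048576 2097152 3145728 4194304; do
  echo "offset ${offset} -> stripe $(ost_slot "$offset")"
done
```

This round-robin placement is why a large sequential write can keep every OST busy at once, while a file smaller than one stripe_size touches only a single OST.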
Now we're ready to test the performance improvements achievable by writing a large file striped across multiple OSSs.
We will run two simple tests of the Lustre IO to demonstrate the possible performance advantages and scaling capabilities of the Lustre file system. First, we will run a simple test using the "dd" utility to write a 5GB file to our "stripe_one" directory. Execute the following command:
sudo dd if=/dev/zero of=stripe_one/test bs=1M count=5000
The process to write 5GB of data to the file system averages around 27 seconds, writing to a single Persistent Disk (PD) on a single Object Storage Server (OSS).
To test striping across multiple OSSs, and therefore multiple PDs, we simply need to change the output directory we write to. Execute the following command:
sudo dd if=/dev/zero of=stripe_all/test bs=1M count=5000
Notice we changed "of=stripe_one/test" to "of=stripe_all/test". This allows our single-stream write to distribute its writes across all of our Object Storage Servers and complete in about 5.5 seconds on average, roughly 4x to 5x faster than the single-OSS write when using four OSSs.
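The two average timings quoted above (27 seconds and 5.5 seconds) can be converted into throughput and speedup figures. This is only arithmetic on those numbers, assuming the 5000 MiB written by bs=1M count=5000; `throughput_mib_s` is a hypothetical helper, and your own timings will vary.

```shell
#!/usr/bin/env bash
# throughput_mib_s (hypothetical helper): MiB written / seconds elapsed, rounded.
throughput_mib_s() {
  awk -v mib="$1" -v sec="$2" 'BEGIN { printf "%.0f\n", mib / sec }'
}

written_mib=5000   # bs=1M count=5000
one="$(throughput_mib_s "$written_mib" 27)"    # stripe_one average
all="$(throughput_mib_s "$written_mib" 5.5)"   # stripe_all average
speedup="$(awk 'BEGIN { printf "%.1f\n", 27 / 5.5 }')"

echo "stripe_one: ${one} MiB/s"
echo "stripe_all: ${all} MiB/s"
echo "speedup:    ${speedup}x"
```

Substituting the elapsed times reported by your own dd runs gives the equivalent figures for your deployment.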
This performance continues to increase as you add Object Storage Servers, and you can add OSSs while the file system is online and begin striping data to them to increase capacity and performance. The possibilities are endless using Lustre on Google Cloud Platform, and we're excited to see what you can build, and what problems you can solve.
Congratulations, you've created a Lustre cluster on Google Cloud Platform! You can use these scripts as a starting point to build your own Lustre cluster, and to integrate it with your cloud-based computing cluster.
Clean Up the Deployment
Log out of the Lustre node, for example by typing exit in the SSH session.
You can easily clean up the deployment when you are done by executing the following command from your Google Cloud Shell, after logging out of the Lustre cluster:
gcloud deployment-manager deployments delete lustre
When prompted, type Y to continue. This operation can take some time, so please be patient.
Delete the Project
To clean up, we simply delete our project.
- In the navigation menu select IAM & Admin
- Then click on Settings in the submenu
- Click on the trashcan icon with the text "Delete Project"
- Follow the instructions in the prompts
What we've covered
- How to use the GCP Deployment Manager Service.
- How to configure and deploy a Lustre file system on GCP.
- How to configure striping and test simple I/O to the Lustre file system.
Are you building something cool using the Lustre deployment manager scripts? Have questions? Chat with us in the Google Cloud Lustre discussion group. To request features, provide feedback, or report bugs please use this form, or feel free to modify the code and submit a pull request! Want to speak to a Google Cloud expert? Reach out to the Google Cloud team today through Google Cloud's High Performance Computing website.