Introduction
This guide explains how to create SXM/InfiniBand virtual machines (VMs) on Crusoe Cloud. It covers three methods: the user interface (UI), the command-line interface (CLI), and Terraform, each with a lifecycle (startup) script attached. Any of the three can be used to deploy high-performance computing environments that communicate over InfiniBand networks.
Prerequisites
- Crusoe Cloud Account: Ensure you have access with the necessary permissions.
- Public SSH Key: A valid SSH key is required for VM access.
- CLI Installed: Install the Crusoe CLI and authenticate. See: Install and configure the CLI.
- Terraform Installed: Install and configure Terraform. See: Getting Started with Terraform.
- Access Permissions: Verify that you have permission to create VMs and InfiniBand partitions.
- Startup Script: Ensure you have a startup script available (e.g., ~/Downloads/startup-script.sh).
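The local prerequisites above can be checked with a short shell sketch before you start. It assumes the CLI binary is named crusoe (as in the command used later in this guide), that Terraform is installed as terraform, and that the key and script sit at the example paths used in this guide. The function name check_prereqs is hypothetical, and account permissions cannot be checked this way.

```shell
#!/usr/bin/env bash
# Sketch: report which local prerequisites are present.
# check_prereqs is a hypothetical helper name; the default paths are the
# example paths from this guide and may differ on your machine.

check_prereqs() {
  local ssh_key="${1:-$HOME/.ssh/id_ed25519.pub}"
  local startup_script="${2:-$HOME/Downloads/startup-script.sh}"
  local missing=0
  local tool f

  # Tools expected on PATH.
  for tool in crusoe terraform; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok: $tool found"
    else
      echo "missing: $tool not on PATH"
      missing=1
    fi
  done

  # Files expected to exist and be readable.
  for f in "$ssh_key" "$startup_script"; do
    if [ -r "$f" ]; then
      echo "ok: $f readable"
    else
      echo "missing: $f"
      missing=1
    fi
  done

  # Non-zero exit status when anything is missing.
  return "$missing"
}
```

Calling check_prereqs with no arguments uses the default paths; pass a key path and a script path as the first and second arguments to check other locations.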
Step-by-Step Instructions
Using the UI
- Navigate to Crusoe Cloud Console:
  - Log in to the Crusoe Cloud Console.
  - Click the Compute tab in the left navigation panel.
- Create a New Instance:
  - Click the Instances tab.
  - Click Create Instance.
- Configure Instance Details:
  - Instance Name: Provide a name.
  - Instance Type: Select h100-80gb-sxm-ib.8x or a100-80gb-sxm-ib.8x. To learn more about instance types, refer to our public documentation.
  - Location: Choose us-southcentral1-a or another location, such as eu-iceland1-a, us-northcentral1-a, or us-east1-a.
  - Image: Select ubuntu22.04:latest or any other image listed in our public documentation.
- Network Settings:
  - Select VPC Network, VPC Subnet, IB Network, and IB Partition.
  - Note: IB partitions are region-specific.
- Add Lifecycle Scripts:
  - Upload or specify your startup script (startup-script.sh).
- Finalize and Create:
  - Add SSH Key.
  - Click Create.
Using the CLI
Run the following command:
crusoe compute vms create \
  --name my-vm \
  --type h100-80gb-sxm-ib.8x \
  --location us-east1-a \
  --image ubuntu22.04:latest \
  --keyfile ~/.ssh/id_ed25519.pub \
  --ib-partition-id 7c8eb124-44c3-4897-88b5-841442446c9b \
  --startup-script ~/Downloads/startup-script.sh
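Replace the name, type, location, image, key file, IB partition ID, and startup script path with your own values. The IB partition must be in the same region as the chosen location.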
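One way to keep the replaceable values in one place is to hold them in shell variables and build the command from them. The sketch below uses only the flags shown in the command above; the function name create_ib_vm, the variable names, and the DRY_RUN switch are hypothetical conveniences and not part of the Crusoe CLI.

```shell
#!/usr/bin/env bash
# Sketch: wrap the create command so every value is set in one place.
# create_ib_vm and DRY_RUN are hypothetical; the flags are the ones
# already used in this guide.

VM_NAME="my-vm"
VM_TYPE="h100-80gb-sxm-ib.8x"
VM_LOCATION="us-east1-a"
VM_IMAGE="ubuntu22.04:latest"
SSH_KEYFILE="$HOME/.ssh/id_ed25519.pub"
IB_PARTITION_ID="7c8eb124-44c3-4897-88b5-841442446c9b"  # replace with your partition ID
STARTUP_SCRIPT="$HOME/Downloads/startup-script.sh"

create_ib_vm() {
  # Assemble the command as this function's positional parameters.
  set -- crusoe compute vms create \
    --name "$VM_NAME" \
    --type "$VM_TYPE" \
    --location "$VM_LOCATION" \
    --image "$VM_IMAGE" \
    --keyfile "$SSH_KEYFILE" \
    --ib-partition-id "$IB_PARTITION_ID" \
    --startup-script "$STARTUP_SCRIPT"

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the assembled command instead of running it.
    echo "$*"
  else
    "$@"
  fi
}
```

Running DRY_RUN=1 create_ib_vm prints the assembled command for review; calling create_ib_vm without DRY_RUN set runs it.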
Using Terraform
// Crusoe Provider
terraform {
  required_providers {
    crusoe = {
      source = "registry.terraform.io/crusoecloud/crusoe"
    }
  }
}
// Local files
locals {
  ssh_key        = file("~/.ssh/id_ed25519.pub")         // Replace with the correct path to your public SSH key
  startup_script = file("~/Downloads/startup-script.sh") // Ensure this file exists
}
// New VM with InfiniBand and Lifecycle Script
resource "crusoe_compute_instance" "infiniband_vm" {
  name     = "infiniband-vm"
  type     = "h100-80gb-sxm-ib.8x"
  location = "us-southcentral1-a"
  image    = "ubuntu22.04:latest"
  ssh_key  = local.ssh_key
  host_channel_adapters = [{
    ib_partition_id = "ce608b10-6a3e-40aa-a391-aa4c39929ccf" // Replace with your actual InfiniBand partition ID
  }]
  network_interfaces = [{
    subnet = "2c7b6b4f-e856-4eca-8f5a-ff594f76d90a" // Replace with the correct subnet ID for the location
    public_ipv4 = {
      type = "static"
    }
  }]
  startup_script = local.startup_script
}
Terraform Deployment Steps:
- Initialize Terraform:
terraform init
- Preview the changes Terraform will make:
terraform plan
- Apply the configuration:
terraform apply
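The deployment steps above can be chained in a small shell function so that a failure in one step stops the rest. As an assumption beyond the steps listed, this sketch also runs terraform validate and saves the plan to a file (the name tfplan is an arbitrary choice) so that terraform apply executes the plan that was just shown. The function name deploy_ib_vm is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run the Terraform steps in order, stopping at the first failure.
# deploy_ib_vm is a hypothetical helper; run it from the directory that
# contains the configuration shown in this guide.

deploy_ib_vm() {
  terraform init &&
    terraform validate &&
    terraform plan -out=tfplan &&
    terraform apply tfplan
}
```

Note that applying a saved plan file does not ask for interactive approval, so review the plan output first, or run the last two steps by hand.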
Example Use Case
A user who deploys an h100-80gb-sxm-ib.8x instance in us-southcentral1-a for HPC workloads gets high-speed InfiniBand networking between nodes, and the startup script automates node setup at first boot.
Common Issues & Resolutions
- Invalid IB Partition:
  - Ensure the IB partition ID belongs to the same region as the VM.
- SSH Key Issues:
  - Verify that the key is in a valid public-key format and was uploaded correctly.
- CLI Authentication Errors:
  - Check the parameters in .crusoe/config. See: Install and configure the CLI.
- Startup Script Not Executing:
  - Verify that the script has execute permissions and correct syntax.
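The SSH key and startup script checks can be done locally before the VM is created. A minimal sketch, assuming a Bash startup script and the example paths from this guide: ssh-keygen -l -f prints the fingerprint of a well-formed key file and fails otherwise, and bash -n parses a script without running it. The function name check_startup_script is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: local sanity checks for the startup script.
# check_startup_script is a hypothetical helper name; it assumes the
# startup script is written for Bash.

check_startup_script() {
  local script="$1"

  if [ ! -f "$script" ]; then
    echo "not found: $script"
    return 1
  fi

  # Parse only: reports syntax errors without executing the script.
  if ! bash -n "$script"; then
    echo "syntax error: $script"
    return 1
  fi

  # Add the execute bit if it is missing.
  [ -x "$script" ] || chmod +x "$script"

  echo "ok: $script"
}

# Example usage with the paths from this guide:
#   ssh-keygen -l -f ~/.ssh/id_ed25519.pub
#   check_startup_script ~/Downloads/startup-script.sh
```

A syntax check only catches parse errors; a script can still fail at boot if it depends on packages or paths that are not present in the chosen image.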