How to Resolve DiskPressure and Pod Evictions in Crusoe Managed Kubernetes (CMK) Nodepool VMs

Karan Solanki

Last Updated: March 23, 2026

Introduction

When a node in a Crusoe Managed Kubernetes (CMK) cluster enters the DiskPressure state, the kubelet begins evicting pods to protect the stability of the node. The boot disk is currently provisioned with 128GB of capacity, which can be insufficient when large container images are cached on the node or when application logs are written to the root partition.

To resolve this, users must offload storage-intensive data (logs, datasets, and container overlays) to either local ephemeral NVMe or dynamically provisioned Persistent Disks.

Prerequisites

Before starting, ensure you have the following:

  • A running CMK cluster with a nodepool whose GPU/CPU SKU supports local NVMe.

  • The kubectl CLI installed and configured with your CMK cluster's kubeconfig (get Kubeconfig).

  • The latest version of the Crusoe CLI installed and authenticated to perform nodepool updates.

Step-by-Step Instructions

Step 1: Identify the source of Disk Pressure

Confirm the available space on the root partition and identify if the pressure is caused by system files, container images, or application data.
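
Before logging in to a node, it can help to confirm from the cluster side which node reports DiskPressure and which pods have already been evicted. The following is a minimal sketch using standard kubectl queries; the function names node_disk_pressure and evicted_pods are illustrative and are not part of any Crusoe tooling.

```shell
# Print the DiskPressure condition status (True, False or Unknown) for one node.
node_disk_pressure() {
    kubectl get node "$1" \
        -o jsonpath='{.status.conditions[?(@.type=="DiskPressure")].status}'
}

# List evicted pods in all namespaces as <namespace>/<name>.
# Evicted pods stay in phase Failed with status.reason set to Evicted.
evicted_pods() {
    kubectl get pods --all-namespaces --field-selector=status.phase=Failed \
        -o jsonpath='{range .items[?(@.status.reason=="Evicted")]}{.metadata.namespace}{"/"}{.metadata.name}{"\n"}{end}'
}
```

For example, node_disk_pressure <node-name> prints True while the node is under disk pressure.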

  • Check Root Partition Space:

    ~ df -h /

    If "Use%" is near 100%, the kubelet is likely to report DiskPressure and begin evicting pods.

  • Identify Large Directories: List the 10 largest directories on the root filesystem:

    ~ sudo du -xh / | sort -hr | head -n 10

    Note: The -x flag keeps du on the boot partition's filesystem, so it does not traverse into large NVMe mounts.
    If /var/lib/containerd is at the top, the disk is filling up with container images.
    If /var/lib/kubelet or /var/log is at the top, the disk is filling up with logs or pod metadata.

  • Check for "Phantom" Deleted Files: If disk space is not reclaimed after deleting files, a process may still be holding the deleted file open. Use this command to find such files:

    ~ sudo lsof +L1 | grep deleted

  • Emergency Manual Cleanup: To immediately relieve pressure and stop active pod evictions, prune stale or unused container images:

    ~ sudo crictl rmi --prune
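
The root partition check above can be turned into a small script, for instance to run before and after the cleanup. This is a sketch only: the function names are illustrative, and the default limit of 85 is an arbitrary warning level chosen for the example, not the kubelet's eviction threshold (which is managed by Crusoe).

```shell
# Print the Use% of a mount point as a bare integer (default: /).
root_usage_pct() {
    # -P forces POSIX single-line output, so the fifth column is always Use%.
    df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Compare a usage percentage against a warning limit.
# The default of 85 is an illustrative value, not a kubelet setting.
usage_status() {
    pct=$1
    limit=${2:-85}
    if [ "$pct" -ge "$limit" ]; then
        echo "WARN: root filesystem at ${pct}% (limit ${limit}%)"
    else
        echo "OK: root filesystem at ${pct}% (limit ${limit}%)"
    fi
}
```

Running usage_status "$(root_usage_pct)" on the node prints a single OK or WARN line for the root filesystem.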

Step 2: Remediation Strategy by SKU

Identify your SKU type to select the correct path. 

Note: Kubelet configurations (such as imageGCHighThresholdPercent) are currently managed by Crusoe and cannot be modified by users.

Path A: For SKUs with Local NVMe

Note: Applicable to NVIDIA H100, A100, H200, B200, GB200, AMD MI300X, and the s1a family.

  1. Redirect Logs (Ephemeral): 

    If application logs (like Ray) are filling the disk, redirect them to the NVMe mount. 

    Note: This change will not persist across node reboots or recreations unless configured in your startup scripts or container entrypoint. 

    # Point Ray's temp directory at the NVMe mount; Ray writes session logs under <temp-dir>/session_*/logs
    ~ ray start --head --temp-dir=/mnt/nvme0/ray

  2. Enable NVMe for Containerd:
    Update your nodepool template so that container images and ephemeral storage are automatically backed by NVMe:

    ~ crusoe kubernetes nodepools update <ID> --ephemeral-storage-for-containerd true

    Note: After updating the template, you must recycle your Nodepool VMs (see Step 3) for the changes to take effect on existing hardware.
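
Once a recycled node is back in the cluster, one way to check that container storage no longer sits on the boot disk is to compare the filesystem behind the containerd directory with the one behind /. This is a hedged sketch: it assumes the containerd state directory is /var/lib/containerd, as in Step 1, and the function names are illustrative.

```shell
# Print "<device> <mount point>" for the filesystem that holds a path.
backing_filesystem() {
    df -P "$1" | awk 'NR == 2 { print $1, $6 }'
}

# Succeed when the path lives on a different filesystem than /.
is_off_root() {
    [ "$(backing_filesystem "$1")" != "$(backing_filesystem /)" ]
}
```

For example, is_off_root /var/lib/containerd && echo "containerd storage is not on the boot disk" can be run on the node after it rejoins.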

Path B: For SKUs without NVMe (or for Persistent Data)

Note: Applicable to NVIDIA L40S and c1a SKUs, and to any production application data.

Best Practice: Applications requiring large storage or data persistence should utilize Crusoe Persistent Disks or Shared Disks via the Crusoe CSI driver. Unlike local storage, these disks are independent of the VM lifecycle.

  1. Avoid hostPath: Do not use hostPath for critical application data. If a node fails or is replaced, data stored via hostPath is permanently lost.

  2. Move to Dynamic PV: If you are currently using hostPath or the local root partition to store application data, move this data to a dynamically provisioned Persistent Volume (PV) via a PersistentVolumeClaim (PVC). This offloads the storage weight from the 128GB boot disk to a separate, scalable block device.

    # Example PVC manifest (the claim name app-data is a placeholder)
    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: app-data
    spec:
      storageClassName: crusoe-flex-persistent-v1
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 500Gi

  3. Manage Containerd Storage: If the container images themselves (the "base" of your pods) exceed the 128GB limit and the options above are not applicable to you, please contact Crusoe Support to assess further options for these SKUs.
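
To illustrate how a workload consumes a dynamically provisioned claim like the one in item 2, the sketch below prints a Pod manifest that mounts a PVC by name. Everything in it apart from the standard Kubernetes fields is a placeholder chosen for the example: the pod name pvc-demo, the busybox image, the volume name, and the default mount path /data.

```shell
# Print a Pod manifest that mounts an existing PVC.
# $1 = PVC name, $2 = mount path inside the container (default: /data).
render_pod_manifest() {
    claim=$1
    mount_path=${2:-/data}
    cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: app-data
          mountPath: ${mount_path}
  volumes:
    - name: app-data
      persistentVolumeClaim:
        claimName: ${claim}
EOF
}
```

The output can be piped straight to the cluster, for example render_pod_manifest <pvc-name> | kubectl apply -f -, so application data lands on the persistent disk instead of the boot disk.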

Step 3: Applying Updates (Node Replacement)

  • Cordon the Node: Mark the node as unschedulable to prevent new pods from landing on it.

    ~ kubectl cordon <node-name>

  • Drain the Node: Safely move existing pods to other available nodes. This ensures a graceful shutdown of your applications.

    ~ kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data

  • Delete the VM: Once the node is empty, delete the VM instance via the Crusoe Console or with the Crusoe CLI:

    ~ crusoe compute vms delete <vm_name>

  • Auto-Replacement: CMK will detect the missing node and provision a new VM. This new VM will join the cluster with the updated template and the correct storage configuration.
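
The cordon and drain steps above can be combined so that the drain only runs when the cordon succeeded, followed by a check of what is still scheduled on the node before the VM is deleted. This is a minimal sketch; the function names are illustrative, and the kubectl flags are the ones already used in this step plus the standard spec.nodeName field selector.

```shell
# Cordon, then drain; the drain only runs if the cordon succeeded.
cordon_and_drain() {
    kubectl cordon "$1" &&
        kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data
}

# List pods still bound to the node. DaemonSet pods are expected to remain.
pods_on_node() {
    kubectl get pods --all-namespaces -o wide \
        --field-selector "spec.nodeName=$1"
}
```

A typical sequence would be cordon_and_drain <node-name>, then pods_on_node <node-name> to confirm that only DaemonSet pods are left before deleting the VM.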
