How-To Resolve Low Open File Limits (RLIMIT_NOFILE) on CMK Nodes

Karan Solanki

Introduction

On Crusoe Kubernetes (CMK) worker nodes running containerd 2.0+, the soft limit for open files (RLIMIT_NOFILE) defaults to an unexpectedly low value of 1024, while the hard limit remains high at 524288.

This behavior stems from an upstream change in containerd 2.0 where LimitNOFILE was removed from the containerd.service definition. Without this override, systemd passes its own default soft limit of 1024 down to the containerd daemon, and containers inherit that limit. The result is file descriptor exhaustion under even moderate load — process crashes, failing unit tests, and connection errors in applications that expect a higher ceiling.

A permanent fix via updated worker node images is planned, but the workarounds below let you resolve the issue immediately without waiting for a platform update. This guide covers diagnosing the problem and applying either a zero-downtime process-level fix or a cluster-wide DaemonSet override.

Prerequisites

  • Running CMK Cluster With Nodepools
  • kubectl CLI Installed and Configured With Your CMK Cluster's Kubeconfig (Get Kubeconfig)

Instructions

  1. Verify the File Descriptor Limits
    • Confirm the current soft and hard limits inside a running container:

      ~ kubectl exec <Pod Name> -n <Namespace> -- sh -c 'cat /proc/self/limits | grep -i "open files"; echo soft=$(ulimit -Sn) hard=$(ulimit -Hn)'

      Expected output when the issue is present:

      Max open files            1024                 524288               files
      soft=1024 hard=524288
    • Take note of two things: the soft limit (1024 in this example) and the hard limit (524288). The hard limit is the ceiling — any unprivileged process can raise its own soft limit up to that value.
  2. Option A: Raise the Soft Limit at the Process Level (Recommended)
    • This is the preferred approach — it is non-disruptive, requires no cluster-level changes, and does not need root or privileged execution. Any process can raise its own soft limit up to the hard limit ceiling of 524288.
    • Python implementation — add the following to your application's initialization routine or entrypoint script:

      import resource
      
      # Retrieve current limits
      soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
      
      # Raise the soft limit to match the maximum hard limit allowable
      resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    • Go implementation — invoke the equivalent syscall at application startup:

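      var lim syscall.Rlimit // needs imports "log" and "syscall"
      if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
          log.Fatal(err)
      }
      lim.Cur = lim.Max // raise the soft limit to the hard limit
      if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
          log.Fatal(err)
      }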
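    • Verify: rerun your application workflows or unit tests. Inside the application process, resource.getrlimit(resource.RLIMIT_NOFILE) should now return (524288, 524288), which corresponds to:

      soft=524288 hard=524288
    • Note that kubectl exec starts a new process in the container, so the check from step 1 still reports soft=1024 even after the application has raised its own limit.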
  3. Option B: Raise the Limit Cluster-Wide via a Host-Tuning DaemonSet (Advanced)
    • If modifying application source code is not feasible, deploy a privileged DaemonSet that sets a higher LimitNOFILE on each worker node's containerd service.
    • Create the manifest — save the following YAML to a file named containerd-limit-fix.yaml:

      apiVersion: apps/v1
      kind: DaemonSet
      metadata:
        name: containerd-limit-fix
        namespace: kube-system
        labels:
          k8s-app: containerd-limit-fix
      spec:
        selector:
          matchLabels:
            name: containerd-limit-fix
        updateStrategy:
          type: RollingUpdate
          rollingUpdate:
            maxUnavailable: 1
        template:
          metadata:
            labels:
              name: containerd-limit-fix
          spec:
            hostPID: true
            # nodeSelector: pin to a single canary node first, then widen once verified.
            containers:
            - name: limit-setter
              image: alpine:3.20
              securityContext:
                privileged: true
              command: ["sh", "-c"]
              args:
              - |
                set -e
                # alpine's base image does not ship nsenter; install util-linux (requires registry/network access)
                apk add --no-cache util-linux >/dev/null
                nsenter --target 1 --mount --uts --ipc --net --pid -- sh -c '
                  mkdir -p /etc/systemd/system/containerd.service.d
                  printf "[Service]\nLimitNOFILE=524288\n" > /etc/systemd/system/containerd.service.d/override.conf
                  systemctl daemon-reload
                  systemctl restart containerd
                '
                echo "containerd LimitNOFILE override applied; recreate existing pods to pick up the new limit"
                sleep infinity
    • Deploy the DaemonSet:

      ~ kubectl apply -f containerd-limit-fix.yaml
    • ⚠️ Warning: Restarting the containerd host service via Option B will transiently affect active pods running on the target node. Running containers generally survive a containerd restart, but apply to a canary node first and widen gradually. Containers that were already running keep the old soft limit until their pods are recreated. For zero-downtime requirements, use Option A instead.
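
For Option A, a slightly more defensive variant of the Python startup call can be sketched as follows. The helper name raise_nofile_soft_limit and its target parameter are hypothetical, not part of any Crusoe or Python API. The sketch uses only the standard resource module, never lowers an existing limit, and logs a warning instead of crashing if the kernel rejects the change.

```python
import logging
import resource

log = logging.getLogger(__name__)


def raise_nofile_soft_limit(target=None):
    """Raise the RLIMIT_NOFILE soft limit toward the hard limit.

    target is an optional cap (hypothetical parameter); the soft limit is
    never lowered. Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    if hard == resource.RLIM_INFINITY:
        # No finite ceiling reported: only move if the caller named a target.
        wanted = soft if target is None else target
    else:
        wanted = hard if target is None else min(target, hard)

    if soft != resource.RLIM_INFINITY and wanted > soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (wanted, hard))
        except (ValueError, OSError) as exc:
            log.warning("could not raise RLIMIT_NOFILE to %s: %s", wanted, exc)

    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    print("soft=%d hard=%d" % raise_nofile_soft_limit())
```

Calling the helper once at the top of the entrypoint, before any files or sockets are opened, keeps the behavior equivalent to the shorter snippet in step 2 while tolerating environments where the limit cannot be changed.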

Example

A customer running a PyTorch data-loader pipeline on a 4-node CMK cluster sees their training job crash after roughly 30 minutes with OSError: [Errno 24] Too many open files. Running the verification command inside the pod confirms soft=1024 hard=524288, so the soft limit is the bottleneck. They add the Python resource.setrlimit snippet from Option A to their training script's __main__ block and redeploy. The job completes without further file descriptor errors, and resource.getrlimit(resource.RLIMIT_NOFILE) inside the job now reports a soft and hard limit of 524288.
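
A data loader that uses worker processes relies on the fact that resource limits are inherited by child processes started after the setrlimit call. The sketch below demonstrates that inheritance with only the standard library. The helper name child_soft_limit is hypothetical, and the child here is a plain Python subprocess standing in for a worker, not an actual PyTorch worker.

```python
import resource
import subprocess
import sys


def child_soft_limit():
    """Start a child Python process and return the RLIMIT_NOFILE soft limit it sees."""
    out = subprocess.run(
        [sys.executable, "-c",
         "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"],
        check=True, capture_output=True, text=True,
    ).stdout
    return int(out.strip())


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Raise the parent's soft limit first (skipped when no finite hard limit is reported).
    if hard != resource.RLIM_INFINITY and soft != hard:
        resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
    parent_soft = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
    print("parent soft=%d child soft=%d" % (parent_soft, child_soft_limit()))
```

The two values printed should match, which is why raising the limit early in the parent is enough to cover workers launched later.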

Frequently Asked Questions

How do I know if my application is hitting the file descriptor limit?

Look for Too many open files errors in your application logs, or EMFILE / errno 24 in stack traces. These errors typically surface in I/O-heavy workloads — data loaders, web servers handling many connections, or applications that open many temporary files.
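
As a rough illustration, a Python application can recognize this failure by its errno and, on Linux, estimate how close it is to the soft limit by counting the entries in /proc/self/fd. The helper names and the 80% threshold below are hypothetical choices made for this sketch.

```python
import errno
import os
import resource


def is_fd_exhaustion(exc):
    """True if exc is an OSError carrying EMFILE (errno 24, 'Too many open files')."""
    return isinstance(exc, OSError) and exc.errno == errno.EMFILE


def open_fd_count():
    """Count this process's open file descriptors (Linux only, via /proc)."""
    # The listing itself briefly holds one descriptor, so treat this as an estimate.
    return len(os.listdir("/proc/self/fd"))


def near_limit(open_count, soft_limit, threshold=0.8):
    """True if open_count is at or above threshold * soft_limit."""
    return open_count >= threshold * soft_limit


if __name__ == "__main__":
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if os.path.isdir("/proc/self/fd") and soft != resource.RLIM_INFINITY:
        count = open_fd_count()
        print("open=%d soft=%d near_limit=%s" % (count, soft, near_limit(count, soft)))
```

Logging the descriptor count periodically, or when is_fd_exhaustion returns True in an exception handler, can help distinguish a low soft limit from a genuine descriptor leak.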

Will my changes persist across container restarts if I use Option A?

No. The process-level fix applies only to the process that calls it and to child processes it starts afterwards. If your container restarts, the new process starts with the default soft limit of 1024 again. Include the setrlimit call in your application's startup path so it runs on every launch. If you need a fix that applies to all containers on a node regardless of application code, use Option B.
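
To confirm after a restart that the startup call actually ran, inspect the application process itself rather than a freshly started shell. A minimal sketch that parses the "Max open files" row of /proc/<pid>/limits follows. The function names are hypothetical, and the PID of the application inside the container (often 1) is an assumption you should check for your image.

```python
def parse_open_files(limits_text):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    A value of 'unlimited' is returned as None.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: "Max open files  <soft>  <hard>  files"
            soft, hard = line.split()[3:5]
            return tuple(None if v == "unlimited" else int(v) for v in (soft, hard))
    raise ValueError("no 'Max open files' row found")


def read_open_files(pid="self"):
    """Read and parse /proc/<pid>/limits (Linux only)."""
    with open("/proc/%s/limits" % pid) as fh:
        return parse_open_files(fh.read())


if __name__ == "__main__":
    sample = "Max open files            1024                 524288               files"
    print(parse_open_files(sample))  # (1024, 524288)
```

Inside the container, read_open_files(1) would report the limits of PID 1, assuming that is the application process and the caller is allowed to read its /proc entry.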
