Introduction
On Crusoe Kubernetes (CMK) worker nodes running containerd 2.0+, the soft limit for open files (RLIMIT_NOFILE) defaults to an unexpectedly low value of 1024, while the hard limit remains high at 524288.
This behavior stems from an upstream change in containerd 2.0 where LimitNOFILE was removed from the containerd.service definition. Without this override, systemd passes its own default soft limit of 1024 down to the containerd daemon, and containers inherit that limit. The result is file descriptor exhaustion under even moderate load — process crashes, failing unit tests, and connection errors in applications that expect a higher ceiling.
A permanent fix via updated worker node images is planned, but the workarounds below let you resolve the issue immediately without waiting for a platform update. This guide covers diagnosing the problem and applying either a zero-downtime process-level fix or a cluster-wide DaemonSet override.
Prerequisites
- Running CMK Cluster With Nodepools
- kubectl CLI Installed and Configured With Your CMK Cluster's Kubeconfig (Get Kubeconfig)
Instructions
Verify the File Descriptor Limits
- Confirm the current soft and hard limits inside a running container:
~ kubectl exec <Pod Name> -n <Namespace> -- sh -c 'cat /proc/self/limits | grep -i "open files"; echo soft=$(ulimit -Sn) hard=$(ulimit -Hn)'
Expected output when the issue is present:
    Max open files            1024                 524288               files
    soft=1024 hard=524288
- Take note of two things: the soft limit (1024 in this example) and the hard limit (524288). The hard limit is the ceiling: any unprivileged process can raise its own soft limit up to that value.
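The same check can be made from inside the application rather than through kubectl. The following is a minimal sketch, assuming Python 3.9+ on Linux; the helper names `parse_open_files_line` and `current_nofile_limits` are hypothetical and not part of any CMK tooling.

```python
import resource


def parse_open_files_line(line: str) -> tuple[str, str]:
    """Return (soft, hard) from a 'Max open files' line of /proc/<pid>/limits.

    Values are returned as strings because the kernel may print 'unlimited'.
    """
    fields = line.split()
    # Expected layout: ['Max', 'open', 'files', <soft>, <hard>, 'files']
    if len(fields) < 5 or fields[:3] != ["Max", "open", "files"]:
        raise ValueError(f"not a 'Max open files' line: {line!r}")
    return fields[3], fields[4]


def current_nofile_limits() -> tuple[int, int]:
    """Return the (soft, hard) RLIMIT_NOFILE values of the calling process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = current_nofile_limits()
    print(f"soft={soft} hard={hard}")
```

Logging these two numbers once at startup makes it easy to tell, from application logs alone, whether a pod is running with the low default.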
Option A: Raise the Soft Limit at the Process Level (Recommended)
- This is the preferred approach: it is non-disruptive, requires no cluster-level changes, and does not need root or privileged execution. Any process can raise its own soft limit up to the hard limit ceiling of 524288.
- Python implementation: add the following to your application's initialization routine or entrypoint script:
    import resource

    # Retrieve current limits
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Raise the soft limit to match the maximum hard limit allowable
    resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
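- Go implementation: invoke the equivalent syscall at application startup, for example at the top of main() (requires import "syscall"):
    var lim syscall.Rlimit
    syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim) // read the current soft and hard limits
    lim.Cur = lim.Max                              // raise the soft limit to the hard limit
    syscall.Setrlimit(syscall.RLIMIT_NOFILE, &lim) // apply; both calls return an error worth checking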
- Verify: rerun your application workflows or unit tests. Assertions evaluating resource.getrlimit(resource.RLIMIT_NOFILE) should now show:
    soft=524288 hard=524288
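For code that may also run outside CMK (a laptop, CI, another platform), a slightly more defensive variant of the Python approach can be useful. The following is a sketch only: the function name `raise_nofile_soft_limit` is hypothetical, and the default target of 524288 is an assumption taken from the hard limit reported in this article.

```python
import resource


def raise_nofile_soft_limit(target: int = 524288) -> tuple[int, int]:
    """Raise the RLIMIT_NOFILE soft limit toward `target`, never above the hard limit.

    Returns the (soft, hard) pair in effect after the attempt.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = resource.RLIM_INFINITY

    # Nothing to do if the soft limit is already unlimited or high enough.
    if soft == unlimited or soft >= target:
        return soft, hard

    # Cap the request at the hard limit unless the hard limit is unlimited.
    new_soft = target if hard == unlimited else min(target, hard)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    except (ValueError, OSError):
        # Some platforms reject the request; keep the existing limit in that case.
        pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = raise_nofile_soft_limit()
    print(f"soft={soft} hard={hard}")
```

The hard limit is passed through unchanged, so the call needs no extra privileges, and calling it more than once is harmless.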
Option B: Raise the Limit Cluster-Wide via a Host-Tuning DaemonSet (Advanced)
- If modifying application source code is not feasible, deploy a privileged DaemonSet that sets a higher LimitNOFILE on each worker node's containerd service.
- Create the manifest: save the following YAML to a file named containerd-limit-fix.yaml:
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: containerd-limit-fix
      namespace: kube-system
      labels:
        k8s-app: containerd-limit-fix
    spec:
      selector:
        matchLabels:
          name: containerd-limit-fix
      updateStrategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 1
      template:
        metadata:
          labels:
            name: containerd-limit-fix
        spec:
          hostPID: true
          # nodeSelector: pin to a single canary node first, then widen once verified.
          containers:
            - name: limit-setter
              image: alpine:3.20
              securityContext:
                privileged: true
              command: ["sh", "-c"]
              args:
                - |
                  set -e
                  # alpine's base image does not ship nsenter; install util-linux (requires registry/network access)
                  apk add --no-cache util-linux >/dev/null
                  nsenter --target 1 --mount --uts --ipc --net --pid -- sh -c '
                    mkdir -p /etc/systemd/system/containerd.service.d
                    printf "[Service]\nLimitNOFILE=524288\n" > /etc/systemd/system/containerd.service.d/override.conf
                    systemctl daemon-reload
                    systemctl restart containerd
                  '
                  echo "containerd LimitNOFILE override applied; recreate existing pods to pick up the new limit"
                  sleep infinity
- Deploy the DaemonSet:
~ kubectl apply -f containerd-limit-fix.yaml
⚠️ Warning: Restarting the containerd host service via Option B will transiently affect active pods running on the target node. Running containers generally survive a containerd restart, but apply to a canary node first and widen gradually. For zero-downtime requirements, use Option A instead.
Example
A customer running a PyTorch data-loader pipeline on a 4-node CMK cluster sees their training job crash after roughly 30 minutes with OSError: [Errno 24] Too many open files. Running the verification command inside the pod confirms soft=1024 hard=524288 — the soft limit is the bottleneck. They add the four-line Python resource.setrlimit snippet to their training script's __main__ block and redeploy. The job completes without further file descriptor errors, and resource.getrlimit(resource.RLIMIT_NOFILE) now reports soft=524288 hard=524288.
Frequently Asked Questions
How do I know if my application is hitting the file descriptor limit?
Look for Too many open files errors in your application logs, or EMFILE / errno 24 in stack traces. These errors typically surface in I/O-heavy workloads — data loaders, web servers handling many connections, or applications that open many temporary files.
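To catch the problem before it surfaces as a crash, an application can compare its open descriptor count with the soft limit and recognize EMFILE explicitly. This is a minimal sketch, assuming a Linux container where /proc/self/fd is available (with /dev/fd as a fallback elsewhere); the helper names `is_fd_exhaustion` and `fd_usage` are hypothetical.

```python
import errno
import os
import resource


def is_fd_exhaustion(exc: BaseException) -> bool:
    """True if `exc` is an OSError carrying EMFILE ("Too many open files")."""
    return isinstance(exc, OSError) and exc.errno == errno.EMFILE


def fd_usage() -> tuple[int, int]:
    """Return (open descriptor count, soft limit) for the calling process."""
    # /proc/self/fd is Linux-specific; /dev/fd is a fallback on other Unix systems.
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    # The count may include the descriptor os.listdir opens to read the directory.
    open_count = len(os.listdir(fd_dir))
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return open_count, soft


if __name__ == "__main__":
    used, soft = fd_usage()
    print(f"open descriptors: {used} of soft limit {soft}")
```

A count that climbs toward 1024 under load is a strong hint that the soft limit, rather than a descriptor leak alone, is what the workload is running into.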
Will my changes persist across container restarts if I use Option A?
No. The process-level fix applies only to the process that calls it and to any child processes it starts afterward, which inherit its limits. If your container restarts, the new process starts with the soft limit back at 1024. Include the setrlimit call in your application's startup path so it runs on every launch. If you need a persistent fix that applies to all containers on a node regardless of application code, use Option B.
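Resource limits are inherited by child processes, so where the startup call is placed matters for multi-process workloads: workers started after the call see the raised limit, and workers started before it do not. The following sketch, assuming Python 3.9+ and a hypothetical helper name `child_soft_limit`, spawns a child interpreter and reports the soft limit it observes.

```python
import resource
import subprocess
import sys


def child_soft_limit() -> int:
    """Spawn a child Python process and return the soft RLIMIT_NOFILE it sees."""
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"parent soft={soft}, child soft={child_soft_limit()}")
```

Running this before and after the setrlimit call in a startup script is a quick way to confirm that worker processes will pick up the new value.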