How-To: Use Crusoe Object Storage (S3-Compatible API)

Akram Boudhraa

Introduction

Crusoe Cloud Object Storage is a high-performance, S3-compatible object store designed for AI and ML workloads. It lets you store datasets, model checkpoints, training artifacts, and other unstructured data using the S3 API, tools, and libraries you already use, such as boto3, rclone, and s3cmd. Existing code needs no changes beyond pointing the client at the Crusoe endpoint and using path-style access.

This guide walks you through creating Object Storage API credentials, creating a bucket, and uploading and downloading data using three common tools: boto3 (the AWS SDK for Python), rclone, and s3cmd.

💡 Limited Availability:  Object Storage is currently in Limited Availability. Contact your Crusoe account team or support to enable it for your organization.

Prerequisites

  • Active Crusoe Cloud account with Object Storage enabled
  • Crusoe CLI installed and configured (run: crusoe config init)
  • A Crusoe VM running in the same location as your bucket — Object Storage is a regional resource and is only accessible from VMs in the same location
  • Python 3 with boto3, rclone, or s3cmd installed depending on which tool you are using
  • Your project must be using NFS for Shared Disks (not legacy virtiofs). If you see an error about virtiofs, complete the virtiofs to NFS migration first

Key Concepts Before You Start

Crusoe Object Storage differs from AWS S3 in a few ways that affect how you configure clients:

  • Endpoint format: https://object.<location>.crusoecloudcompute.com — for example, https://object.us-east1-a.crusoecloudcompute.com
  • Path-style URLs only: Virtual hosted-style URLs are not supported. Every S3 client must be configured to use path-style access.
  • Separate API keys: Object Storage uses dedicated API keys (access key + secret key) that are separate from your Crusoe Cloud API tokens. Each account can have a maximum of 2 keys.
  • Buckets are managed via CLI/Console only: You cannot create or delete buckets using S3 clients — only the Crusoe CLI or Console can do this.
  • Pricing: $0.06 per GiB per month, billed on average hourly usage.
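
To make the endpoint format and the path-style rule concrete, here is a minimal Python sketch that builds an object URL by hand. The helper names endpoint_url and path_style_url are illustrative only and are not part of any Crusoe SDK; a configured S3 client does this for you.

```python
from urllib.parse import quote


def endpoint_url(location: str) -> str:
    """Return the regional endpoint, following the format described above."""
    return f"https://object.{location}.crusoecloudcompute.com"


def path_style_url(location: str, bucket: str, key: str) -> str:
    """Path-style: the bucket is the first path segment, not part of the hostname."""
    # quote() percent-encodes unsafe characters but keeps "/" so key prefixes survive.
    return f"{endpoint_url(location)}/{bucket}/{quote(key)}"


print(path_style_url("us-east1-a", "my-training-data", "datasets/dataset.tar"))
# https://object.us-east1-a.crusoecloudcompute.com/my-training-data/datasets/dataset.tar
```

A virtual hosted-style URL would instead place the bucket name in the hostname, which is the form that is not supported.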

Instructions

Create an Object Storage API Key

  • Object Storage uses its own dedicated API keys, separate from your Crusoe Cloud API tokens. Generate one via the CLI:
crusoe storage tokens create --alias my-training-key
  • The command returns an Access Key ID and a Secret Key. Save both immediately:
Access Key ID:  CKIAXXXXXXXXXXXXXXXX
Secret Key:     SKXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
Alias:          my-training-key

⚠️ Critical:  The secret key is shown only once and cannot be retrieved again. Store it securely immediately — in a password manager, environment variable, or a secrets manager. If you lose it, you must create a new key.

  • You can also create a key via the Console: navigate to your organization > Security > Object Storage Keys > Create Object Storage Key

Create a Bucket

  • Buckets are created using the Crusoe CLI or Console — S3 client tools cannot create or delete them. Create a bucket in the same location as your VM:
crusoe storage buckets create \
  --name my-training-data \
  --location us-east1-a
  • Bucket names must be unique across all Crusoe projects within a region. They must be 3-63 characters long, contain only lowercase letters, numbers, and hyphens, and start and end with a letter or number
  • List your buckets to confirm it was created: crusoe storage buckets list
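
If you generate bucket names in scripts, it can help to check the naming rules locally before calling the CLI. The following is a minimal sketch of the syntax rules listed above; is_valid_bucket_name is a hypothetical helper, and it cannot check uniqueness, which only the service can verify.

```python
import re

# 3-63 characters: the first and last are a lowercase letter or digit,
# and the 1-61 characters in between may also include hyphens.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Check the syntax rules only; uniqueness is enforced by the service."""
    return _BUCKET_NAME.fullmatch(name) is not None


for candidate in ["my-training-data", "My-Bucket", "ab", "-leading-hyphen"]:
    print(candidate, is_valid_bucket_name(candidate))
```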

💡 Tip:  Create your bucket in the same location as your compute (e.g., both in us-east1-a). Object Storage is regional — buckets in one location are not accessible from VMs in another.

Configure Your S3 Client

Choose the client that fits your workflow. All three options below work with Crusoe Object Storage.

Option A — boto3 (Python)

  • Install boto3 if not already present:
pip install boto3
  • Create an S3 client pointing to the Crusoe endpoint. No region_name is required — if your code requires one, use any placeholder value:
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://object.us-east1-a.crusoecloudcompute.com",
    aws_access_key_id="YOUR_ACCESS_KEY",
    aws_secret_access_key="YOUR_SECRET_KEY",
)

💡 Best practice:  Store credentials in environment variables rather than hardcoding them: export CRUSOE_S3_ACCESS_KEY=... and export CRUSOE_S3_SECRET_KEY=..., then read them with os.getenv() in your script.
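
As a sketch of that practice, the client can be built from the environment variables named above. The function names here are illustrative. The addressing_style setting is botocore's option for requesting path-style URLs explicitly rather than relying on client defaults, and it is shown as a precaution, not as a documented Crusoe requirement for boto3.

```python
import os


def crusoe_s3_settings(location, env=os.environ):
    """Collect boto3.client() keyword arguments from environment variables."""
    return {
        "endpoint_url": f"https://object.{location}.crusoecloudcompute.com",
        # A KeyError here means the variable was not exported.
        "aws_access_key_id": env["CRUSOE_S3_ACCESS_KEY"],
        "aws_secret_access_key": env["CRUSOE_S3_SECRET_KEY"],
    }


def make_client(location):
    # Imported here so the settings helper above works without boto3 installed.
    import boto3
    from botocore.config import Config

    # Ask for path-style addressing, since virtual hosted-style URLs
    # are not supported.
    return boto3.client(
        "s3",
        config=Config(s3={"addressing_style": "path"}),
        **crusoe_s3_settings(location),
    )


# Usage: s3 = make_client("us-east1-a")
```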

Option B — rclone

  • Install rclone:
sudo apt-get install rclone
  • Add the following to ~/.config/rclone/rclone.conf (or run rclone config to configure interactively):
[crusoe]
type = s3
provider = Other
access_key_id = YOUR_ACCESS_KEY
secret_access_key = YOUR_SECRET_KEY
endpoint = https://object.us-east1-a.crusoecloudcompute.com
acl = private
force_path_style = true
  • The force_path_style = true setting is required — Crusoe Object Storage uses path-style URLs only

Option C — s3cmd

  • Install s3cmd:
sudo apt-get install s3cmd
  • Create or edit ~/.s3cfg with the following:
[default]
access_key = YOUR_ACCESS_KEY
secret_key = YOUR_SECRET_KEY
host_base = object.us-east1-a.crusoecloudcompute.com
host_bucket = object.us-east1-a.crusoecloudcompute.com
use_https = True
signature_v2 = False
  • Set host_bucket to the same value as host_base with no %(bucket)s prefix — this is required for path-style access
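
If you provision VMs from a script, the same file can be rendered programmatically so that host_base and host_bucket never drift apart. This is a minimal sketch using Python's standard configparser module; render_s3cfg is a hypothetical helper, and the keys it writes are exactly the ones shown above.

```python
import configparser
import io


def render_s3cfg(location: str, access_key: str, secret_key: str) -> str:
    """Return the text of an s3cmd config file for one Crusoe location."""
    host = f"object.{location}.crusoecloudcompute.com"
    # interpolation=None stops configparser from treating "%" specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser["default"] = {
        "access_key": access_key,
        "secret_key": secret_key,
        "host_base": host,
        "host_bucket": host,  # same as host_base, as path-style access requires
        "use_https": "True",
        "signature_v2": "False",
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


print(render_s3cfg("us-east1-a", "YOUR_ACCESS_KEY", "YOUR_SECRET_KEY"))
```

Write the result to ~/.s3cfg and restrict the file's permissions, since it contains the secret key.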

Upload Data

boto3

# Upload a single file
s3.upload_file("dataset.tar", "my-training-data", "datasets/dataset.tar")

# Upload a large file with multipart (recommended for files > 64 MB)
from boto3.s3.transfer import TransferConfig

config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,   # 64 MB
    multipart_chunksize=64 * 1024 * 1024,
    max_concurrency=10,
)
s3.upload_file("large-dataset.tar", "my-training-data", "datasets/large-dataset.tar", Config=config)
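
For long uploads it can be useful to see progress. boto3's upload_file accepts a Callback argument that is called with the number of bytes transferred in each chunk. The class below is a minimal sketch modeled on that mechanism; UploadProgress is an illustrative name, not part of boto3.

```python
import os
import sys
import threading


class UploadProgress:
    """Callable that accumulates the byte counts boto3 reports during a transfer."""

    def __init__(self, total_bytes: int):
        self.total_bytes = total_bytes
        self.seen_bytes = 0
        self._lock = threading.Lock()  # callbacks may arrive from several threads

    def __call__(self, bytes_amount: int) -> None:
        with self._lock:
            self.seen_bytes += bytes_amount
            percent = 100.0 * self.seen_bytes / self.total_bytes if self.total_bytes else 100.0
            sys.stdout.write(f"\r{self.seen_bytes} / {self.total_bytes} bytes ({percent:.1f}%)")
            sys.stdout.flush()


# Usage with the s3 client and TransferConfig defined above:
# progress = UploadProgress(os.path.getsize("large-dataset.tar"))
# s3.upload_file("large-dataset.tar", "my-training-data",
#                "datasets/large-dataset.tar", Config=config, Callback=progress)
```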

rclone

# Upload a single file
rclone copy ./dataset.tar crusoe:my-training-data/datasets/

# Upload a full directory with parallel transfers
rclone copy ./dataset/ crusoe:my-training-data/datasets/ \
  --transfers 16 \
  --checkers 8 \
  --s3-upload-concurrency 4

s3cmd

# Upload a file
s3cmd put dataset.tar s3://my-training-data/datasets/

# Upload a directory recursively
s3cmd put --recursive ./dataset/ s3://my-training-data/datasets/

# Upload large file with explicit multipart chunk size
s3cmd put --multipart-chunk-size-mb=64 large-dataset.tar s3://my-training-data/

Download Data

boto3

# Download a single file
s3.download_file("my-training-data", "datasets/dataset.tar", "./dataset.tar")

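# List all objects in the bucket
# (a single list_objects_v2 call returns at most 1,000 keys, so paginate)
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket="my-training-data"):
    for obj in page.get("Contents", []):
        print(obj["Key"], obj["Size"])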
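
Listings are often used to check how much data sits under a prefix. The sketch below totals object counts and sizes across list_objects_v2 response pages; summarize_pages is a hypothetical helper that accepts any iterable of response dictionaries, such as the pages produced by a boto3 paginator.

```python
from typing import Iterable, Tuple


def summarize_pages(pages: Iterable[dict]) -> Tuple[int, int]:
    """Return (object_count, total_bytes) across list_objects_v2 response pages."""
    count = 0
    total = 0
    for page in pages:
        # "Contents" is absent from a page when no keys match.
        for obj in page.get("Contents", []):
            count += 1
            total += obj["Size"]
    return count, total


# Usage with the s3 client defined above:
# pages = s3.get_paginator("list_objects_v2").paginate(
#     Bucket="my-training-data", Prefix="datasets/")
# count, total_bytes = summarize_pages(pages)
```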

rclone

# Download a single file
rclone copy crusoe:my-training-data/datasets/dataset.tar ./

# Download an entire directory
rclone copy crusoe:my-training-data/datasets/ ./local-datasets/

# Sync (mirror) a bucket path to a local directory
rclone sync crusoe:my-training-data/datasets/ ./local-datasets/

s3cmd

# Download a file
s3cmd get s3://my-training-data/datasets/dataset.tar ./

# Download a directory recursively
s3cmd get --recursive s3://my-training-data/datasets/ ./local-datasets/

# List objects in the bucket
s3cmd ls s3://my-training-data

Example

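The following Python script, saved here as checkpoint_manager.py, uploads a model checkpoint to Object Storage after training and then downloads it again. In a real pipeline the download step typically runs on a second VM for evaluation; both steps appear in one script here for brevity. This pattern is common in distributed training pipelines.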

import os
import boto3
from boto3.s3.transfer import TransferConfig

# Read credentials from environment variables
s3 = boto3.client(
    "s3",
    endpoint_url="https://object.us-east1-a.crusoecloudcompute.com",
    aws_access_key_id=os.environ["CRUSOE_S3_ACCESS_KEY"],
    aws_secret_access_key=os.environ["CRUSOE_S3_SECRET_KEY"],
)

BUCKET = "my-training-data"

# ── Upload checkpoint after training ──────────────────────────────────────
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,
    multipart_chunksize=64 * 1024 * 1024,
    max_concurrency=10,
)

print("Uploading checkpoint...")
s3.upload_file(
    "./checkpoints/epoch-100.pt",
    BUCKET,
    "checkpoints/epoch-100.pt",
    Config=config,
)
print("Upload complete.")

# ── Download checkpoint for evaluation ────────────────────────────────────
print("Downloading checkpoint for eval...")
s3.download_file(BUCKET, "checkpoints/epoch-100.pt", "./epoch-100.pt")
print("Download complete.")

Set credentials before running:

export CRUSOE_S3_ACCESS_KEY=CKIAXXXXXXXXXXXXXXXX
export CRUSOE_S3_SECRET_KEY=SKXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
python3 checkpoint_manager.py
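
Because the upload and the download usually happen on different VMs, you may want to confirm that the downloaded checkpoint matches the original. One simple approach, sketched here as an optional extra and not as a feature of Object Storage, is to compare SHA-256 digests computed on each side. The helper name sha256_of is illustrative.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex SHA-256 digest of a file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large checkpoints never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# On the training VM, before upload:
#   expected = sha256_of("./checkpoints/epoch-100.pt")
# On the evaluation VM, after download:
#   assert sha256_of("./epoch-100.pt") == expected
```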

Troubleshooting

403 Access Denied

Check that your access key and secret key are correct. Confirm that the endpoint URL matches the location of your bucket. Ensure path-style access is configured: host_bucket set to the same value as host_base in s3cmd, and force_path_style = true in rclone.

Cannot create or delete buckets via S3 client

This is expected. Use the Crusoe CLI or Console to manage buckets: crusoe storage buckets create / crusoe storage buckets delete.

Virtiofs conflict error when enabling Object Storage

Your project is using the legacy virtiofs backend for Shared Disks. Complete the virtiofs-to-NFS migration first, then try again.

Bucket name already exists

Bucket names must be unique across all Crusoe projects in a region, not just within your own project. Add your org or project name to the bucket name to reduce the chance of a collision.

Slow upload/download speed

Use multipart uploads for files over 64 MB. Increase concurrency: max_concurrency in boto3, --transfers in rclone. Use 64 MB+ chunk sizes where possible.
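
To reason about chunk size and concurrency, it helps to know how many parts a file splits into. The following is a small arithmetic sketch, assuming fixed-size parts as set by multipart_chunksize; part_count is a hypothetical helper. Note that 64 * 1024 * 1024 bytes is 64 MiB.

```python
MIB = 1024 * 1024


def part_count(file_size_bytes: int, chunk_size_bytes: int = 64 * MIB) -> int:
    """Number of fixed-size parts needed to cover a file (at least one)."""
    # Integer ceiling division avoids floating-point rounding on large files.
    parts = (file_size_bytes + chunk_size_bytes - 1) // chunk_size_bytes
    return max(1, parts)


print(part_count(10 * 1024 * MIB))  # 10 GiB file in 64 MiB parts -> 160
```

With max_concurrency=10, at most 10 of those parts are in flight at once, so raising concurrency only helps when a file has more parts than the current limit.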

Secret key lost or not saved

Delete the old key with crusoe storage tokens delete <access-key-id>. Create a new key with crusoe storage tokens create --alias new-key. Update all client configs with the new credentials.

Connection timed out

Verify your VM is in the same location as the bucket. Check the endpoint URL format: https://object.<location>.crusoecloudcompute.com. Object Storage is not reachable from the public internet.
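
A quick way to separate network problems from credential problems is to test whether a TCP connection to the endpoint opens at all. This is a minimal diagnostic sketch using only the Python standard library; the function names are illustrative, and a successful connection says nothing about whether your keys are valid.

```python
import socket
from urllib.parse import urlsplit


def host_and_port(endpoint: str) -> tuple:
    """Split an endpoint URL into (hostname, port)."""
    parts = urlsplit(endpoint)
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def can_connect(endpoint: str, timeout: float = 5.0) -> bool:
    """Try a plain TCP connection; True means the network path is open."""
    host, port = host_and_port(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers timeouts, refused connections, and DNS failures
        return False


# Usage, run from the VM:
# print(can_connect("https://object.us-east1-a.crusoecloudcompute.com"))
```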

InvalidRequest or NotImplemented error

You are using an unsupported S3 feature. SSE (server-side encryption), ACLs, cross-region replication, lifecycle policies, and event notifications are not currently supported.
