November 8, 2024
Overview
S3cmd is a tool for managing objects in Amazon S3 storage. CloudBucket is an S3-compatible object store, and this document is a guide to using s3cmd with your CloudBucket account.
You can get more information about CloudBucket.
Requirements
Basic Knowledge
s3cmd is a Linux command-line tool, so you will need to be familiar with:
- Linux shell
- Scientific Computing (SciC) Linux Clusters
CloudBucket Account
If you don’t have a CloudBucket account, visit the CloudBucket registration page to request a Pilot Service.
In order to replace the generic values in this guide with your own account information, you will need:
- The name of your assigned bucket
- CloudBucket account Access Key
- CloudBucket account Secret Key
SciC Linux Clusters account
This guide demonstrates how to access your CloudBucket account from the SciC Linux Clusters. However, CloudBucket access is not limited to the compute cluster: you can access CloudBucket from any computer within the Partners network.
Open an SSH terminal session on SciC Linux Clusters. To learn more about using the cluster, visit the SciC Linux Clusters Quick Start guide and take a look at the different Scientific Computing training programs. You can register for an account at the SciC Linux Clusters sign-up page.
Create a Configuration File
First, we will create a config file, ~/.s3cfg, with your CloudBucket credentials and some other parameters. You'll need your CloudBucket access key and secret key.
For demonstration purposes, in this guide we are using:
- Access Key: u001
- Secret Key: s3%w3kj/(kl6w0lakj9JSMj8&we
[default]
access_key = "u001"
secret_key = "s3%w3kj/(kl6w0lakj9JSMj8&we"
host_base = erisecsrr2.partners.org
host_bucket = erisecsrr2.partners.org
enable_multipart = False
signature_v2 = True
recv_chunk = 262144
send_chunk = 262144
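Because this file holds your secret key, it is worth creating it so that only your own user can read it. The following is a minimal sketch of one way to do that, not an official procedure: write_s3cfg is a hypothetical helper (it is not part of s3cmd), and the values are the demonstration values from this guide, which you would replace with your own.

```shell
# write_s3cfg is a hypothetical helper, not part of s3cmd.
# It writes the demonstration config to the given path and
# refuses to overwrite a file that already exists.
write_s3cfg() {
    cfg="$1"
    if [ -e "$cfg" ]; then
        echo "refusing to overwrite existing file: $cfg" >&2
        return 1
    fi
    # Run in a subshell so the restrictive umask (owner-only access)
    # applies to the new file without changing the caller's umask.
    (
        umask 077
        # The quoted 'EOF' stops the shell from interpreting
        # characters such as % ( & in the secret key.
        cat > "$cfg" <<'EOF'
[default]
access_key = "u001"
secret_key = "s3%w3kj/(kl6w0lakj9JSMj8&we"
host_base = erisecsrr2.partners.org
host_bucket = erisecsrr2.partners.org
enable_multipart = False
signature_v2 = True
recv_chunk = 262144
send_chunk = 262144
EOF
    )
}
```

Assuming the default config location, you would call it as write_s3cfg "$HOME/.s3cfg" and then edit the two key values.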
Basic Operations
There are two basic operations that you can perform on an object store: PUT and GET.
How to PUT an Object
You can create an object by uploading a copy of an existing local file:
s3cmd put FILE s3://BUCKET
Substitute FILE and BUCKET with the real file and bucket names.
In this example, we are using "local_file" as the file name and "bucket01" as the bucket name.
s3cmd put local_file s3://bucket01
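When you have several files to upload, the same put command can be wrapped in a loop that reports which uploads failed. This is only a sketch: upload_all is a hypothetical helper name, and it assumes s3cmd is on your PATH and configured as described above.

```shell
# upload_all is a hypothetical helper, not part of s3cmd.
# Usage: upload_all BUCKET FILE [FILE...]
upload_all() {
    bucket="$1"
    shift
    failed=0
    for file in "$@"; do
        # Only attempt the upload if the local file exists.
        if [ -f "$file" ] && s3cmd put "$file" "s3://$bucket"; then
            echo "uploaded: $file"
        else
            echo "FAILED: $file" >&2
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every file was uploaded.
    [ "$failed" -eq 0 ]
}
```

For example, upload_all bucket01 local_file another_file uploads both files and returns a non-zero status if either upload fails.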
How to GET an Object
You can get the file that you just uploaded.
s3cmd get s3://bucket01/local_file
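If you want to confirm that an upload arrived intact, one approach is to download the object to a temporary file and compare it byte for byte with the original. The sketch below assumes the object was stored under the file's base name, as in the put example; verify_download is a hypothetical helper, not an s3cmd feature.

```shell
# verify_download is a hypothetical helper, not part of s3cmd.
# Usage: verify_download LOCAL_FILE BUCKET
verify_download() {
    original="$1"
    bucket="$2"
    copy="$(mktemp)" || return 1
    # --force lets s3cmd overwrite the empty file that mktemp created.
    if s3cmd get --force "s3://$bucket/$(basename "$original")" "$copy" \
        && cmp -s "$original" "$copy"; then
        status=0
    else
        status=1
    fi
    rm -f "$copy"
    return "$status"
}
```

For example, verify_download local_file bucket01 returns 0 when the stored object matches the local file and 1 otherwise.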
Listing Objects in a Bucket
You can also list the content of the bucket.
s3cmd ls s3://bucket01/
Be aware that listing the objects in a bucket is an expensive operation, and it takes longer as the number of objects in your bucket increases.
Keeping an external index with all the objects in your bucket is a good practice to get the best access performance.
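One simple way to keep such an index is to append a line to a local file after every successful upload, and search that file instead of listing the bucket. The following is a sketch under assumptions: put_and_index, find_in_index, and the INDEX_FILE location are all hypothetical names chosen for illustration, and the index only knows about objects uploaded through this helper.

```shell
# Hypothetical index location; any path you control works.
INDEX_FILE="${INDEX_FILE:-$HOME/cloudbucket_index.tsv}"

# put_and_index is a hypothetical helper, not part of s3cmd.
# It uploads FILE and records timestamp, object URL, and size in bytes.
put_and_index() {
    file="$1"
    bucket="$2"
    # Record the object only if the upload succeeded.
    s3cmd put "$file" "s3://$bucket" || return 1
    size=$(wc -c < "$file" | tr -d ' ')
    printf '%s\t%s\t%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
        "s3://$bucket/$(basename "$file")" \
        "$size" >> "$INDEX_FILE"
}

# Search the local index for a name instead of running "s3cmd ls".
find_in_index() {
    grep -F -- "$1" "$INDEX_FILE"
}
```

For example, after put_and_index local_file bucket01, running find_in_index local_file prints the recorded line without contacting the object store.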
Encrypting objects stored in CloudBucket
For any sensitive data, it is good practice to encrypt files before they are sent to the CloudBucket store. The process for using CloudBucket with s3cmd does not change if you enable encryption, although file transfers may be slightly slower than without encryption. To enable encryption, generate an encryption passphrase and record it in a secure password manager. Then add the following lines to the file ~/.s3cfg:
encrypt = True
gpg_command = /usr/bin/gpg
gpg_decrypt = %(gpg_command)s -d --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_encrypt = %(gpg_command)s -c --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_passphrase = PUT_YOUR_SECURE_PASSPHRASE_HERE
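This guide does not prescribe how to generate the passphrase. One possible approach, shown here as a sketch, is to base64-encode 32 random bytes from /dev/urandom. generate_passphrase is a hypothetical helper name, and your password manager's own generator is an equally reasonable choice.

```shell
# generate_passphrase is a hypothetical helper, not part of s3cmd or gpg.
# It prints a 44-character printable passphrase built from
# 32 bytes of kernel randomness.
generate_passphrase() {
    head -c 32 /dev/urandom | base64
}
```

Run generate_passphrase once, save the output in your password manager, and use it as the gpg_passphrase value. Since the passphrase is stored in plain text in ~/.s3cfg, make sure that file is readable only by you, for example with chmod 600 ~/.s3cfg.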