CloudBucket storage access using s3cmd

Overview

S3cmd is a tool for managing objects in Amazon S3 storage. CloudBucket is an S3-compatible object store, and this document is a guide to using s3cmd with your CloudBucket account.

You can get more information about CloudBucket.

Requirements

These are the basic requirements you will need in order to follow the instructions.

Basic Knowledge

s3cmd is a Linux command-line tool, so you will need to be familiar with:

  • Linux shell
  • Scientific Computing (SciC) Linux Clusters

CloudBucket Account

If you don’t have a CloudBucket account, visit the CloudBucket registration page to request a Pilot Service.

In order to replace the generic values in this guide with your own account information, you will need:

  • The name of your assigned bucket
  • CloudBucket account Access Key
  • CloudBucket account Secret Key

SciC Linux Clusters account

This guide demonstrates how to access your CloudBucket account from the SciC Linux Clusters. However, CloudBucket access is not limited to the compute cluster: you can access CloudBucket from any computer within the Partners network.

Open an SSH terminal session on the SciC Linux Clusters. To learn more about using the cluster, visit the SciC Linux Clusters Quick Start guide and take a look at the different Scientific Computing training programs. You can register for an account at the SciC Linux Clusters sign-up page.

Create a Configuration File 

First, we will create a config file with your CloudBucket credentials and some other parameters. You'll need your CloudBucket access key and secret key.

For demonstration purposes, in this guide we are using:

  • Access Key: u001
  • Secret Key: s3%w3kj/(kl6w0lakj9JSMj8&we
Edit the file ~/.s3cfg, and change or add the following parameters (note that values in the s3cmd config file are not quoted):

[default]
access_key = u001
secret_key = s3%w3kj/(kl6w0lakj9JSMj8&we
host_base = erisecsrr2.partners.org
host_bucket = erisecsrr2.partners.org
enable_multipart = False
signature_v2 = True
recv_chunk = 262144
send_chunk = 262144
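Because the config file now contains your secret key, it is worth restricting its permissions so that only your user can read it. A minimal sketch (~/.s3cfg is the default s3cmd config path; the touch is a no-op if you already created the file):

```shell
# Create ~/.s3cfg if it does not exist yet (no-op if you already created it)
touch ~/.s3cfg

# The file holds your secret key; make it readable by your user only
chmod 600 ~/.s3cfg

# Verify the permissions (should print 600)
stat -c '%a' ~/.s3cfg
```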

 

Basic Operations 

There are two basic operations that you can perform on an object store: PUT (upload an object) and GET (download an object).

How to PUT an Object 

You can create an object by uploading an existing local file, this way:

s3cmd put FILE s3://BUCKET

Substitute FILE and BUCKET with your real file and bucket names.

In this example, we are using "local_file" as the file name and "bucket01" as the bucket name. 

 s3cmd put local_file s3://bucket01

How to GET an Object

You can download the file that you just uploaded:

s3cmd get s3://bucket01/local_file
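If you want to save the downloaded object under a different local name, s3cmd get accepts an optional local destination argument (the names below are placeholders from this guide's example):

```shell
# Download the object and save it locally as "local_copy"
s3cmd get s3://bucket01/local_file local_copy
```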

Listing Objects in a Bucket

You can also list the contents of the bucket:

s3cmd ls s3://bucket01/

Please be aware that listing the objects in a bucket is an expensive operation, and it will take longer as the number of objects in your bucket grows.

Keeping an external index of all the objects in your bucket is a good practice to get the best access performance.
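One simple way to keep such an index, sketched below, is to record each object key in a local file as you upload, and consult that file instead of listing the bucket. The bucket name and index file name are placeholders, not part of the CloudBucket service:

```shell
# After each upload (s3cmd put local_file s3://bucket01), record the
# object key in a local index file
echo "s3://bucket01/local_file" >> bucket01.index

# Look up an object in the local index instead of listing the bucket,
# which stays fast no matter how many objects the bucket holds
grep -F "local_file" bucket01.index
```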

Encrypting objects stored in CloudBucket

For any sensitive data, it is good practice to encrypt files before they are sent to the CloudBucket store. The process for using CloudBucket with s3cmd does not change if you enable encryption, although file transfers may be slightly slower than without encryption. To enable encryption, generate an encryption passphrase and record it in a secure password manager. Then add the following lines to the file ~/.s3cfg:

encrypt = True
gpg_command = /usr/bin/gpg
gpg_decrypt = %(gpg_command)s -d --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_encrypt = %(gpg_command)s -c --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s
gpg_passphrase = PUT_YOUR_SECURE_PASSPHRASE_HERE
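One possible way to generate a strong random passphrase is shown below; this assumes the openssl command is available on your system, and the output should go into your password manager before being pasted into ~/.s3cfg:

```shell
# Generate 24 random bytes and print them base64-encoded,
# suitable for use as the gpg_passphrase value
openssl rand -base64 24
```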
Go to KB0028052 in the IS Service Desk
