Managing Amazon S3 Online Storage with S3sync

After trying to use Amazon’s S3 web service to back up files and to get a reliable download area for R functions and stuff which is not allowed to be uploaded to I ended up with some experimental “buckets” (= S3 online directories) and some 100 MB of files in them.

It turned out that it is not possible to delete a non-empty bucket from S3, so one is required to recurse into the directories and delete all files one by one!

Eric Cheng and other bloggers who turned up in a Google search pointed out S3sync as a suitable tool to remove a non-empty bucket.

So first one has to get Ruby and then also the OpenSSL interface for Ruby:

sudo aptitude install ruby libopenssl-ruby

Then download s3sync (to your /home/yourself folder in this case) and unpack it:

cd $HOME/
wget
tar xvzf s3sync.tar.gz
rm s3sync.tar.gz

This creates an s3sync folder containing the Ruby code.

The package ca-certificates includes PEM files of CA certificates that allow SSL-based applications to check the authenticity of SSL connections. It is needed to secure the S3 connection via SSL and is part of the default Ubuntu installation (at least it is included in my Xubuntu Karmic Koala). If not:

sudo aptitude install ca-certificates
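
A quick way to see whether the certificates are really there is to look for PEM files in the certificate directory. This is only a small sketch: the function name has_pem_certs is made up for this example, and it assumes the certificates live in /etc/ssl/certs, the directory used in the configuration file later in this post.

```shell
#!/bin/sh
# has_pem_certs DIR: succeed if DIR exists and contains at least one *.pem file.
# (Hypothetical helper written for this post, not part of s3sync.)
has_pem_certs() {
  dir=$1
  [ -d "$dir" ] || return 1
  for f in "$dir"/*.pem; do
    # If the glob matched nothing, "$f" is the literal pattern and does not exist.
    [ -e "$f" ] && return 0
  done
  return 1
}

if has_pem_certs /etc/ssl/certs; then
  echo "CA certificates found in /etc/ssl/certs"
else
  echo "No PEM files in /etc/ssl/certs, install ca-certificates first"
fi
```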

Before using s3sync get your access key and secret access key from Amazon. They have to be included in a file “s3config.yml” which is located in your home folder inside the directory “.s3conf”, which has to be created first:

mkdir $HOME/.s3conf

Open your favorite text editor and create a plain text file called s3config.yml inside the “.s3conf” folder which contains:

aws_access_key_id: YourS3AccessKeyFromAmazon
aws_secret_access_key: YourS3SecretAccessKeyFromAmazon
SSL_CERT_DIR: /etc/ssl/certs

Prevent others from reading the configuration file containing your confidential access codes with:

chmod 700 $HOME/.s3conf/s3config.yml
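
If you prefer to do the configuration steps in one go, they can be sketched as a small script. The variable names CONF_DIR and CONF_FILE are my own choice for this example, and the keys are the same placeholders as above. Note that the script overwrites an existing s3config.yml. It sets mode 600 instead of 700, since a file that is only read does not need the execute bit.

```shell
#!/bin/sh
# Sketch: create the s3sync configuration file with restrictive permissions.
# CONF_DIR is a name chosen for this example; it defaults to the folder used in this post.
CONF_DIR="${CONF_DIR:-$HOME/.s3conf}"
CONF_FILE="$CONF_DIR/s3config.yml"

# Create the directory and make it accessible to the owner only.
mkdir -p "$CONF_DIR"
chmod 700 "$CONF_DIR"

# umask 077 inside the subshell makes a newly created file private from the start.
(
  umask 077
  cat > "$CONF_FILE" <<'EOF'
aws_access_key_id: YourS3AccessKeyFromAmazon
aws_secret_access_key: YourS3SecretAccessKeyFromAmazon
SSL_CERT_DIR: /etc/ssl/certs
EOF
)

# 600 = read and write for the owner, nothing for group and others.
chmod 600 "$CONF_FILE"
ls -l "$CONF_FILE"
```

Replace the two placeholder values with your real keys afterwards.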

Now you can start to use s3sync and s3cmd to manipulate your S3 storage space, e.g. with:

ruby $HOME/s3sync/s3cmd.rb listbuckets
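
Getting rid of a non-empty bucket, the original goal, could then look roughly like the sketch below. It assumes that s3cmd.rb sits in the s3sync folder unpacked above and that it offers the subcommands deleteall and deletebucket, so check the README in that folder before relying on it. The names S3CMD, DRY_RUN, run and remove_bucket as well as the bucket name are invented for this example. By default the script only prints the commands; set DRY_RUN=0 to really execute them.

```shell
#!/bin/sh
# Sketch: empty a bucket and then delete it with s3cmd.rb.
# S3CMD and DRY_RUN are names chosen for this example.
S3CMD="${S3CMD:-$HOME/s3sync/s3cmd.rb}"
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, execute it otherwise.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$@"
  else
    "$@"
  fi
}

# remove_bucket NAME: delete all keys in the bucket first, then the bucket itself.
# Assumption: deleteall and deletebucket exist in your copy of s3cmd.rb.
remove_bucket() {
  bucket=$1
  run ruby "$S3CMD" deleteall "$bucket" &&
  run ruby "$S3CMD" deletebucket "$bucket"
}

# my-experimental-bucket is a placeholder bucket name.
remove_bucket my-experimental-bucket
```

The dry run is there on purpose: deleting is not reversible, so it is worth reading the printed commands once before switching it off.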

This was the first time I managed to successfully manipulate my S3 account. OK, Jungle Disk under Mac OS X worked, but it is proprietary, though not expensive.

John Eberly’s blog was an inspiration to get started. Follow the link to his excellent blog post.