Managing Amazon S3 Online Storage with S3sync

After trying to use Amazon’s S3 web service to back up files and to get a reliable download area for R functions and other material that cannot be uploaded to wordpress.com, I ended up with some experimental “buckets” (a bucket is S3’s top-level online directory) and about 100 MB of files in them.

It turned out that S3 does not allow deleting a non-empty bucket, so one is required to recurse into the directories and delete all files one by one!

Eric Cheng and other blogs that turned up in a Google search pointed to S3sync as a suitable tool for removing a non-empty bucket.

So first one has to get Ruby and also the OpenSSL interface for Ruby:

sudo aptitude install ruby libopenssl-ruby
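
Before going on, it can help to confirm that Ruby can actually load its OpenSSL binding. The following is only a small sketch: the function name check_ruby_openssl is made up for this example, and it relies solely on Ruby's standard -r (require) and -e switches.

```shell
#!/bin/sh
# check_ruby_openssl is a hypothetical helper name, not part of s3sync.
# It reports whether Ruby is installed and can load the OpenSSL binding.
check_ruby_openssl() {
    if ! command -v ruby >/dev/null 2>&1; then
        echo "ruby: not found"
        return 1
    fi
    # -ropenssl makes Ruby require the OpenSSL library before running -e.
    if ruby -ropenssl -e 'exit 0' >/dev/null 2>&1; then
        echo "ruby with openssl: ok"
    else
        echo "ruby found, openssl binding missing"
        return 1
    fi
}

check_ruby_openssl || echo "install the packages named above and retry"
```

If the second message appears, the OpenSSL package for Ruby is most likely the missing piece.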

Then download s3sync (to your /home/yourself folder in this case) and unpack it:

cd $HOME/
wget http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz
tar xvzf s3sync.tar.gz
rm s3sync.tar.gz

This creates an s3sync folder containing the Ruby code.

The package ca-certificates includes PEM files of CA certificates that allow SSL-based applications to check the authenticity of SSL connections. It is needed to secure the S3 connection via SSL and is part of the default Ubuntu installation (at least it is included in my Xubuntu Karmic Koala). If not:

sudo aptitude install ca-certificates
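
A quick way to see whether the certificates are really there is to look for PEM files in the certificate directory. This is a minimal sketch, assuming the certificates live in /etc/ssl/certs (the directory used in the configuration file of this post); has_pem_certs is a name invented for the example.

```shell
#!/bin/sh
# has_pem_certs DIR: succeed if DIR holds at least one *.pem file.
# The function name is hypothetical and only serves this illustration.
has_pem_certs() {
    dir="$1"
    [ -d "$dir" ] || return 1
    for f in "$dir"/*.pem; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}

if has_pem_certs /etc/ssl/certs; then
    echo "CA certificates found in /etc/ssl/certs"
else
    echo "no PEM files in /etc/ssl/certs; install ca-certificates"
fi
```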

Before using s3sync, get your access key and secret access key from Amazon. Both have to be included in a file “s3config.yml” located in your home folder inside the directory “.s3conf”, which has to be created. So:

mkdir $HOME/.s3conf
to create the directory.

Open your favorite text editor and create a plain text file called s3config.yml inside the “.s3conf” folder which contains:

aws_access_key_id: YourS3AccessKeyFromAmazon
aws_secret_access_key: YourS3SecretAccessKeyFromAmazon
SSL_CERT_DIR: /etc/ssl/certs

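Prevent others from reading the configuration file containing your confidential access codes (mode 600 gives read and write access to the owner only; a plain text file does not need the execute bit):

chmod 600 $HOME/.s3conf/s3config.yml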
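
The configuration steps (create the directory, write the file, restrict its permissions) can be collected in one small shell function. This is a sketch, not part of s3sync: write_s3config is a made-up name, the key values are the placeholders from this post, and the file gets mode 600 because it only needs to be read and written by its owner. An existing file is left alone.

```shell
#!/bin/sh
# write_s3config DIR: create DIR/s3config.yml with placeholder keys.
# The function name is hypothetical; replace the placeholders afterwards.
write_s3config() {
    conf_dir="$1"
    conf_file="$conf_dir/s3config.yml"

    mkdir -p "$conf_dir" || return 1

    # Do not clobber an existing configuration.
    if [ ! -e "$conf_file" ]; then
        # The subshell keeps the restrictive umask local to this write,
        # so the file is never readable by others, not even briefly.
        (
            umask 077
            cat > "$conf_file" <<'EOF'
aws_access_key_id: YourS3AccessKeyFromAmazon
aws_secret_access_key: YourS3SecretAccessKeyFromAmazon
SSL_CERT_DIR: /etc/ssl/certs
EOF
        ) || return 1
    fi

    # Owner-only access to the directory and the file.
    chmod 700 "$conf_dir" && chmod 600 "$conf_file"
}
```

Calling write_s3config "$HOME/.s3conf" and then editing the two key lines should give the same result as the manual steps.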

Now you can start to use s3sync and s3cmd to manipulate your S3 storage space, e.g.:

ruby $HOME/s3sync/s3cmd.rb listbuckets
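
Coming back to the original problem, removing a non-empty bucket then takes two steps: delete all keys in it, then delete the bucket itself. The sketch below assumes the tarball was unpacked into $HOME/s3sync and uses the deleteall and deletebucket subcommands as I understand them from the s3cmd.rb usage text; please check that text (run s3cmd.rb without arguments) before relying on it. The names run, remove_bucket and DRY_RUN are invented for this example, and by default the commands are only printed, not executed.

```shell
#!/bin/sh
# Path assumes the tarball was unpacked into $HOME/s3sync.
S3CMD="${S3CMD:-$HOME/s3sync/s3cmd.rb}"

# With DRY_RUN=1 (the default in this sketch) commands are printed only.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# remove_bucket BUCKET: delete every key in the bucket, then the bucket.
# Subcommand names are assumed from the s3cmd.rb usage text.
remove_bucket() {
    bucket="$1"
    [ -n "$bucket" ] || { echo "usage: remove_bucket BUCKET" >&2; return 1; }
    run ruby "$S3CMD" deleteall "$bucket" &&
    run ruby "$S3CMD" deletebucket "$bucket"
}
```

Calling remove_bucket with a bucket name first shows what would be run; setting DRY_RUN=0 beforehand executes the commands for real, which cannot be undone.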

This was the first time I successfully managed to manipulate my S3 account. OK, Jungle Disk under Mac OS X worked, but it is proprietary, though not expensive.

John Eberly’s blog was an inspiration to get started. Follow the link to his excellent blog post.

One thought on “Managing Amazon S3 Online Storage with S3sync”

Andy says:

November 1, 2009 at 10:04

I always enjoy learning what other people think about Amazon Web Services and how they use them. Check out my very own tool CloudBerry Explorer that helps to manage S3 on Windows. It is freeware. http://cloudberrylab.com/

Rforge

Smart Computing with R ++
