Managing Amazon S3 Online Storage with S3sync


After trying to use Amazon’s S3 web service to backup files and to get a reliable download area for R functions and stuff which is not allowed to be uploaded to wordpress.com I ended up with some experimental “buckets” (= S3 online directory) and some 100 MB of files in them.

It turned out that it is not possible to delete a non-empty bucket from S3, so one is to required to recurse into the directories and delete all files one by one!

Eric Cheng and other blogs appearing after a google search pointed out S3sync as a suitable tool to remove a non-empty bucket.

So first one has to get Ruby and then also the OpenSSL interface for Ruby: sudo aptitude install ruby libopenssl-ruby

Then download s3sync (to your /home/yourself folder in this case) and unpack it: cd $HOME/
wget http://s3.amazonaws.com/ServEdge_pub/s3sync/s3sync.tar.gz tar xvzf s3sync.tar.gz
rm s3sync.tar.gz

This creates a s3sync folder containing the ruby code.

The package ca-certificates includes PEM files of CA certificates to allow SSL-based applications to check for the authenticity of SSL connections. It is needed to have the S3 connection secure via SSL and part of the default Ubuntu installation (at least included in my Xubuntu Karmic Koala. If not: sudo aptitude install ca-certificates

Before using s3sync get your access key and secret access key from Amazon. It has to be included in a file “s3config.yml” which is located in your home folder inside the directory “.s3conf” which has to be created. So:

mkdir $HOME/.s3conf
to create the directory.

Open your favorite text editor and create a plain textfile called s3config.yml inside the “.s3conf” folder which contains: aws_access_key_id: YourS3AccessKeyFromAmazon
aws_secret_access_key: YourS3SecretAccessKeyFromAmazon
SSL_CERT_DIR: /etc/ssl/certs

Prevent others from reading the configuration file containing your confidential access codes by chmod 700 $HOME/.s3conf/s3config.yml

Now you can start to use s3sync and s3cmd to manipulate your S3 storage space with e.g.: ruby $HOME/s3cmd.rb listbuckets

This was the first time I managed to manipulate successfully my S3 account. Ok, Djungledisk under Mac OS-X worked, but it is proprietary, though not expensive.

John Eberly’s blog was an inspiration to get started. Follow the link to his excellent blog post.

Advertisement

One thought on “Managing Amazon S3 Online Storage with S3sync

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s