10.06.06
Posted in Coding, Sysadmin, Tech/Geek at 3:20 pm by Craig
I’ve been thinking of using S3 to store backups of various machines (basically all Linux/OS X ones), but what’s been holding me back is that S3 can’t run rsync on the server side. rsync really needs an instance of rsync running “near” where the data is stored in order to do its cleverest compression/do-not-transmit smarts. rsync is basically a win if you have a high-bandwidth link between the rsync server and the backing store, and a lower-bandwidth link between the rsync server and the client. With S3, you’d have to run the rsync server side yourself, remote from S3, which kind of defeats the purpose of rsync…
But then I had a brainstorm. Amazon’s EC2 service, which parallels S3, allows you to create a virtual machine and turn it on/off as needed in the Amazon compute cloud. EC2 instances have high-bandwidth connectivity to S3 storage, and so would be ideal for running an rsync server! You can set up an EC2 instance which serves rsync, and then your backup script can turn the instance on, do the rsync, and shut the instance down when it’s done.
Now all I have to do is actually create the EC2 instance, then write some kind of wrapper around the EC2 API that handles the whole startup-backup-shutdown sequence, and voilà!
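Here is a rough sketch of what that wrapper might look like in Python. Everything in it is an assumption on my part rather than working backup code: `start_instance` and `stop_instance` are hypothetical callables you would write yourself around whatever the compute API offers for booting and halting the VM, and the rsync flags (`-a`, `-z`, `--delete`) are just one reasonable choice for mirroring a directory over SSH.

```python
import subprocess
from typing import Callable, List


def build_rsync_command(source: str, host: str, dest: str) -> List[str]:
    """Build an rsync command that mirrors `source` to `host:dest`."""
    # -a preserves permissions and timestamps, -z compresses in transit,
    # --delete removes remote files that no longer exist locally.
    return ["rsync", "-az", "--delete", source, f"{host}:{dest}"]


def run_backup(start_instance: Callable[[], str],
               stop_instance: Callable[[], None],
               source: str,
               dest: str,
               run: Callable[[List[str]], int] = subprocess.call) -> int:
    """Start the instance, rsync to it, and shut it down afterwards.

    `start_instance` (hypothetical) boots the VM and returns its hostname;
    `stop_instance` (hypothetical) halts it. Returns rsync's exit code.
    """
    host = start_instance()
    try:
        return run(build_rsync_command(source, host, dest))
    finally:
        # Runs even if rsync fails or raises, so a started instance
        # is not left running (and billing) after a bad backup.
        stop_instance()
```

Passing the start/stop steps in as functions keeps the wrapper independent of any particular API client, and the `try`/`finally` is the important bit: once the instance is up, it gets shut down no matter how the rsync step ends. Note that if `start_instance` itself fails, nothing is stopped, since nothing was started.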