Cloud Architecture and Open Source
Where does Open Source fit into the Cloud? Many places…
Cloud Server Software
Cloud software comes in two flavors: one runs a client-side cloud and the other runs as a server for multiple clients. A client-side cloud does all the logical processing on the client and uses various cloud storage options. Server options include Eucalyptus and OpenStack.
There are various cloud clients with a variety of licensing schemes and business models. We prefer an active community and open source development, though some of these are hard to find with a Google search. The best clients connect to multiple cloud systems.
- Cyberduck: S3, Google Docs, Google Storage, FTP, etc.
- Unilium: a client for creating clouds from various sources, essentially software RAID in the sky
Here is an article that discusses various cloud options and has the story mostly right.
Other Open Source S3 Options
Amazon and Google Clouds
For convenience, we like the Amazon Web Services and Google Storage clouds, though there are Rackspace and others. Amazon is the leader, but Google (and others) are also creating developer-facing cloud infrastructure.
Next we have to talk about the data, and then we talk about principles. Data first.
The three dimensions here are file size (small to huge), importance of the data (from very important to a small inconvenience if lost), and frequency of change. Here are some different kinds of data:
Files of 5 MB on average, and a lot of them. It would be a significant annoyance if I lost all 63 GB of my music collection, but the world would not end. No need to sync or share. A weekly or monthly backup is fine. There are not a huge number of changes in a given week or month, though the iTunes data will change. I would pay $10/month to back this up, especially if I could access it from other devices, or even stream it remotely.
Huge 700 MB files, many of them, something like 100 GB in total. Small annoyance if lost (search and download time). Backing this up every six months would be enough. It may make sense to leave this out of the online backup and just keep it on a disk, depending on the cost. I would pay $10 every six months to back this up, but not $10/month.
Now it turns out that Google is pricing storage at 0.25 USD/GB/year, which means 100 GB is $25/year, just over $2/month. This is the kind of pricing I could live with for a permanent backup copy, as long as it would not cost more to get access to the files if and when needed (e.g., after a hard drive crash).
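The arithmetic above is easy to check for other collection sizes. Here is a minimal sketch, assuming a flat per-GB yearly rate with no retrieval or transfer fees; the function name `storage_cost` is made up for illustration and is not part of any provider's API.

```python
def storage_cost(size_gb, usd_per_gb_year=0.25):
    """Return (yearly, monthly) cost in USD for a flat per-GB yearly rate.

    Assumption: storage is the only charge (no retrieval or transfer fees).
    """
    yearly = size_gb * usd_per_gb_year
    return yearly, yearly / 12


# The 100 GB collection of huge files discussed above.
yearly, monthly = storage_cost(100)
print(f"100 GB: ${yearly:.2f}/year, ${monthly:.2f}/month")

# The 63 GB music collection at the same rate.
yearly, monthly = storage_cost(63)
print(f" 63 GB: ${yearly:.2f}/year, ${monthly:.2f}/month")
```

At this rate both collections together come in well under the $10/month I said I would pay for the music alone.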
The principle of sync is that it provides an automated backup, keeps the same file on multiple devices, or both. If restore or copy functions are good enough, then my particular use cases do not require synchronization. It might matter if there were heavy-duty collaboration between remote participants, but that is not the case here.
Another feature of sync (though this could just be good backup) is the difference update, so that only changed files (or better, changed bits) are copied or backed up.
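A difference update at the file level can be sketched roughly as follows. This is only an illustration, not how Dropbox or any particular client does it: it assumes a manifest of SHA-256 digests is kept from the previous backup run, and the names `file_digest`, `build_manifest`, and `changed_files` are made up for this example. Detecting changed bits within a file would need a block-level scheme, which is not shown here.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so huge files are never loaded into memory at once."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its content digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(previous, current):
    """Return files that are new or whose content differs from the previous run."""
    return sorted(
        name for name, digest in current.items()
        if previous.get(name) != digest
    )
```

A backup run would build the current manifest, upload only what `changed_files` returns, and then save the manifest for the next run. Hashing still reads every file, so comparing size and modification time first would be a cheaper (if less reliable) filter.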
Currently we use web mail (aka Google Hosted Apps) for sharing. However, sharing through sync would be very useful because it could be on-demand sharing with no notification needed.
- Easy first, otherwise it doesn’t happen. Same with cheap.
- Get a hosted service that works great at a low price, and then construct a plan to back up that service.
That means Dropbox complemented by Unilium and Cyberduck. Not a bad set of choices at this stage in the evolution of the cloud.