I’ve been using DynamoDB for a few months now, after re-architecting a system that had become painful to scale on a traditional RDBMS. The problem wasn’t really read/write performance but total storage space, since a lot of “unstructured” blobs were stored in the database.
DynamoDB gives me a carefree setup with individually scalable reads, writes, and storage, fully managed by AWS, making my life easier. I’d rather spend my time on the business use case than on managing infrastructure. Before picking DynamoDB I considered Apache Cassandra. Both share the same origins in the Dynamo paper, and the hash + range key data model fits my use case perfectly.
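To make the data model concrete, here’s a minimal sketch of what such a table declaration can look like with boto3. The table and attribute names (user_id as the hash key, created_at as the range key) are hypothetical placeholders, not the actual schema from my system:

```python
import boto3

dynamodb = boto3.resource("dynamodb")

# Hypothetical table: items are partitioned by user_id (hash key)
# and ordered within each partition by created_at (range key).
table = dynamodb.create_table(
    TableName="Blobs",
    KeySchema=[
        {"AttributeName": "user_id", "KeyType": "HASH"},
        {"AttributeName": "created_at", "KeyType": "RANGE"},
    ],
    AttributeDefinitions=[
        {"AttributeName": "user_id", "AttributeType": "S"},
        {"AttributeName": "created_at", "AttributeType": "N"},
    ],
    # Reads and writes are provisioned (and billed) independently.
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
)
```

The same hash + range key idea maps almost one-to-one onto Cassandra’s partition key and clustering column, which is why either database would have fit the access pattern.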
Here’s why I didn’t pick Cassandra: for a decent minimum setup I’d argue you need at least three servers. Paying on-demand or reserved prices for three servers is one thing, but they also need someone to manage them (= time/money): OS patching, security, updates, backups, monitoring, and other basic management tasks. You might find it enjoyable to tinker with these things, but I’d rather not. Sure, I’m fully capable of either knowing what to do or figuring it out if needed, but I’d rather not spend the time on that right now (unless you pay me for it).
But it’s all about scale. At some point it becomes more economical to manage the software yourself, or run your own hardware, or even build your own data centers. I’m far from most of these tipping points, and will probably never reach them. But when does it become more economical to run your own Cassandra cluster versus using DynamoDB?
It’s not a straightforward calculation, but Stackdriver recently published an interesting performance test comparing the new AWS C3 instances against Google and Rackspace cloud servers. Using the most cost-efficient setup from their study, can we calculate when you should switch from a managed NoSQL solution to a self-managed Cassandra installation?
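Before digging into the actual numbers, the shape of the comparison is simple enough to sketch. The figures below are hypothetical placeholders, not values from the Stackdriver study or from AWS pricing; the point is only to show which terms go into the break-even:

```python
HOURS_PER_MONTH = 730  # approximate billable hours in a month

def dynamodb_monthly(read_capacity_cost, write_capacity_cost, storage_cost):
    """DynamoDB bills reads, writes, and storage as separate dimensions."""
    return read_capacity_cost + write_capacity_cost + storage_cost

def cassandra_monthly(instance_hourly, nodes, ops_hours, ops_hourly_rate):
    """Self-managed cluster: instance cost plus the time spent managing it."""
    return instance_hourly * nodes * HOURS_PER_MONTH + ops_hours * ops_hourly_rate

# Placeholder example: a 3-node cluster at $0.30/hr per node plus 10 hours/month
# of management at $80/hr, versus a $1,200/month DynamoDB bill.
if dynamodb_monthly(500, 500, 200) > cassandra_monthly(0.30, 3, 10, 80):
    print("Self-managed Cassandra looks cheaper at this scale")
else:
    print("Managed DynamoDB still wins")
```

The management-time term is the one people tend to forget, and it’s exactly the cost the rest of this comparison has to account for.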