BigTable concerns, or “How to put your trust in the cloud”

I didn’t necessarily mean to piggyback off Greg’s two posts on ORMs but c’mon, what’s a hillbilly to do when he perpetuates such negative stereotypes? I mean, before you start knocking it, have any of you *tried* kissing your sister?

He also has some blather in there on RDBMSs and ORMs. So I suppose I should hide my indignation behind something technical.

We’re using BigTable for our project by way of Google App Engine. The decision to use it was pretty easy once we landed on GWT as our platform. The integration ’twixt GWT and App Engine is pretty seamless and hey, App Engine uses BigTable.

I’m glossing over the dozens of times we’ve second-guessed ourselves since making the decision, though. The most recent was just yesterday, as a matter of fact, when my friend raised a couple of concerns:

How do we back it up?
How do we do ad hoc reports against it?

Over the course of the conversation, these boiled down to: How can we work with BigTable in the way we’re used to with RDBMSs?

The inevitable option came up. Maybe we shouldn’t use BigTable. Maybe MySQL is more suitable if we’re unsure. That means moving off App Engine though and we like what we’ve seen so far with it.

It was a bit of an uncomfortable conversation, actually, and this was between two very seasoned developers who have never shied away from new tech. I think the reason for the awkwardness is that we aren’t dealing with someone else’s money. This is a startup, so it’s a decision that he and I are going to have to live with.

In the end, being seasoned developers, we recognized that moving to a new development platform would just swap one set of problems for another. For basic transactions (I hesitate to say OLTP because that would imply I know more about the term than I do), like getting some objects and saving them again, BigTable just plain works. There’s no ORM behind the scenes to map your data structure to the domain model. You create a User and you save it. Any relationships are automatically dealt with by some magic that is buried in the documentation somewhere, I’m sure. It really is like working with an ORM without actually having to deal with the mapping.

As for our two questions above, we have a tentative solution that I still like a day later and it will solve both problems. Let’s take the second one:

How do we do ad hoc reports against it?

See, this is where RDBMSs shine, I think. So breaking down the question, we get: How can we get the benefits of BigTable for transactional stuff and the benefits of RDBMSs for reporting and ad hoc querying?

Funny how CQRS starts to make sense when you have the right problem staring you in the face. We’ll have a separate relational database for querying. As requests are sent to BigTable, we’ll also dump them out to another service elsewhere that queues them up to be processed into the relational database.
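For the code-minded, here’s roughly the shape of that idea as a minimal Python sketch. It is not our actual implementation. Everything in it is a stand-in I made up for illustration: a plain dict plays the part of BigTable, an in-process `queue.Queue` plays the part of the separate queuing service, an in-memory SQLite database plays the relational reporting database, and `save_user` and `drain_into` are hypothetical names.

```python
import json
import queue
import sqlite3

# Hypothetical stand-ins: a dict plays the BigTable-backed store and a
# local queue plays the service that buffers changes for the reporting side.
primary_store = {}
pending = queue.Queue()


def save_user(user_id, name, plan):
    """Write to the primary store, then queue the same change for reporting."""
    record = {"id": user_id, "name": name, "plan": plan}
    primary_store[user_id] = record
    pending.put(json.dumps(record))


def drain_into(conn):
    """Apply every queued change to the relational copy; return how many."""
    applied = 0
    while True:
        try:
            message = pending.get_nowait()
        except queue.Empty:
            break
        record = json.loads(message)
        # INSERT OR REPLACE makes a later change to the same user win.
        conn.execute(
            "INSERT OR REPLACE INTO users (id, name, plan) VALUES (?, ?, ?)",
            (record["id"], record["name"], record["plan"]),
        )
        applied += 1
    conn.commit()
    return applied


# An in-memory SQLite database stands in for the separate reporting database.
reporting = sqlite3.connect(":memory:")
reporting.execute(
    "CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT, plan TEXT)"
)

save_user("u1", "Ada", "free")
save_user("u2", "Bo", "paid")
save_user("u3", "Cy", "free")
save_user("u1", "Ada", "paid")  # an update to an existing user

applied = drain_into(reporting)

# The ad hoc report is now plain SQL against the relational copy.
report = reporting.execute(
    "SELECT plan, COUNT(*) FROM users GROUP BY plan ORDER BY plan"
).fetchall()
```

The point of the queue is that the write path never waits on the reporting database. The cost is that reports lag behind the primary store by however long the queue takes to drain.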

This also addresses our first question:

How do we back it up?

The nice thing about this approach is that we now have our offline backup. Of course, backing up is only half the solution. We also need some way to restore BigTable from our relational database easily. But the idea seems sound enough, even if the mechanics may prove otherwise.
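The restore half might look something like the Python sketch below, under the same made-up assumptions: a dict stands in for BigTable, an in-memory SQLite database stands in for the relational copy, and `restore_from` is a hypothetical helper, not anything App Engine provides.

```python
import sqlite3


def restore_from(conn):
    """Rebuild a key-value store from the rows in the relational copy."""
    restored = {}
    rows = conn.execute("SELECT id, name, plan FROM users")
    for user_id, name, plan in rows:
        restored[user_id] = {"id": user_id, "name": name, "plan": plan}
    return restored


# What the primary store held before the (imaginary) disaster.
original = {
    "u1": {"id": "u1", "name": "Ada", "plan": "paid"},
    "u2": {"id": "u2", "name": "Bo", "plan": "free"},
}

# The relational copy, populated the way the reporting side would have been.
backup = sqlite3.connect(":memory:")
backup.execute(
    "CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT, plan TEXT)"
)
backup.executemany(
    "INSERT INTO users (id, name, plan) VALUES (?, ?, ?)",
    [(u["id"], u["name"], u["plan"]) for u in original.values()],
)
backup.commit()

restored = restore_from(backup)
round_trip_ok = restored == original
```

The round-trip comparison at the end is the part worth keeping. A backup only counts once a restore has been shown to give back what went in, and anything the relational schema flattens or drops will show up there.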

Maybe this sounds unduly complicated. It really doesn’t to me. App Engine and BigTable offer a lot of advantages. They solve problems I don’t want to deal with, most notably scalability. The ones they introduce, backing up and querying, are pretty simple by contrast. Besides which, I’m scheduled for Udi’s course in a couple of months anyway.

And for the record, I don’t have any sisters. Just three adventurous brothers.

Kyle the Restored