More on the CAP theorem

There’s been even more discussion about CAP lately. The good news is that more and more articles now explain the common misconceptions that arose in earlier discussions, and even Eric Brewer himself has recommended Coda Hale’s “You Can’t Sacrifice Partition Tolerance”.

In particular, it draws attention to a different way of categorising the failure semantics of distributed systems, by examining the tradeoff between “yield” (the fraction of queries that are answered rather than rejected outright) and “harvest” (the fraction of the eligible rows that a query actually returns).
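To make the two measures concrete, here is a minimal sketch of how they could be computed from simple counters. The function names and the convention of returning 1.0 when there is nothing to measure are assumptions made for illustration; they do not come from the article or from any particular database.

```python
def yield_fraction(answered: int, submitted: int) -> float:
    """Fraction of submitted queries that were answered rather than rejected."""
    if submitted == 0:
        return 1.0  # assumption: no queries submitted means nothing was rejected
    return answered / submitted


def harvest_fraction(returned: int, eligible: int) -> float:
    """Fraction of the rows eligible for a query that were actually returned."""
    if eligible == 0:
        return 1.0  # assumption: an empty result set is trivially complete
    return returned / eligible


# 950 of 1000 queries answered; one query returned 80 of its 100 eligible rows.
print(yield_fraction(950, 1000))
print(harvest_fraction(80, 100))
```

Yield is a property of a stream of queries, while harvest is a property of each individual answer, which is why a system can trade one against the other.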

Non-fault-tolerant systems respond to failure by dropping the yield to zero; the system is shut down.

Fully replicated systems respond to faults by preserving yield and harvest, but reducing capacity; fewer nodes means fewer reads per second – and that might cause a loss of yield if the system is then over capacity and has to start rejecting queries. However, if the replication is performed asynchronously, then the system doesn’t actually provide 100% harvest even in the best case, as it may return out-of-date (and, therefore, incorrect) records. Exactly what the harvest will be depends on two things: whether you count an out-of-date record as totally worthless or as having some fraction of the worth of the latest data, and the average fraction of the records returned by queries that are actually out of date.
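One way to picture this is to give each out-of-date record a weight between 0 (worthless) and 1 (as good as fresh) and to count it towards the harvest at that weight. The sketch below is only an illustration of that idea; the `stale_worth` parameter and the formula are assumptions, not a standard definition.

```python
def effective_harvest(fresh: int, stale: int, eligible: int,
                      stale_worth: float) -> float:
    """Harvest in which each stale record counts as `stale_worth` of a fresh one.

    `stale_worth` is a hypothetical, application-chosen weight in [0, 1].
    """
    if not 0.0 <= stale_worth <= 1.0:
        raise ValueError("stale_worth must be between 0 and 1")
    if fresh + stale > eligible:
        raise ValueError("cannot return more records than are eligible")
    if eligible == 0:
        return 1.0  # assumption: nothing to return means nothing was missed
    return (fresh + stale_worth * stale) / eligible


# 100 eligible rows, all returned, but 10 of them are out of date.
for worth in (0.0, 0.5, 1.0):
    print(worth, effective_harvest(fresh=90, stale=10, eligible=100,
                                   stale_worth=worth))
```

With the same replication lag, an application that treats stale rows as worthless sees a lower harvest than one that finds them half as useful as fresh rows, which is why the figure depends on the application as much as on the database.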

Fully partitioned systems respond to faults by dropping harvest, as some fraction of the data becomes unreachable when the servers hosting it become unreachable.

Whether harvest or yield is more important, and how much potentially old replicas count towards your harvest, are business decisions, and vary between applications.

So where does GenieDB fit in this model?

Well, our central persistent store is asynchronously fully replicated. The presence of full replication means we maintain 100% yield (except in REALLY catastrophic situations). Meanwhile, our consistency buffering algorithm means we provide 100% harvest when there are no faults. But when there are faults, because the consistency buffer is fully partitioned, we return potentially out-of-date data for some fraction of the records. However, we mark those records with a flag indicating that they’re potentially outdated, so the application can choose whether to discard them (giving them zero contribution towards the harvest), to use them with the knowledge that they may be outdated, or just to use them anyway (in both of the latter cases they make some contribution towards the effective harvest, as they’re better than no records).

An application that is very fussy about data quality, and so would rather sacrifice yield than harvest, can even reject the entire result of a query if any of its records are potentially outdated – thereby trading yield for harvest.
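The choices described above can be sketched as a small per-query policy applied on the application side. Everything named here (`Record`, `possibly_stale`, `StalePolicy`, `apply_policy`) is hypothetical and made up for illustration; it is not GenieDB’s actual API, only a picture of what an application might do with a “potentially outdated” flag.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class StalePolicy(Enum):
    ACCEPT = "accept"              # use flagged records anyway
    DISCARD = "discard"            # drop flagged records: lower harvest
    REJECT_QUERY = "reject_query"  # refuse the whole result: lower yield


@dataclass(frozen=True)
class Record:
    key: str
    value: str
    possibly_stale: bool = False   # hypothetical flag set by the database


def apply_policy(records: List[Record],
                 policy: StalePolicy) -> Optional[List[Record]]:
    """Return the records to use, or None if the query is treated as failed."""
    if policy is StalePolicy.ACCEPT:
        return list(records)
    if policy is StalePolicy.DISCARD:
        return [r for r in records if not r.possibly_stale]
    if policy is StalePolicy.REJECT_QUERY:
        if any(r.possibly_stale for r in records):
            return None
        return list(records)
    raise ValueError(f"unknown policy: {policy!r}")


rows = [Record("a", "1"), Record("b", "2", possibly_stale=True)]
print(apply_policy(rows, StalePolicy.ACCEPT))
print(apply_policy(rows, StalePolicy.DISCARD))
print(apply_policy(rows, StalePolicy.REJECT_QUERY))
```

Because the policy is an argument to each call, one part of an application can insist on fresh data while another happily accepts flagged records.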

And this is our corporate ethos: our database aims to provide developers with the flexibility to build different kinds of applications and, more importantly, to build applications that have different requirements in different parts of the system. GenieDB’s architecture makes it cheap for our database to provide the application with the best data it can, along with the information required for the application to choose its own tradeoff between harvest and yield – on a per-query basis.
