Flow control in a distributed system

February 5th, 2010

As a replicated database, the core of our technology is multicasting update operations to a number of servers efficiently.

Most of the difficulty is making sure they all get there, even though networks may come and go, servers come and go, and so on. However, that’s not what I’m going to talk about today!

We’re going to look at the opposite problem – stopping too many updates coming through the system at once.

Read the rest of this entry »

Resilience

February 2nd, 2010

One of GenieDB’s design goals is resilience. Databases are one of the most critical points in application deployments. You can add more Web servers, and more load balancers, and more caching proxies, and just spread the incoming requests between them. If they break, you can just replace them, as they have no state beyond caches, and identical configurations.

But either way, few applications can run for long with their database down – and since the database is updated as well as read from, you can’t normally just run lots of copies of it, as they all need updating.

Which, of course, is why we decided to write a replicated database!

Read the rest of this entry »

NoSQL vs. SQL

February 1st, 2010

Your humble author was pressed, by persons who shall remain nameless, to face his nervousness about public speaking and give a five-minute lightning talk on the NoSQL movement at CloudCamp London January 2010.

Unfortunately, the event was filmed.

Further material may appear in future on my SkillsMatter profile or my CloudBook profile.

Why the most ‘alternatively dressed‘ (and certainly not the most attractive) member of the team should be asked to become the public face of the company is anyone’s guess; less technical materials will probably appear at GenieDB’s CloudBook profile.

Profiling and debugging a UNIX daemon

January 25th, 2010

There’s a number of tricks people use when profiling and debugging UNIX daemons.

The problem with a daemon is that it’s not generally connected to a user interface. It sits there in the background accepting requests through some communications medium, and sending responses back. By its very design, it’s meant to get out of the way and be unobtrusive.

So when you are trying to find out what a daemon is doing, in order to test it or to diagnose a problem, tricks are required.

The first trick is logging; make your daemon open up a file, or a channel to something like syslog, and log all sorts of internal events. This lets you extract a trace of events that occurred, in order, which is very useful; however, this can be complicated when your daemon has multiple worker processes or threads, as their outputs are interleaved. Also, it depends upon you putting logging commands in at the points you need them. If a fault occurs in a part of the system you’ve not instrumented in the way you need yet, you need to recompile it with more instrumentation, then attempt to recreate the fault. Also, all that logging takes up system resources, such as time and disk bandwidth; in the worst case, the logging can end up hogging disk bandwidth, and then the log writes start to block on available buffer space, which inserts large erratic delays into your daemon, possibly hiding the real issues or creating new ones.

As we are a distributed system, logging to files would be particularly annoying for us, as there would be a separate file on each server; correlating events between them would become unpleasant, and we’d need to drag them back to a central point for analysis in the first place. However, we have a multicast communication system in place anyway, so we use it for logging. Client and server processes all log into a special multicast group; then one or more monitoring nodes can run a special monitor that registers to receive the multicasts and spools them to disk in order (while also maintaining a ’scoreboard’ file listing all known servers, and their last-reported statistics). The multicasting is much lower-latency than a disk write, so the costs of logging are low, and unlikely to introduce long delays due to blocking on sparse disk bandwidth; the disk bandwidth used is on a dedicated machine. Also, we only log very interesting events by default; debug logging has to be turned on on a per-server basis, by enabling debug log flags.

The second trick is attaching gdb (or some other debugger) to a running daemon process, or to a core file from a crash. This is particularly excellent when your daemon has hung for some reason; once gdb is attached you can see what the current state of execution of all the daemon’s threads are, including all local and global variables. This very quickly tells you the nature of a deadlock; and it doesn’t require any planning ahead, beyond leaving debugging symbols on, which is cheap and easy. However, often the cause of a problem is far from where it’s detected; a pointer error can send a data structure awry and it only crashes some time later, in a perfectly valid and totally unrelated bit of code. A log file might be better at telling you where things went awry. gdb can also continue the daemon running, stopping when certain conditions are met, such as a variable changing in a certain way or a certain part of the code being reached, which can be similarly useful to logging – if you get gdb attached when the system is in a good state and you know a way to repeat the fault to get it into a bad state, you can watch what happens as the fault develops. However, if you don’t have a reproducible test case, this doesn’t help.

We use both tricks at GenieDB – but there’s another trick up our sleeve. Our daemon and all the GenieDB clients on a given server communicate via a shared memory segment; the primary function of this is so that all GenieDB processes on a node can share a sequence number counter, but it’s also used by the daemon to communicate global state to the clients easily; the list of tables and indices is there, so clients can access it easily, and read out information like the numbers of rows in each table without needing to communicate with the daemon process. However, we also use it to expose private state of the daemon (and instances of our client library inside client processes) for debugging purposes.

Many of the more interesting global variables of the daemon, for example, are placed into the shared memory area. We have a private tool that dumps them out, so we can check the internal state of the daemon without needing to attach gdb (and stopping the daemon in the process); this means we can run the tool under watch and eyeball them in another window while we run load tests.

But we can do more than that. Because updating a variable in shared memory is cheap, we can afford to do it lots – more often than we could justify logging statements. So we have a region in the shared memory for each client process and the daemon; and at all the interesting flow control points in the daemon and client library, we update an integer therein with the current line number, ORed with a few high bits to say which file it’s in. All of this is done with a macro, so we can leave it out of production builds.

Not only does this mean we can get a quick summary of the state of execution of all interesting processes, without interrupting them, in a single command; it also means we can sample this on a timer interrupt and accumulate a histogram, to tell where time is being spent when doing performance work.

But because updating and sampling these state variables is cheap, we’ve been inventive about what else we can track. At the core of the daemon is a thread that reads commands from a queue, and performs them; so we sample a high-resolution timer when reading from the queue then when processing a job. By taking differences, we know what proportion of the thread’s time is spent waiting for work, versus doing it. Taking a moving average of this gives us a ‘load percentage’, telling us how close to capacity this thread is, which is a good metric of how busy the node is, which goes in the shared memory segment. Along with the queue lengths.

This is a rather special queue, of course. As processing jobs from it is our overall bottleneck, we try and avoid putting jobs into it in the first place, if we can. Most of these jobs are updates to records on disk, it being a database. So the queue is indexed, meaning that when an update comes in from a client, we can consult the index to see if this update can be merged with an existing one. Perhaps two updates to the same record come along at once; the older of the two will be dropped. Perhaps a table is deleted while some updates to it are still in the queue. Perhaps two updates are to records that are stored together, in which case the two updates can be rolled into one larger one. Needless to say, we’re interested in how well this works, so we have counters for each case, in the shared memory segment; and our introspection tool lists them all, along with a computed percentage of all write operations successfully avoided. We record the timestamp at which various periodic internal processes occur, so we can see if they are overdue. We expose the various metrics the daemon uses to calculate how big its work backlog is to feed into the flow control system (the subject of a future blog post!), so we can see exactly what’s holding things up. And the list goes on.

Daemons like MySQL offer profiling and status-monitoring counters too, which you can get at via vendor-specific SQL queries. But all of that requires infrastructure; when we add a new counter, we declare it in the struct, make sure it’s initialised on system startup, and add a line to the introspection tool to print it out. We have to be careful about concurrent access to the shared state, but only if the counter is larger than a machine word (which the x86 platform guarantees can be atomically read and updated, even on SMP systems). We can afford to sample that a thousand times a second, if we want high-resolution profiling, while putting only a small load on the system. Try that with `show innodb status`!

Adding performance counters and the like is a painless task for us, so we do it a lot. This makes the internal state of the system highly transparent; not only can we hunt down problems and tune for performance more easily, we can quickly eyeball the numbers and see if everything is operating correctly during routine development. Currently, we’re not publishing the introspection tool for customer use; we expose the more interesting “overall load level” metrics via the aforementioned distributed cluster monitoring tool, which gathers them all at monitoring workstations once per second. But we’ll expose the underlying introspection tool, once we’ve documented it (which is the hard part) – advanced users doing advanced tuning will benefit from fine-grained metrics just as much as us developers do!

Eventual Consistency

January 13th, 2010

Most replicated databases provide ‘eventual consistency’.

The reason for this is simple: replication, by definition, means there are multiple copies of the data lying about. And whenever something is changed, that change has to be applied to all the copies. And since doing that takes time, there will inevitably be situations where some of the copies have performed a given update, and some haven’t. This is known as ‘inconsistency’ – the replicas not being all the same. Applications have to be written to expect this; if you issue an update to the database, you can’t expect to get the new data back if you select it immediately. If the user submits a form that changes how the next page is displayed, that page can’t just read the database to generate itself (as it might do when being visited *not* from the form); the page has to manually take account of the change to reflect it, as the database might not. And users get frustrated if they change something, but it doesn’t seem to ‘work’. They submit the change again and again, putting more load on your system.

Eventual consistency means that there may be inconsistency while updates are ‘in flight’, but once this (usually short) period is over, unless more updates arrive, the replicas will become consistent again; and any given change will ‘eventually’ happen everywhere.

It’s possible to avoid this; using transactions and a two-phase commit protocol or more advanced systems like Paxos, one can arrange that all transactions are globally serialized (or an illusion thereof), so no node can ever see an inconsistent database state. However, doing so puts load on the system due to the extra messages interchanged and the extra temporary storage required for in-progress transactions while they are being agreed, and harms application latency due to the extra synchronisation required before doing anything, and can harm throughput due to making things wait on locks, or retrying transactions due to optimistic concurrency control detecting a conflict – and it cannot be enforced at the same time as the system is running with a network partition; if a network link fails you have to either abandon consistency or bring all or part of the system to a halt until communication is restored. There can be no globally consistent state without a way of communicating globally!

So strictly consistent replication is only used for systems that need absolute consistency enough to justify the costs.

However, here at GenieDB, we’ve discovered a way to get strict consistency cheaply for most operations, while the network is fully connected; and when network links fail, we fall back to eventual consistency in a graceful manner. By “most operations”, I mean:

  • Any SELECT of a record by its primary key will consistently return the most recent state of that record, even if that is the deletion of the record
  • Any other SELECTs – by non-primary key columns, by range of primary keys, or whole table scans – will consistently return the most recent state of the records it finds (even if they are deleted), but may not see newly created records

The way it works is quite simple: whenever we update a record (including deletions), we hash the primary key to choose a ‘primary server’ for that record; and we synchronously send the new state of the record to that server. We start an asynchronous ‘eventual’ replication of the change to the nodes in the background, and return control to MySQL.

When we fetch a record by primary key, we ask the primary server for that record first to see if there’s a recent version; if so, we use that rather than reading from our local disk. If it’s down or unreachable, we just use our local copy after all – which might be inconsistent, but is better than nothing.

So once a write has returned as complete, a read will always return the new data, no matter if replication is still in progress.

When the database is read without knowing a specific primary key – those table scans and secondary-index scans – we consult our local disk to find eligible primary keys, then look them up on their primary servers. If the records have been deleted, or the secondary index field changed so they are no longer eligible for the query, they’re skipped. Otherwise, we return their consistent current values from the primary servers.

Distributing records by hashing is ordinarily used as a cache, such as memcache, to allow an application to retrieve information quickly without needing to send out relatively expensive SQL queries. As our core engine is a lot faster than a MySQL database, all this talking to primary servers actually harms our performance due to the network latency; so we allow the application to turn the consistency algorithm off on a per-operation basis, for situations where eventual consistency is good enough.

What we provide isn’t quite strict consistency; it’s strictly consistent for all but a few specific ‘tricky’ kinds of query, and even then, it’s almost strictly consistent; and even in the face of multiple network failures it can never be worse than eventual consistency. But that covers a lot of cases. Enough that, for many applications, they can be written to assume strict consistency, and the odds of a network outage being kept low enough that the cost of what amounts to a few outdated selects per year is negligible compare to the cost of engineering the application to cope with eventual consistency.

This kind of thing is the core of what we’re doing here at GenieDB – database technologies such as replication offer seductive advantages, but usually at some unpleasant cost. We’re finding ways to reduce those costs, to make the advantages of these technologies more widely available and easier to use. And where we can’t remove them all, we make them optional on as fine-grained a level as we can, so developers can pick and choose throughout their application!

Lock timeouts in BDB

November 30th, 2009

When somebody reports a bug in software you’re responsible for, the worst thing that can possibly happen is to find that it’s somebody else’s fault.

If it’s your fault, you see, then you can fix it! And since it’s a problem in code you wrote, it’s usually not too hard for you to figure out what’s wrong in the first place.

But if it’s not immediately obvious what could be going wrong, and the trail of diagnosis seems to be leading into a third-party component you use, then a dull, throbbing, headache sets in. Bugs in third-party components are hard often hard to trace (as you’re not familiar with their code, even if you have access to the source), and hard to fix.

So we were delighted when, upon approaching Oracle with the suspicion that our software was deadlocking might be Berkeley DB’s fault, they promptly replied with some suggestions as to what to try – and when we subsequently produced an isolated test case that demonstrated the problem, they came back to us the next working day with a diagnosis of the problem and a patch!

The issue was simple: we’re using lock timeouts as part of our deadlock-handling strategy. We want locks to timeout in the core server, but never in clients, as they may not be able to restart their transactions. So we set a lock timeout of a few seconds in the server (which never performs operations lasting more than a second or so anyway), and set a timeout of zero (meaning “never time out”) in the clients.

But we were seeing deadlocked systems, with the server lock failing to time out. The fact that db_stat showed the lock sitting there with an expired timeout pointed the finger of blame therein.

So we produced a small standalone application that used Berkely DB in exactly the same way we did at the point where it got stuck, and lo, managed to reproduce it.

Sending this test app to Oracle, we were amazed to have an explanation of the problem, plus a patch, within about three hours!

Concurrency bugs are all the rage!

October 27th, 2009

It seems almost fashionable to write about concurrency bugs these days. Books like Coders at Work and Beautiful Code contain harrowing accounts of Real Programmers finding subtle concurrency bugs in vast code bases.

Concurrency bugs are difficult to find, because they tend to involve the interaction between potentially widely-separated bits of code in unpredictable ways – and often only when several independent evens happen to coincide within narrow timing windows.

As we’re writing a distributed database here at GenieDB, we’re the kinds of people who are wise in the art of concurrency bugs, so we generally avoid them in the first place by only introducing concurrency where it is beneficial to do so. Obviously, the nodes in a distributed system execute concurrently, so we’ve kept the protocols for interacting between them simple and easy to analyse, and done our best to minimise state dependencies between them. This means that the ordering and timing of events should be unable to reveal unpleasant surprises. Within our software on each node, we’ve introduced concurrency only where we have to, in order to avoid blocking or to maximise performance. The interfaces between threads is very carefully marked, with the minimum of shared state; and we carefully reason about, and document, the expectations for managing that shared state, to avoid slip-ups in future.

However, this does not make us smugly immune to concurrency bugs; we fixed one yesterday that had been puzzling us for several days.

One of our daemons, when run within the test cluster harness, was sometimes dying while starting up. No core dump, no error message, nothing – it just disappeared midway through initialisation. We’d never had this problem before; it was only happening in the new Solaris port. Annoyingly, it was very timing-sensitive; running it under tools like truss or gdb made it always start up successfully, so we couldn’t see what was going wrong. Some of the logging we put in for tracing had to be taken out, as the problem never surfaced with it in place. It would always start up flawlessly from the command prompt; the problem only happened when it was run through our cluster testing harness, which ssh-ed into a group of servers and started the daemons on each.

But through a slow (and infuriating) process of deduction, we realised it was dying shortly after daemonising. In fact, the call to our daemonise function didn’t seem to return. Some careful logging of the return value of fork() led us to the horrifying conclusion that fork() would return in the parent – with a PID for the child – but it would never return in the child.

At which point it clicked. The process of daemonisation under UNIX is well documented in various places. The general procedure is to fork off a child process that becomes the actual daemon, then to terminate the parent. This means that the command that starts the daemon terminates, causing the shell or other process that started it to continue, while the daemon child process continues in the background. The child process then takes steps to isolate itself from the context of the parent (setsid(), a second fork, changing directory to the root, etc).

However, this standard process has a slight bug; for whatever reason, as it was timing sensitive, we noticed it on Solaris – but it was, presumably, just as present on Linux, the BSDs, and elsewhere.

Our daemon parent process was being invoked via ssh; so as soon as it terminated, the ssh connection terminated. If you are running a process in an ssh session and you close that session from your end – by killing the ssh process, losing the network connection, etc – then the ssh daemon on the server will send a “HUP” signal to the “process group” of the shell session. HUP is short for “hangup”, from the days of modem-driven dial in sessions; it is intended to tell a process that its connection to the user using it has gone away. By default, processes spawned from your ssh shell are all in the same process group; and by default, the HUP signal kills the processes; so if your ssh connection dies, then your processes on the server are killed off. Which is all well and good.

One of the steps performed in the child process of a daemon, immediately after it has been created by fork(), is to split away from the parent process group. This is so the daemon will not be killed off if the shell that spawned it happens to be terminated by a connection being closed; as a daemon, it does not interact with the user any further, and it should be a part of the system as a whole rather than the starting user’s session. However, it cannot form a new process group until after it’s forked, as the child process needs to exist before it can be assigned to a new group. So setsid() occurs immediately after fork() has returned in the child process.

What we were seeing is that the parent process forked and then terminated, and so the ssh daemon closed the session down, and thereby sent a HUP signal to the entire process group; but this all happened before the child process had ever had a chance to run. So it didn’t get a chance to break away from the process group. It did return from the fork() call – straight into the pending HUP signal, which kills it, despite the very next line of code being a setsid() that would have saved it.

Once we realised this might be the case, we made the parent process defined a handler for HUP signals that just prints a message, before it fork()s (so the handler is inherited by the child). Sure enough, the child process often received a HUP signal before getting to setsid(), immediately upon returning from fork(). So merely ignoring the signal, rather than letting it proceed to kill the process, made our problem go away.

So it seems the standard approach to daemonisation has a minor bug; there’s a time window where the parent process has terminated, but the child is not yet entirely isolated from the parent’s process group. Therfore, the child will be killed if the session terminates during this time interval. We only saw it because we were starting a daemon from ssh, which is a rare but not unheard of thing (particularly in automatically managed clusters). We suspect that the problem may have gone unnoticed in daemons that interpret the HUP signal as a request to re-read configuration files. If such a daemon happens to have installed their HUP signal handler before forking, then a signal arriving during daemonisation would just the “reload the configuration” flag to be set, causing a spurious but harmless re-reading of the configuration as soon as the daemon starts.

So here we are, developing a distributed highly-parallel database; and we find a concurrency bug in a standard practice that has existed within Unix for ages. Concurrency bugs can arise in the least expected of places…

Into the Cloud

September 28th, 2009

Cloud environments offer enticing cost savings for Web businesses, but they present some interesting challenges. One of these challenges is that clouds tend to provide large numbers of relatively lightweight virtual servers, which may potentially fail; high availability is meant to happen in your application, by designing it so that it can tolerate virtual servers coming and going.

This is great for Web servers, and with a little cleverness, it even works well for very simple key-value data stores. However, it’s not a great fit for the database layer; traditional database products rely on a small number of powerful servers, where one or more may be single points of failure, so would usually be implemented on highly redundant hardware.

Here at GenieDB, we’ve been spending the last few years busily implementing a high-level database designed to transparently operate on a cluster of unreliable small servers… So, we’ve been looking at how we can help in cloud environments.

Our existing core technology is already a perfect match; we already support zero-downtime dynamic reconfiguration of clusters (planned and unplanned), with access to the data via independent MySQL instances on each server. We provide the software to be run on each server, and provide instructions on how to seamlessly add and remove servers from a cluster without downtime.

But many users won’t have an existing cluster management infrastructure in place, and configuring each node manually is no way to run a cluster. So as well as documenting the low-level operations on nodes, we’ve written something we call Cluster Tool, which handles bulk operations on clusters. Given some software installation tarballs and a configuration file listing the servers in your cluster and how you’d like them configured, Cluster Tool will ssh into those servers, install our software, configure it, and correctly follow the processes required. It files away a copy of your configuration, so when it’s presented with a new cluster configuration (which may have added, changed, and removed servers), it can decide what servers need uninstalling, which need upgrading, which need reconfiguring and which need installing, to correctly migrate the cluster to its new state. We ship Cluster Tool with its source code, so you can use it as-is, read the source as a reference implementation alongside the low-level management documentation, modify it to fit into your existing cluster management practices, or just wrap it in your own software that feeds it new cluster configurations.

With these foundations in place, automating cloud deployments was easy. We produced Cloud Tool, which uses a pluggable backend to manage virtual servers from various cloud providers. To grow a cluster, it requests more servers, then adds them to the Cluster Tool configuration, and asks Cluster Tool to add them to the cluster. When they are no longer needed, it asks Cluster Tool to remove them from the cluster, then it requests that the servers be de-provisioned.

It’s as simple as that! And like Cluster Tool, it’s provided as source code, and only uses documented interfaces in the software its built on top of (in this case, Cluster Tool itself), so you can build upon it in your own ways, such as running hybrid clusters based on privately-owned hardware that expand out onto cloud servers during peak load.

Replication Lag

September 4th, 2009

The key idea in a replicated database is that you store multiple copies of your data, spread over several servers. The advantage is that when you want to read some data, you can pick a server (ideally, the very server your code is running on has its own local replica, so you can bypass the network), and do the reads there. This means that you can handle more reads per second by just adding more servers, and you can have copies of the data that are near to all the places you want to read data; this reduces latency, and can reduce bandwidth usage over your backbone if you have distributed sites compared to all your application servers talking to a centralised database server.

However, it has a cost: when you do a write, you need to update all the replicas. The naive approach is to require a writer to write to every replica independently; even worse is to have a “master” server that does it for them, as many popular replication systems do, since it then becomes a bottleneck. More advanced implementations implement a multicast system of some kind, ranging from actual IP multicast to diffusion over spanning trees.

But however you do it, replicating changes to a number of servers isn’t an atomic operation. When you issue that write, there will be a time delay before every server has it; and if there’s a network partition in progress, or a server is offline, that time delay could extend to days or weeks. So “synchronous replication” – where the write request does not return to the issuer as “completed” until positive acknowledgement has been received from every replica – is rare in practice; it makes writes very slow. On the other hand, “asynchronous replication” involves returning writes to the issuer as “completed” as soon as they’ve been received by a master server, or the multicast process initiated; this makes writes return quickly, improving the throughput of bulk writes tremendously. But it has a cost: programmers have to be careful.

A normal web page to allow a user to update their details will issue SQL queries to select the user’s details, then populate a form with them; if the user submits the form, we then issue an SQL query to update the user’s details, then continue as normal – fetching the data back from the database to redisplay it. This pattern avoids duplicating code, as the same logic is used to fetch the user’s details in each case; and it ensures that we will always show the user the actual data as it is in the database – so any trimming to length limits or other SQL-layer data transformations are accurately reflected.

But in a system with replication lag, this leads to problems. The user changes their details, then when they get the page back, it still shows their old details – as the change to their details had not yet replicated to the server the details were pulled back from to regenerate the page! This may not happen in testing, where the system is underloaded so replication is fast; but when the system becomes busy, replication lags can stretch to hours. The fun part is that the user may then think they did something wrong, and keep resubmitting their change until it “sticks”; this means that as the load increases, the rate at which updates come in increases by a factor proportional to how overloaded the system already is – creating a disastrous feedback loop.

Generally, applications need to be written to hide replication delays; when displaying data to a user, the application must manually account for recent changes the user has made, on top of what actually comes back from the database. This complicates development, hampering progress, and introduces plenty of scope for intermittent bugs that upset customers.

Which is why we developed proprietary technology at GenieDB to hide replication lag within the database itself, meaning that in normal usage, as soon as a write operation has returned to the application, subsequent reads (even from other servers) will return the latest information. Of course, in situations with network and server failures, this isn’t always possible, so you can still end up with some replication lag in extreme circumstances (alas, we can’t change the laws of physics), but except in emergencies, GenieDB hides replication lag from the application. We want to make it easy for applications to scale to replicated storage!

So it was with some amusement that, as we started setting up company profiles on popular professional and social network sites, we saw signs of replication lag. As staff members updated their personal profiles to state that they were part of the company, we’d go and refresh the company’s page to see if they had “done it right”, and lo, they would not appear yet.

So have we done something wrong in adding ourselves to the company? Or is it just replication lag? Should we assume the former and try again (risking breaking a working setup that just hasn’t replicated yet, and/or further loading the site’s database with unnecessary updates)? Or leave it for half an hour and see if the company page catches up?

As users of a site, we shouldn’t be having to think like this. And, of course, few users are replicated database professionals, so most of them won’t think like this; they’ll just think the site is broken if they encounter the artefacts of replication lag.

Distributed Systems

August 19th, 2009

As we are a company producing a distributed database, it should be no surprise that we’re big fans of distributed systems.

“Distributed software”, in practice, means software that runs on multiple physical machines, connected by a network. This has many benefits to the user; generally, there is little or no dependency on central points of failure, meaning that the system can continue to operate (at least partially) in the event of failure of any given machine or network link.

But we like to eat our own dogfood, which is why we use git as our version control system, and are currently setting up a VPN with n2n for people who are working from home to connect securely.

Read the rest of this entry »