Distributed Systems

As we are a company producing a distributed database, it should be no surprise that we’re big fans of distributed systems.

“Distributed software”, in practice, means software that runs on multiple physical machines, connected by a network. This has many benefits to the user; generally, there is little or no dependency on central points of failure, meaning that the system can continue to operate (at least partially) in the event of failure of any given machine or network link.

We like to practise what we preach, too, which is why we use Git as our version control system, and are currently setting up a VPN with n2n so that people working from home can connect securely.

It’s possible to use Git like Subversion, with a big central repository that everyone pushes into, and pulls changes down from. But that’s not taking advantage of its strengths. Sure, we have a server on which we keep the “mainline” release-candidate branch so that everyone in the company can pull from it to keep up to date, and which acts as the interface between the development and QA teams. But when we’ve written some code that’s not ready for the mainline, because we want somebody else to review or test it first, we can just ask them to pull from our personal repository.
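To make "just pull from our personal repository" concrete, here is a minimal sketch of what a reviewer might run. It relies only on the standard `git pull <repository> <branch>` form, which accepts a URL directly, so no remote needs to be configured first. The helper names, the URL and the branch name are all hypothetical, chosen purely for illustration.

```python
import subprocess
from typing import List


def peer_pull_command(peer_url: str, branch: str) -> List[str]:
    """Build the git command that pulls one branch straight from a
    colleague's repository, bypassing any central server."""
    return ["git", "pull", peer_url, branch]


def pull_from_peer(repo_dir: str, peer_url: str, branch: str) -> None:
    """Run the pull inside an existing local clone (repo_dir).

    Raises subprocess.CalledProcessError if git reports a failure,
    for example because the peer is unreachable.
    """
    subprocess.run(peer_pull_command(peer_url, branch), cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Hypothetical peer URL and branch name; substitute real ones.
    cmd = peer_pull_command("ssh://alice-laptop/home/alice/project.git", "needs-review")
    print(" ".join(cmd))
```

The same command works whether the URL points at a personal repository on the central server or directly at a colleague's laptop, which is the whole point.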

Sure, we keep a set of personal repositories on that central server, so we can push up to there from our laptops, then switch them off, and people can pull our personal changes at their leisure. But that’s just a convenience; when we’ve been isolated from that central server due to networking issues, we’ve just pulled direct from each other’s laptops. That’s not so easy when we’re behind different NAT routers, though; we’ve had to mess around with tunnelling ssh through employees’ personal hosting servers.

Our new n2n VPN will make that easy. Traditional VPN technology is really just a “virtual private cable” rather than a “virtual private network”; it provides a single remote laptop with secure access to a central private network. It’s possible for multiple users to connect to the same private network, and then they’ll be able to securely communicate with each other, but only via the central private network. n2n, on the other hand, uses a directory server with which endpoints register; when they want to communicate with each other, they make direct connections, using the directory to find out where the endpoints are connected from in the physical network. You can run multiple directory servers in different places, to be fault tolerant, and since they contain no long-term state, you can even stick up a temporary one somewhere and get everyone to register with that if things go really pear-shaped. The same trick works when a bunch of us are at an event or customer site with networking between ourselves but no Internet access: we can run directory servers on our laptops.
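The register-then-look-up idea can be sketched as a toy model. This is not n2n’s actual protocol or API; the `Directory` class, the helper functions, the expiry time and the addresses are all assumptions made up for illustration. It only shows why directories holding nothing but short-lived registrations are easy to replicate and replace.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Address = Tuple[str, int]  # (ip, port) as seen on the physical network


@dataclass
class Directory:
    """Toy directory server: it holds only short-lived registrations."""
    ttl: float = 60.0   # seconds before a registration expires (assumed value)
    up: bool = True     # flip to False to simulate an unreachable server
    _entries: Dict[str, Tuple[Address, float]] = field(default_factory=dict)

    def register(self, name: str, addr: Address, now: Optional[float] = None) -> None:
        if not self.up:
            raise ConnectionError("directory unreachable")
        now = time.monotonic() if now is None else now
        self._entries[name] = (addr, now + self.ttl)

    def lookup(self, name: str, now: Optional[float] = None) -> Optional[Address]:
        if not self.up:
            raise ConnectionError("directory unreachable")
        now = time.monotonic() if now is None else now
        entry = self._entries.get(name)
        if entry is None or entry[1] < now:
            return None  # unknown endpoint, or registration has expired
        return entry[0]


def register_everywhere(directories: List[Directory], name: str, addr: Address) -> int:
    """Register with every reachable directory; return how many accepted."""
    accepted = 0
    for directory in directories:
        try:
            directory.register(name, addr)
            accepted += 1
        except ConnectionError:
            pass  # one directory being down is not fatal
    return accepted


def find_peer(directories: List[Directory], name: str) -> Optional[Address]:
    """Ask each directory in turn; the first live answer is enough to
    attempt a direct connection to the peer."""
    for directory in directories:
        try:
            addr = directory.lookup(name)
        except ConnectionError:
            continue
        if addr is not None:
            return addr
    return None


if __name__ == "__main__":
    office, spare = Directory(), Directory()
    register_everywhere([office, spare], "alice-laptop", ("203.0.113.7", 40000))
    office.up = False  # the office line goes down
    print(find_peer([office, spare], "alice-laptop"))
```

Because a directory forgets everything once registrations expire, a freshly started one on somebody’s laptop becomes useful as soon as endpoints register with it.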

As with Git, you can do what the centralised systems do, too; you can have a router or an Ethernet bridge connect to the n2n network as an endpoint to offer access to a central private network. But it also means that when we’re all in different places, connecting from our laptops, we’ll be able to pull changes from each other even if the central server (or, more likely, the office ADSL!) is down.

We have an internal philosophy that central points of failure are bad; as we manage to find more ways to bring that philosophy into our working practices as well as the product we build, the places where existing technology forces us to centralise things unnecessarily become more and more obvious – and more and more annoying…
