Thursday, May 7, 2015

Sharing is Complicated

Commentary::Internals
Audience::All
UUID: bd74c00b-02cd-42b4-8d62-514dfab4b217

There are a lot of things I want to share, from images to code. Roadblocks are often unexpected, and can be weird as hell, e.g. file-naming issues with my camera that began at the same time that I modified the copyright information that is stamped into EXIF data. The solution to that probably involves adopting something like the UC Berkeley CalPhotos system (http://calphotos.berkeley.edu/), and writing a bit of code to support a new pipeline. Also known as a workflow; which term is used is suggestive of many things. But I digress. Most popular articles (and at least some software) related to image storage and retrieval are overly simplistic. Duh. In other exciting news, the Web has been found to be in dire need of an editor.

Sharing documents (specifically including code) is also an issue, and one that is a bit more important to me at the moment.

I don't want to get into the version control Holy Wars. Use git, mercurial, subversion, or even one of the proprietary systems. Whatever. If I had to guess, it would be that how well you can use the tool will in most cases outweigh the power (and idiosyncrasies) of the tool.

That said, this is about GitHub, because this post is about sharing.

GitHub suffers, periodically, from DDoS attacks, which seem to originate from China. I say 'seem to' because attribution is a hard problem, and because US/China cyber-whatever is increasingly politicized, and this trend is not going to end any time soon.

Points to Ponder

a) Copying of device contents as border crossings are made. There have been State Department warnings on the US side of the issue, but at least one security actor, justly famous for defeating TLS encryption with the number 3 (that is not a joke, search on Moxie Marlinspike), has been a victim as well. There is some question as to whether my systems could be searched without a warrant, due to my proximity to an international border. Nation-states act in bizarre ways, the concepts of 'truth' and 'transparency' seem to be a mystery to national governments, and I do not regard it as impossible that the US would mount a DDoS on GitHub, if a department of the US government thought it both expedient and deniable.

b) Is China a unitary rational actor? On occasion, acts of the Chinese government seem to indicate a less than firm grasp of what other parts of the government are doing. A culture of corruption is one issue, but there are others, such as seeming amazement at adverse western reactions to an anti-satellite (ASAT) missile test back in 2007. Which was apparently quite the surprise to western governments, and makes me question what all of this NSA domestic surveillance effort really accomplishes. I won't digress into that can of worms, other than to note that there is much evidence suggesting that the US may not be a unitary rational actor, either.

Circling Back to GitHub

The entire point of a distributed version control system, of whatever flavor, is availability. Yet there are trade press stories dating back a couple of years, at least, about widespread outages due to DDoS attacks. The most recent one that I am aware of was in April of this year. In every case, much panic and flapping of hands ensued. Developers couldn't work. Oh noes!

That rather blows the whole point of GitHub out of the water, doesn't it? The attacking distributed system beat up on your distributed system. Welcome to the Internet Age, cyber-whatever, yada yada yada. Somewhat paradoxically, a good defense involves more distribution, and not allowing GitHub to be a single point of failure.

The problem is pipelines. Or, again, workflows. A truly resilient system needs more than something that has demonstrably had accessibility issues for years, and the problem is two-fold.

1) There is no fail-over.
2) The scripts that drive it all tend to be fragile.

It is entirely possible to build a robust system, hosted in the DMZ or in the cloud, as a backup to GitHub. Most of this is just bolting widely available Linux packages together, and doing the behind-the-scenes work, with an added component of writing great doc: the system will only be exercised when things have gone to hell, and everyone is stressed. If there were ever a time when great doc is a gift from $DEITY, this would pretty much be it. Murphy is alive and well, so some periodic fail-over test (you do that, right?) probably got skipped for some reason.
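To make the fail-over idea concrete, here is a minimal sketch in Python, not a finished tool. It walks an ordered list of remotes and returns the first one that answers. The URLs and the function names (git_reachable, first_reachable) are made up for illustration; the only real external pieces are the `git ls-remote` command and the GIT_TERMINAL_PROMPT environment variable, which needs a reasonably recent git.

```python
import os
import subprocess
from typing import Callable, Iterable, Optional

# Hypothetical remotes, in order of preference. Substitute your own.
REMOTES = [
    "https://github.com/example/project.git",
    "https://git.backup.example.org/project.git",
]


def git_reachable(url: str, timeout: float = 10.0) -> bool:
    """Return True if `git ls-remote` gets an answer from url in time."""
    # Keep git from sitting at a credential prompt while a script waits.
    env = {**os.environ, "GIT_TERMINAL_PROMPT": "0"}
    try:
        result = subprocess.run(
            ["git", "ls-remote", "--heads", url],
            capture_output=True,
            timeout=timeout,
            env=env,
            check=False,
        )
    except (subprocess.TimeoutExpired, OSError):
        # Timed out, or git itself could not be started.
        return False
    return result.returncode == 0


def first_reachable(
    remotes: Iterable[str],
    probe: Callable[[str], bool] = git_reachable,
) -> Optional[str]:
    """Return the first remote that the probe accepts, or None if all fail."""
    for url in remotes:
        if probe(url):
            return url
    return None
```

A build script could call first_reachable(REMOTES) and fetch from whatever comes back, instead of hard-coding GitHub. The probe is a parameter so that the selection logic can be tested without touching the network. None of this helps unless the backup is kept current, for example with a scheduled `git push --mirror` to the second remote.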

At this point I am going to be polite and just mention that the DevOps community might do a bit more work in getting some best practices information to the public. If GitHub is more important than just free hosting (and it may not be, for completely valid reasons) please build an adequate system. It will save you from having to publicly whine about how your distributed system did not turn out to be resilient.

