Thursday, January 28, 2016

Microservices and Linters


Microservices are all the rage at the moment, for good reason. I am of course interested in the security aspects, and I am also on record as loving me some Python. Why, in detail, is probably something I should write up in a future post. For now, I'm just going to mention an intersection between the two.

In Chapter 9 (Security) of Building Microservices (Sam Newman, O'Reilly, 2015) we have exactly one mention of static analysis, under Baking Security In. Please don't misunderstand me – I found Building Microservices a worthwhile read. It is, however, a rather broad overview, and does have an unfortunate tendency to follow fashion. That last is probably not avoidable: the title has to sell, after all. And there is no possible way that all languages, strategies, etc., might be mentioned. So, no points off. Overall, author Sam Newman has done a nice job with this title.

That said, I still have a problem with static analysis getting only one mention. Much has been written about developers needing to raise their game, and a lot of it is justified: it really is ridiculous that most injection attacks (SQL, LDAP, etc.) can still exist in 2016. But QA, where that luxury still exists, does not seem to get much mention. Test-driven development can only take you so far, and an external QA group is a hugely useful defense against groupthink, deadline pressure, and other sources of problems in the delivery of reliable code.

Real developers could probably provide me with a lengthy list of such problem sources, and are invited to do so. You could further educate me with an ordered list – one can never have too much data.

Circling Back to Python

Back in June, Andrew Collette (creator of much Python HDF5 code) wrote an excellent piece, My Experience Using Static Analysis With Python. He was on Travis CI, and recommended both pyflakes (at minimum) and PyLint.

As it happens, we ended up using PyLint, which found about 100 legitimate issues with the code base, ranging from missing docstrings to calls to functions with the wrong number of arguments.
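
To make that last kind of finding concrete, here is a toy sketch of a check for calls with the wrong number of arguments, built on the standard library's ast module. It is not how PyLint works internally, and it only handles the simplest case (module-level functions, positional arguments, no defaults). The names SOURCE, area, and wrong_arity_calls are made up for illustration.

```python
import ast

# A tiny program with a bug that Python itself would only report at run time.
SOURCE = '''
def area(width, height):
    return width * height

print(area(3))
'''


def wrong_arity_calls(source):
    """Return (line, name, expected, given) for mismatched calls.

    Toy check: only module-level functions with plain positional
    parameters are considered, and only calls by bare name.
    """
    tree = ast.parse(source)

    # Record how many positional parameters each simple function takes.
    arity = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            args = node.args
            # Skip defaults, *args, **kwargs, and keyword-only parameters.
            if args.defaults or args.vararg or args.kwarg or args.kwonlyargs:
                continue
            arity[node.name] = len(args.posonlyargs) + len(args.args)

    # Compare every simple call against the recorded parameter count.
    problems = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in arity
                and not node.keywords
                and not any(isinstance(a, ast.Starred) for a in node.args)):
            expected = arity[node.func.id]
            if len(node.args) != expected:
                problems.append(
                    (node.lineno, node.func.id, expected, len(node.args)))
    return problems
```

Calling wrong_arity_calls(SOURCE) flags the call on line 5 of the sample, where area expects two arguments and receives one, without ever running the buggy code. A real linter covers far more cases than this.
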
The takeaway is to use at least some sort of linter. Fine. That's doable, in either a full-on Continuous Integration environment, or just using git hooks in a personal repo. Low marginal cost, better code. What's not to like?
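
For the git hook route, a minimal sketch of a pre-commit hook written in Python might look like the following. It assumes pyflakes is installed for the interpreter running the hook and that only staged .py files need checking. The function names are my own, not part of git or pyflakes.

```python
import subprocess
import sys


def staged_files():
    """Return paths of files added, copied, or modified in the git index."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def python_files(paths):
    """Keep only the paths that look like Python source files."""
    return [p for p in paths if p.endswith(".py")]


def lint_command(paths):
    """Build the pyflakes invocation for the given files."""
    return [sys.executable, "-m", "pyflakes", *paths]


def main():
    """Lint staged Python files and return an exit status for git."""
    targets = python_files(staged_files())
    if not targets:
        return 0  # nothing to lint, so allow the commit
    # pyflakes exits non-zero when it reports problems, which blocks the commit.
    return subprocess.run(lint_command(targets)).returncode


# When saved as an executable .git/hooks/pre-commit (with a Python shebang
# line at the top), finish the file with:
#     sys.exit(main())
```

Git aborts the commit whenever the hook exits with a non-zero status, so a pyflakes complaint stops the commit until the code is fixed. The same lint_command could be run as a CI step instead.
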

Distributed Computing Is Inherently Complicated

Modern computing environments may consist of a single machine running multiple threads on multiple cores, many nodes in a cluster, or extremely parallel computing via GPU. In some cases (RDMA comes to mind), basic security mechanisms provided by Unix-like kernels are already being bypassed.

The need for reliable user-land code is never going to decrease. If even a minor improvement can be had by using something as widely known as a linter, yet that is not a universally accepted practice, then we are collectively Doing It Wrong.
