Friday, August 2, 2013

Things are really busy right now.

I have a new project: documenting what I did, and the rationale for the choices I made, for one of the recent data analysis projects. I always write docs, but this is more in the spirit of a HOWTO for people who need some basic instruction on how data analysis pipelines (or workflows, if you will) are commonly constructed on Linux, without depending on humans clicking around, enduring the horrors of statistics in Excel, etc.

It's mostly about pre-processing data, feeding only what you need into a sane database (why give up three orders of magnitude in speed for the bits that will never have relational queries run against them?), when to do matrix math, when to fire a decision-support plot because a threshold has been exceeded, etc.
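As a rough illustration of that shape (not the actual project code), the sketch below pre-filters records in plain Python, loads only the fields that need relational queries into SQLite, and reports which series have crossed a threshold and so would warrant a plot. The sample data, the column names, the threshold value, and the helper names `preprocess`, `load`, and `needs_plot` are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; a real pipeline would read files or a stream.
RAW = """sensor,timestamp,value,debug_blob
a,1,0.5,xxxx
a,2,0.9,yyyy
b,1,,zzzz
b,2,1.7,wwww
"""

THRESHOLD = 1.5  # assumed alert level, purely illustrative


def preprocess(text):
    """Drop rows with missing values and keep only the fields we query on."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        if not rec["value"]:
            continue  # discard incomplete records before they reach the database
        rows.append((rec["sensor"], int(rec["timestamp"]), float(rec["value"])))
    return rows


def load(rows):
    """Load the pre-processed rows into an in-memory SQLite database."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE reading (sensor TEXT, ts INTEGER, value REAL)")
    db.executemany("INSERT INTO reading VALUES (?, ?, ?)", rows)
    return db


def needs_plot(db, threshold=THRESHOLD):
    """Return the sensors whose maximum value exceeds the threshold."""
    cur = db.execute(
        "SELECT sensor FROM reading GROUP BY sensor "
        "HAVING MAX(value) > ? ORDER BY sensor",
        (threshold,),
    )
    return [row[0] for row in cur]


rows = preprocess(RAW)
db = load(rows)
flagged = needs_plot(db)
print(flagged)
```

The bulky `debug_blob` column never reaches the database, which is the point: only the fields that will actually be queried pay the relational cost.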

Somewhat at variance with physics pipelines, it was written in bash, Python, R, and Go. I should do a post about that. But like I said, things are busy right now.

Fedora 17 reached EOL on 7/30/13

Why does that matter? Well, Fedora is regarded as upstream of Red Hat Enterprise Linux (RHEL) and its derivatives, such as CentOS, Scientific Linux, and Oracle Linux (though Oracle will never admit that). What that means is that Red Hat chooses a moment to grab the current Fedora distribution and make some of the various bits more robust, meaning supportable at sane cost structures. That forms the basis of the forthcoming Red Hat Enterprise Linux. Running a current, or near-current, Fedora provides insight into the oncoming RHEL, which will be RHEL7 and is expected by the end of the year.

This is useful, particularly as this will be the most powerful RHEL ever, by a wide margin. Mondo cloudy stuff is going to be in there. That should be another post; it is by no means all marketing.

But right now, I have to rebuild some lab machines. This isn't a huge deal--that's what labs are for. But it will keep me busy for a bit, because I have to characterize what I am doing.

How do you secure this stuff?

As Frank Zappa once wrote, "The crux of the biscuit is the apostrophe." Secure what? Against what threat? At what cost?

I have never been a fan of PCI-DSS. The standard cannot change rapidly enough to reflect changes in the threat envelope. Compliance costs are out of control, and it is not clear to me that there is any rational means of choosing any particular solution to a PCI-DSS line-item. Sometimes I hate to even talk about PCI-DSS; there are other requirements in other industries that are more interesting (medical record security comes to mind), and some things (design flaws in cryptographic protocols, etc.) apply to any industry.

The basics apply in any environment. Control access, authentication, and authorization, and the majority of your risk goes out the window. This is doable, even via bash scripting. From a Director of Information Security at Fiserv (Acumen platform):

"we did get our PCI-DSS ROC and the assessors loved the hardening scripts and the way you listed the hardening steps by control number."

Write a master script that calls subscripts by control number. The downside is that it adds complexity; you will be touching some configuration files more than once. It works, and assessors love it. You do, however, need a capable and auditable version control and build system. Git works fine, if you bolt on some additional tooling.
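The real thing was done in bash; the sketch below uses Python only to show the ordering and dispatch logic, and it is an illustration rather than the scripts the assessors saw. It assumes a hypothetical layout in which each subscript lives in one directory and is named with its control number first, such as `2.2.1-disable-telnet.sh`. The function names and the `controls.d` directory are likewise made up.

```python
import re
import subprocess
import sys
from pathlib import Path

# Assumed naming scheme: "<control number>-<description>.sh"
CONTROL_RE = re.compile(r"^(\d+(?:\.\d+)*)-.+\.sh$")


def control_key(name):
    """Turn '10.2.1-foo.sh' into (10, 2, 1) so controls sort numerically."""
    match = CONTROL_RE.match(name)
    if match is None:
        return None  # not a control subscript
    return tuple(int(part) for part in match.group(1).split("."))


def ordered_subscripts(directory):
    """Return subscripts in control-number order, skipping non-matching files."""
    scripts = [
        p for p in Path(directory).iterdir()
        if p.is_file() and control_key(p.name) is not None
    ]
    return sorted(scripts, key=lambda p: control_key(p.name))


def run_all(directory):
    """Run each subscript with bash, stopping at the first failure."""
    for script in ordered_subscripts(directory):
        control = ".".join(str(n) for n in control_key(script.name))
        print(f"control {control}: {script.name}")
        result = subprocess.run(["bash", str(script)])
        if result.returncode != 0:
            print(f"control {control} failed", file=sys.stderr)
            return result.returncode
    return 0
```

Calling `run_all("controls.d")` would then walk the controls in numeric order, so 2.2.1 runs before 10.1 (a plain string sort would get that wrong), and the per-control log lines map directly onto the line items an assessor is checking.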

The point is that RHEL7 will offer more controls--you will have more power to meet any standardization, legal, or regulatory challenge.
