Saturday, September 2, 2017

Can We Learn Anything From Juicero?

At the risk of over-generalizing, I would think that most people in the security community have long since fully internalized that

  1. it is all about managing risk
  2. 1 is hard
Now we have a new, and very public example of a risk management failure: Juicero, which ceased operations yesterday, costing at least 18 investors at least $118.5 million. 

There has been snark flying around about crafty, greedy venture capitalists deploying their collective foot-guns. I am more interested in looking at it as a risk management failure. I have at least one question, and possibly three.
  1. Do investment organizations talk to each other?
  2. If so, under what circumstances?
  3. If so, in this particular case, was groupthink part of the problem?

The Setup

Juicero was based around the idea that a high-margin kitchen appliance would make money in and of itself (especially if businesses were charged as much as $1,200), and that sales of produce packs for $5 to $8 would lead to further profits. A useful analogy might be computer printer vendors back in the days when the printers themselves also turned a nice profit; today they are basically in the ink and toner business (plug "printer ink costs more than gold" into your favorite search engine).

Bloomberg had a good roundup of the problem Juicero was having in April: Silicon Valley’s $400 Juicer May Be Feeling the Squeeze, including 
One of the most lavishly funded gadget startups in Silicon Valley last year was Juicero Inc. It makes a juice machine. The product was an unlikely pick for top technology investors, but they were drawn to the idea of an internet-connected device that transforms single-serving packets of chopped fruits and vegetables into a refreshing and healthy beverage. 
 ...
Juicero has managed to find a niche at high-end hotels and restaurants. Workers from seven businesses that own Juicero machines said they like the product because the disposable packs can be discarded with minimal cleanup. All seven said they didn’t know Juicero packs could be squeezed by hand. In Bloomberg’s squeeze tests, hands did the job quicker, but the device was slightly more thorough. Reporters were able to wring 7.5 ounces of juice in a minute and a half. The machine yielded 8 ounces in about two minutes.
and much more. Including a mention that sales of produce packs would be limited to owners of the machine.

The threat is that nothing prevents the purchase of a single machine, followed by a bit of hand pressing where multiple machines might otherwise be required to handle peak loads in a juice bar. Groups could club together to buy a single machine, then buy all the produce packs they jointly wanted for hand pressing. Resales of produce packs are possible, and so on. People are clever.

You can think of it as a protocol design failure. Authentication did not provide the security that the creators of the business model and investors thought it did. This was a rather too-obvious failure on the part of Juicero, and we have an indication that they in fact knew about it. Again, from Bloomberg (really, go read that article):
Juicero declined to comment. A person close to the company said Juicero is aware the packs can be squeezed by hand but that most people would prefer to use the machine because the process is more consistent and less messy. The device also reads a QR code printed on the back of each produce pack and checks the source against an online database to ensure the contents haven’t expired or been recalled, the person said. The expiration date is also printed on the pack.
I don't know of anything preventing the machine owner from verifying that a pack hasn't expired or been recalled before sharing it, reselling it, or whatever. That assumes a re-seller even cared, which seems an unwarranted assumption.
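To make the protocol point concrete, here is a purely hypothetical sketch. Nothing in it comes from Juicero; the database, field names, and functions are invented for illustration. The point is that the kind of check Bloomberg describes authenticates the pack (freshness, recall status), not the buyer, so it does nothing to bind packs to machines or enforce the business model.

from datetime import date

# Invented stand-in for the online pack database; not Juicero's actual system.
PACK_DB = {
    "PACK-001": {"expires": date(2017, 9, 10), "recalled": False},
}

def machine_accepts(qr_code, today):
    """What the machine can check: is this pack fresh and not recalled?"""
    pack = PACK_DB.get(qr_code)
    return pack is not None and not pack["recalled"] and today <= pack["expires"]

def hand_press(qr_code):
    """What the machine cannot prevent: squeezing the pack by hand."""
    return "juice from " + qr_code

# Nothing in the pack check ties a pack to a particular machine or buyer,
# so one machine owner can pass packs along to any number of hand-pressers.
print(machine_accepts("PACK-001", date(2017, 9, 2)))  # True
print(hand_press("PACK-001"))                         # works regardless of any check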

Who Lost Money?

Bloomberg mentions Kleiner Perkins Caufield & Byers, Alphabet, and Doug Chertok (presumably through Vast Ventures).

According to TechCrunch the funding rounds were
  • Apr, 2016 $28M / Series C
  • Mar, 2016 $70M / Series B (lead investor Artis Ventures)
  • Apr, 2014 $16.5M / Series A
  • Oct, 2013 $4M / Seed
That is $118.5 million. In that Series B round were 17 investors
  1. Abstract Ventures
  2. Acre Venture Partners
  3. AGO Partners
  4. Artis Ventures (AV)
  5. Bryant Stibel Investments
  6. Campbell Soup Company
  7. Campfire Capital
  8. First Beverage Group
  9. GV
  10. Haas Portman
  11. Interplay Ventures
  12. Kevin W. Tung
  13. Kleiner Perkins Caufield & Byers
  14. Melo7 Tech Partners LLC
  15. Thrive Capital
  16. Two Sigma Ventures
  17. Vast Ventures
Two of these were already mentioned by Bloomberg, but Alphabet (investment amount unknown) was not. So at least 18 investors were involved. I would expect Alphabet (as the parent of Google, it is more than a VC) and Campbell Soup Company to have strong risk management teams. And any organization which exists purely to return a profit on investment obviously should as well.

Yet we see a widespread failure.

A Comparison to Banking

Banks exist to manage risk for their customers: stuffing money under the mattress doesn't scale. And they cannot survive (without government bailouts, if they are large enough, and the circumstances dire enough) without functional internal risk management. They don't compete on the security front. Given the requirements of automated clearing, etc., they mitigate shared risks by sharing information about them.

Having no experience with one, I have no idea if venture capital organizations talk to each other about risk 
  • at all, 
  • only in the case of a group with a lead investor, 
  • occasionally, or
  • commonly.
It seems likely that whatever communication exists may be informal, dictated entirely by circumstance. But only by knowing those circumstances will it be possible to answer my original question(s):
  1. Do investment organizations talk to each other?
  2. If so, under what circumstances?
  3. If so, in this particular case, was groupthink part of the problem?
It seems quite possible to me that answers might become available.

I hope so, because otherwise the only people who will gain useful knowledge from this are the investors who lost money. People in the general security community gain no insight into when to advocate for new lines of communication, with whatever warning about the risk of groupthink might be indicated by the Juicero example, etc. Tech, as a whole, has a very short memory for what has presented security problems in the past. But security workers still have a professional obligation to at least try.

Thursday, August 24, 2017

Mean-of-Means Under Unequal Cardinality

These days, I'm hearing too much politics, of the weird and/or horrible sort. This is weird and/or horrible only in the sense of really bad statistics. Assuming that a political poll matches reality, anyway.

A former co-worker recently sent me an 'analysis' of a Gallup poll, purporting to show Trump's approval declining in stages. A lot of effort went into this. Each data point was tediously recorded via mousing over the online plot, then averages (arithmetic means) were calculated over periods of varying length, which were cherry-picked.

It's clear that while these data show declining popularity, and it sorta looks like that's happening in discrete stages, that's about all you can tell. Beyond the cherry-picking, only simple arithmetic means were calculated, and means-of-means were used across different sample sizes. Which is often a cardinality problem. I'm using the mathematical definition of cardinality here: a measure of the number of elements in a set.

In a bog-standard arithmetic mean I'm using the following notation:
where S = sum of the samples
and n = sample count
mean = S / n

so n = cardinality.


See the current Gallup Daily: Trump Job Approval plot to see rollover data, etc.

Immediately after the page title lies the critical sentence, "Each result is based on a three-day rolling average." You should also note that "Margin of error is ±3 percentage points." So a lot of the day-to-day variance is indistinguishable from noise. And not plotting even simple error bars trades information for drama. Bad dog, Gallup. But that is no worse than what is to be expected, and I digress.
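As an aside, a three-day rolling average is trivial to reproduce. A minimal sketch, using invented daily figures rather than Gallup's actual data:

# Minimal sketch of a three-day rolling (moving) average.
# The daily figures are invented for illustration; they are not Gallup data.
daily_approval = [41, 40, 42, 39, 38, 40, 37]

window = 3
rolling = [
    sum(daily_approval[i - window + 1 : i + 1]) / window
    for i in range(window - 1, len(daily_approval))
]
print(rolling)  # [41.0, 40.33, 39.67, 39.0, 38.33] (approximately)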

Long story short: you have to use a weighted mean here.

The scary bit is that this former co-worker influences financial decisions related to capacity planning for various security systems: when and where more storage, networking, or CPU would be needed. So this was just waiting to bite, in one or more security-critical system(s). That can be career-limiting, and is best avoided.

This is Python 3. I use an unweighted mean-of-means first, just to demonstrate failure under unequal cardinality.

Note that this is truly terrible Python. While it works, it bears little resemblance to how one would actually write Python. Python has been called "executable pseudocode."  I am trying to capitalize on that to make what it does (illustrate my point) decipherable to those who have never seen Python and have no desire to experiment with it, or may not be any sort of coder at all.

# Python is dynamically typed; it would have been easy to do this without 
# using an array. But this way, and specifying floating point (the 'f' in the 
# a_in = array.array statement) may be more explicit for those who don't use 
# Python.
import array # Efficient arrays of numeric values.
from statistics import mean # Arithmetic mean ("average") of data.

def clr_vars():
    """Clear the module-level variables, shown in order of appearance."""
    global a_in, ms1, ms2, a_ms12, ma_in, ms12
    del a_in    # The input array.
    del ms1     # Mean of subset 1.
    del ms2     # Mean of subset 2.
    del a_ms12  # Array of the means of the two subsets.
    del ma_in   # Mean of input array.
    del ms12    # Mean of both subsets.

print('Equal Cardinality Subsets')
a_in = array.array('f', [1, 2, 3, 4])
print('Input:', a_in)
print('Subset1:', a_in[:2])
print('Subset2:', a_in[2:])
ms1 = mean(a_in[:2])
ms2 = mean(a_in[2:])
print('Mean of subset1:', ms1)
print('Mean of subset2:', ms2)
print('Sum of subset means succeeds:', ms1 + ms2)
a_ms12 = array.array('f', [ms1, ms2])
ma_in = mean(iter(a_in))
print('Mean of entire input array:', ma_in)
ms12 = mean(a_ms12)
print('Mean of subset means succeeds:', ms12)

Which returns

Equal Cardinality Subsets
Input: array('f', [1.0, 2.0, 3.0, 4.0])
Subset1: array('f', [1.0, 2.0])
Subset2: array('f', [3.0, 4.0])
Mean of subset1: 1.5
Mean of subset2: 3.5
Sum of subset means succeeds: 5.0
Mean of entire input array: 2.5
Mean of subset means succeeds: 2.5

clr_vars()

print('\nUnequal Cardinality Subsets')
a_in = array.array('f', [1, 2, 3, 4, 5])
print('Input:', a_in)
print('Subset1:', a_in[:2])
print('Subset2:', a_in[2:])
ms1 = mean(a_in[:2])
ms2 = mean(a_in[2:])
print('Mean of subset1:', ms1)
print('Mean of subset2:', ms2)
print('Sum of subset means succeeds:', ms1 + ms2)
a_ms12 = array.array('f', [ms1, ms2])
ma_in = mean(iter(a_in))
print('Mean of entire input array:', ma_in)
ms12 = mean(a_ms12)
print('Mean of subset means fails:', ms12)

Which returns


Unequal Cardinality Subsets
Input: array('f', [1.0, 2.0, 3.0, 4.0, 5.0])
Subset1: array('f', [1.0, 2.0])
Subset2: array('f', [3.0, 4.0, 5.0])
Mean of subset1: 1.5
Mean of subset2: 4.0
Sum of subset means succeeds: 5.5
Mean of entire input array: 3.0
Mean of subset means fails: 2.75

Weighted Means

As you can (hopefully, Python is clear enough) see, cardinality didn't matter with addition. But with the mean-of-means, it did. There are some subtleties involved with sum- vs. mean-of-means: depending on which you use, you may be asking two different questions, so either can be correct. An unweighted mean-of-means answers "what is the average subset mean?", while a weighted mean answers "what is the average over all of the individual samples?" Here the second question is the one being asked, so we use weighted means, which allow for the different number of samples (cardinality) in each subset.

Differences

Instead of iterating over the entire array, we group by the subsets.

Recall that in a bog-standard arithmetic mean I'm using the following notation: where S = sum of the samples and n = sample count, mean = S / n.

We calculate a weighted mean of two samples by expressing groups of sample counts and sample means:
Weighted mean = (((mean of s1)(n of s1)) + ((mean of s2)(n of s2))) / ((n of s1) + (n of s2))

We introduce two new variables s1 = subset1, s2 = subset2, instead of just printing what they
would have contained.

We then use len() to get n of s1 and s2, storing them in two more new variables, ns1 and ns2.

Substituting Python variables, and using Python syntax (still with extra parens for clarity):
w_mean = ((ms1 * ns1) + (ms2 * ns2)) / (ns1 + ns2)

Here's the code.

s1 = a_in[:2]
s2 = a_in[2:]
print('s1 is:', s1)
print('s2 is:', s2)
ns1 = len(s1)
ns2 = len(s2)
print('ns1 is:', ns1)
print('ns2 is:', ns2)
ms1 = mean(s1)
ms2 = mean(s2)
print('ms1 is:', ms1)
print('ms2 is:', ms2)
w_mean = ((ms1 * ns1) + (ms2 * ns2)) / (ns1 + ns2)
print('Mean of entire input array:', ma_in)
print('Weighted mean succeeds:', w_mean)

Which returns
s1 is: array('f', [1.0, 2.0])
s2 is: array('f', [3.0, 4.0, 5.0])
ns1 is: 2
ns2 is: 3
ms1 is: 1.5
ms2 is: 4.0
Mean of entire input array: 3.0
Weighted mean succeeds: 3.0

There is another potential problem here. Not having the original data, we can't see outliers, and there is no methodology discussion beyond "Daily results are based on telephone interviews with approximately 1,500 national adults". So we don't know that the arithmetic mean was safe to use. On the other hand, rolling averages smooth data. It's probably fine, but we would have to do a lot of analysis to establish a confidence level.

We would start by experimenting with harmonic means (the mean of choice when there are outliers), though we have to go outside the Python standard library. Harmonic means are also commonly required in IT capacity planning or performance analysis, or anywhere else that data are spiky. My former co-worker should certainly have been aware of this too, but somehow wasn't. Another way to get bitten... The thing is that in most ways, this is a pretty knowledgeable security person. On the theory that anything that helps get more science and engineering into the security field, at any level, is A Good Thing, I was happy to help.

What's in the Python stdlib is actually pretty weak for this sort of thing, compared to what's available in the science stack. The only mean available in the statistics module is the arithmetic mean, and arrays are limited as well. The solution starts as follows.

import numpy as np # array goodies and vastly more; required by scipy
from scipy import stats # supplies the harmonic mean, and lots more
import matplotlib.pyplot as plt
import seaborn as sns
Note that you also have to have Pandas on the system to use seaborn. Which is fine -- you'll want it anyway. Or use something else; there are several plotting packages available. When the environment is set up, we would use stats.hmean() and lots of plots as our exploration tools. That would take more play time than I actually have. Especially since all this assumes that I want to fool with anything political, and that Gallup polls are worth anything to begin with. I would really start by looking up their results from the 2016 election, where we know there were polling failures.

If you care to push your way into the science stack, the packages I mentioned above are at
  • http://www.numpy.org/
  • https://www.scipy.org/
  • https://matplotlib.org/
  • https://seaborn.pydata.org/
  • http://pandas.pydata.org/
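For what it's worth, the first exploratory step would look something like the following minimal sketch, again with invented numbers rather than the actual Gallup series:

import numpy as np       # required by scipy
from scipy import stats  # supplies the harmonic mean

# Invented approval figures for illustration; not Gallup data.
approval = np.array([41.0, 40.0, 42.0, 39.0, 38.0, 40.0, 37.0])

print('Arithmetic mean:', np.mean(approval))      # about 39.57
print('Harmonic mean:  ', stats.hmean(approval))  # slightly lower; requires all values > 0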

Monday, July 3, 2017

Tools for HR Departments

The following is the text that went into gitlab.com/secinfo/hr.

Over the years, it’s become apparent to me that HR departments could use some
help on the security front. The canonical example is a breakdown in
communications with IT that results in former employees being left on systems.
More subtle are issues with the hiring process: I’ve seen jobs advertised
looking for admins for old, unsupported, public-facing Web applications. And
too much detail about software, which, combined with work shift requirements
that supply timing information, can provide useful information to an attacker.

Is it possible to increase precision, without a downside? One possibility
involves standardizing, if only somewhat, titles, responsibilities, and
typical tasks. Here the U.S. federal government may actually be of some help.
Though they don’t make it easy.

If you visit
https://niccs.us-cert.gov/workforce-development/cyber-security-workforce-framework
(part of DHS National Initiative For Cybersecurity Careers And Studies) you
find tons of information. Seven categories, with a varying number of specialty
areas under each. Drilling down further, you find lists of Competencies, Key
Skill Areas, Tasks, and Related Job Titles. And links to education resources.

There’s a lot there: hundreds of descriptors. It occurred to me that it might
be useful to have this in a machine-readable format, but that isn’t available.
Until now, anyway. The vast majority of it, save the education resources,
which seem likely to change frequently, has been (tediously) added to a couple
of JSON files. Warts and all – there are a few instances where they have
truncated lines and whatnot.
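As a purely hypothetical sketch of what consuming the files might look like (the file name and field names here are invented for illustration; the actual structure in the repository may differ):

import json

# Hypothetical file and field names; the real JSON files under
# gitlab.com/secinfo/hr may be organized differently.
with open('categories.json') as f:
    categories = json.load(f)

# Walk the hierarchy: category -> specialty area -> tasks.
for category in categories:
    print(category['name'])
    for specialty in category.get('specialty_areas', []):
        print('   ', specialty['name'])
        for task in specialty.get('tasks', []):
            print('       -', task)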

The files will change, in that some fields may be added. Under specialty, if
you open, say, Tasks, you find a top line that says something like
"Professionals involved in this Specialty Area perform the following tasks:"
Other times it says "experts".  I’m not sure it would be a useful thing to
include.

The structure will likely change at some point: I’ve committed the folly of
creating a data structure with no idea how it will be used. My hope is that an
HR professional will see it, it will spark an idea, and then an application
will drive the data structure, as it should. Which would make the tedium
worthwhile.

It’s also a serious eye-opener to security generalists such as myself, which
is another reason I did it. JSON does make it possible to see it as a whole,
which you can’t do by clicking around in the NICCS Web site. We are well past
any hope of a single person, no matter how dedicated, being able to encompass
more than a small fraction of the skills that the field needs.

Hopefully, now that this is out there, I can use some of my scant spare time
to get some work done that should live in various private projects under
gitlab.com/secinfo/hr.