Looking beyond Known Knowns

I’m back on an old soap box of mine: the all-too-limited value that most businesses squeeze from their data. I could have titled this entry “Knowledge vs Data”, or “Analytics vs Reporting”, or “Comprehension vs Regurgitation”.

It’s that simple. Far too many organisations repeat unquestioned learnt behaviours, churning out an endless sea of reports, and far too few are able to address the pressing questions best expressed by Donald Rumsfeld’s oft-chided “Known Unknowns” and “Unknown Unknowns”. Despite my predilection for erudite prose, I’m a fan of his explanation of the situation he found himself in; and despite the initially clumsy appearance of the language he used, the more it is considered the more accurate, complete and concise the statement becomes:

There are known knowns; there are things we know we know.
We also know there are known unknowns; that is to say, we know there are some things we do not know.
But there are also unknown unknowns – the ones we don’t know we don’t know.

Think about this for a moment, in the context of your organisation.

Known knowns

There are, undoubtedly, things that you know about your business. You know, or can look up in an instant, the value of sales last month; the number of sites you operate from; the volume and value of stock held in a warehouse; next month’s payroll cost; and the number of employees. These are all known knowns. You know that the business knows them, and you will have a number of simple reports which present these numbers in easily-read, familiar forms. Business reporting can represent this simple numeric knowledge very well – it’s easy to look up a single value in a tabular report, or read it from a simple dashboard, and feel confident that it’s a fair reflection of this one metric of your business.

Known unknowns

Then there are those questions which don’t go away, and for which the answer is often changing; questions where a single number doesn’t really provide an answer, and where context is required to assess the true nature and value of an answer. How many of these look familiar?

  • Who are our most-profitable, best-paying and easiest-to-serve customers?
  • Which products are our most reliable, easy-to-make/buy/deliver, and rarely get returned?
  • Which salespeople consistently operate above the average, and continue to grow their value quarter-on-quarter?
  • What types of promotion generate the best response from our customers, how much extra profit do they generate, and what is the effect on customer behaviour after they finish?
  • Which store format generates the best return per square foot, and how does this vary between regions?

These are known unknowns: questions that you know to ask, but which have complex, subtle and changing answers. Many of the responses need a caveat: “This month, the best is…”, “Over the last year, the answer would be…”, “Generally, the answer is…”, “In the north, we always find…” and so on. So much depends on the context of the question, how it was framed, and when it was asked – answers often vary by time, location, product range, or customer segment.
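To make the point about context concrete, here is a minimal sketch in Python of the store-format question from the list above. Everything in it is an assumption made for illustration: the sales and floor-area figures are invented, and the names `STORES`, `best_format` and `best_format_by_region` are hypothetical rather than taken from any real system.

```python
from collections import defaultdict

# Hypothetical figures, invented purely for illustration:
# (region, store format, annual sales, floor area in square feet)
STORES = [
    ("North", "Convenience", 900_000, 3_000),
    ("North", "Supermarket", 8_000_000, 20_000),
    ("South", "Convenience", 1_500_000, 3_000),
    ("South", "Supermarket", 7_000_000, 20_000),
]


def best_format(stores):
    """Return the store format with the highest sales per square foot."""
    sales = defaultdict(float)
    area = defaultdict(float)
    for _region, fmt, store_sales, store_area in stores:
        sales[fmt] += store_sales
        area[fmt] += store_area
    return max(sales, key=lambda fmt: sales[fmt] / area[fmt])


def best_format_by_region(stores):
    """Ask the same question again, once per region."""
    by_region = defaultdict(list)
    for row in stores:
        by_region[row[0]].append(row)
    return {region: best_format(rows) for region, rows in by_region.items()}


print(best_format(STORES))            # the single "report" answer
print(best_format_by_region(STORES))  # the answer with its context restored
```

With these made-up numbers the overall answer is the convenience format, yet in the north the supermarket format comes out ahead. A single headline figure would hide exactly the caveat that matters.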

Addressing the questions, and providing rich, contextual answers, is futile using traditional reporting approaches; yet it’s often attempted. It results in an industry-within-the-business, with employees stuck in endless report definition, customisation and generation loops. How much time is spent in your business defining reports or waiting for them to generate? Hours? Days? Weeks? Months? Quite possibly man-years if your business is large and collects a fair amount of data in its business systems.

Known unknowns are addressed through analytics – effective manipulation of large volumes of data, using modern tools which are designed for exploration rather than repetition. My own particular preference is for Visual Analytics: exploring, merging, calculating, presenting and communicating patterns in data through interactive data visualisation. There are other forms of analytics, from statistical data mining, through algorithmic approaches, to predictive analytics. All forms of analytics deal with the business of answering questions: analytics is the science of logical analysis, meaning that it’s a process for testing the viability of hypotheses (ideas), given the evidence (data) available. In business, this means that we can use analytics to address complex business questions and look for evidence for and against pre-formed or emergent ideas.

Visual Analytics appeals to the Data Animator because it offers a highly visual means of exploring data and allows for rapid exploration of large data sets, where every step yields both a compound understanding of the problem domain and a better set of questions to ask. It also leaves in its wake a series of picture postcards which allow the storyteller in me to describe my new understanding to an audience, providing rich, complete answers to known unknown questions, full of context and easily communicated to – and understood by – the questioner.

Unknown unknowns

What’s more, the Visual Analytics process occasionally yields that most elusive of fruit: the unknown unknown. Sometimes, whilst searching for (and measuring the completeness of) evidence for a hypothesis, an entirely counter-intuitive pattern is discovered. Such surprises prove rich fertiliser for conspiracies and, as such, further questions materialise almost immediately. In many cases, the surprise turns out to be a false alarm – a wrinkle in the data, or an exceptional outlier from normal behaviour – but from time to time a genuine unknown unknown is uncovered: something that the business genuinely did not know, and in an area where it had never thought to look.

These are relatively rare treasures, but often of tremendous value, and they fuel the desire to dig deeper, search further and push exploration to new levels. The budget airline which has a small but valuable seam of high net worth customers, ready to be seduced with additional product offers. The youthful fashion retailer with a hard core of middle-aged fans paying over the odds for basic staple items – ready and able to spend significantly more if the size profile is adjusted slightly. The grocer with weak convenience stores due to supermarket range dilution and under-representation of premium own-label products. All of these examples were:

  • Counter-intuitive – the pattern uncovered was never explored because it went against conventional thinking, and no-one had thought to question the received wisdom
  • Genuine surprises – uncovered accidentally whilst looking for something quite different
  • Highly valuable – all were worth millions of pounds in extra sales, and most flowed through to operating profit with little or no impact; i.e. these were highly profitable insights

Reporting – even the finest visual reporting – will never uncover unknown unknowns. It can’t. Reporting requires the definition of metrics to gather and present, not the exploration of patterns in order to evidence ideas. If you want to find the surprising, the new, the unimagined or the unexpected then you must turn to analytics in order to achieve your aims.

If you’re excited by the idea of uncovering rich new streams of revenue and profit in your business (and you should be, shouldn’t you?!) then you need to be using analytics to explore your data – you would be amazed at what it can yield.


3 responses to “Looking beyond Known Knowns”

  1. There is also the point of raising hypotheses where, no matter how outlandish they may be, one raises the types of questions that one intuitively knows can break into the unknown unknowns. So while serendipitous discovery of unknown unknowns through any analytical process is definitely a bonus, one should not underestimate the value of thinking up questions, situations and problems based on one’s own area of expertise, and then testing and checking these against the data one has. This is at the core of the scientific method: it is as much about creative and insightful looking where one hasn’t looked before as it is about simply gathering data and seeing what one sees with it. There is some evidence from the data warehousing market, where common massive failure resulted from investing in data for data’s sake – the idea that we just collect everything and randomly look around (early data mining) so that something may come out. But this is blind and hopelessly expensive. Gary Klein’s “Sources of Power” is a good source for the intuitive basis underlying many of our questions and actions, based on one’s years of experience.

    • Pete – thanks for this, and I agree on the need to apply scientific method, as you suggest. I am fortunate to work frequently with an extraordinary intuitive thinker who is exceptional at posing challenging questions, and spotting potential patterns. In such cases, however, it is important to remain a skeptical empiricist (to hijack Nassim Nicholas Taleb’s self-description) – ensuring that the searches for evidence for and against the hypotheses are given equal time, attention and effort; it is too easy to spot apparent patterns in randomness (the biggest potential drawback of Visual Analytics)… a theme worthy of a future blog.

  2. Pingback: From HBR: IT fumbling analytics | Data Animator
