Not All Data Are Valid

Many people (politicians probably excluded) have come to believe that data-based decision-making is the way to effective action. In the words of Lord Kelvin, “to measure is to know,” and so if our decisions and actions are to be directed by knowledge, not just by what we believe, then we must base them on data.  While this may hold some truth, it is not true enough!


Accordingly, with the use of data (often referred to as analytics when mediated by computer technology), an evidence-based approach to management seems to be the latest wave (or perhaps fad) in management practice.  Metrics are the sought-after critical kernel of effective management decision-making.  This belief is often expressed as: We need to collect data because we need the numbers to tell us what action to take!


Deming was adamant in warning us of the cost of believing the falsehood that if you can’t measure it then you can’t manage it, asserting that “the most important figures that one needs for management are unknown or unknowable!” In spite of this warning, a less than critical use of data persists.  It is not that using data is wrong; it is using data uncritically that is wrong.


Thought Can Deceive

The blind and uncritical embrace of data can mislead us into thinking that the numbers are ‘the thing’.  Just as we know that the map is not the territory, we must also be mindful that the numbers are not ‘the thing’; rather, they are abstractions of the constructs of concrete experience we are seeking to understand.  If we manage solely by the numbers then we will blind ourselves to the very essence of what we are seeking to manage, deceiving ourselves into thinking that the numbers are ‘the thing’. Again Deming offers us a warning: “he that would run his company on visible figures alone will in time have neither company nor figures.”


Moreover, as Alfred Korzybski noted, “a map is not the territory it represents, but if correct, it has a similar structure to the territory, which accounts for its usefulness.” The same holds true for data and the associated constructs they (are intended to) represent.  So the fact that we measure and collect data, that we have numbers, doesn’t mean that the numbers we have are relevant or that they adequately represent the construct we are interested in understanding; that is, it doesn’t mean that the data are valid.


That is, valid data are data that are actually representative or reflective of the construct we seek to make inferences or decisions about, and that fit the context within which they are being applied.  The data must be collected using an operationally defined and meaningful characteristic of the construct of which we are speaking.  Moreover, the data one collects must be applicable in the context in which they are being applied.  The issue is not whether data, although not perfect, are good enough, but whether the data are valid: If data are not valid then the data are not relevant!


A Construct Is Not A Variable

Many make the mistake of trying to directly measure or quantify constructs or concepts.  A common example of this mistake is when people (as customers or recipients of products/services) are asked to rate, on a scale of say 1 to 5, the quality of __________ (e.g. instruction, service, product, etc.).  Quality is a construct and as such it cannot be directly known through measurement.  Thus trying to measure quality in this way is an exercise in self-deception.  All that can be gathered are opinions. And there is a difference between measuring something and soliciting people’s opinions about that something; these are not the same, and to treat them as the same is to commit a grave error!


So the fact that one can collect data (say, by asking people to complete a survey) doesn’t mean that what is collected has any relevance, meaning or usefulness. The notion of the quality of something has many characteristics, and it is these characteristics, when operationally defined as variables, that lend themselves to quantification.  The variables are the map of the conceptual territory, and thus to the extent that the variables are valid representations they can be useful. Hence if we are interested in measuring qualities of something then we must operationally define variables that re-present the characteristics of the construct in which we are interested.


It Is About Relevance

So not all data are relevant and, moreover, that which is relevant is context limited.  Often people are asked to evaluate or assess whether this or that met their expectations or goals.  On an individual basis this type of question makes sense, as long as one knows what the person’s reference point or anchor is; after all, this is the context of the response.  Each individual has his/her own anchor, his/her own context.


So to ignore the individual contexts and aggregate the responses of many individuals across these varying contexts is tantamount to aggregating all types of apples, pears, plums and oranges.  Mixing contexts makes no sense and renders the data meaningless and thus useless. Even data relevant in their own context can be rendered irrelevant when placed in a different context.  Just because we can add numbers doesn’t mean their sum or average has meaning.  Data are not everywhere applicable and appropriate.
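
The point about sums and averages can be made concrete with a small sketch. The Python below uses invented ratings and hypothetical group labels (nothing here comes from a real survey), and it assumes, purely for illustration, that respondents fall into two groups with very different reference points. It shows only that a pooled average can be computed while describing neither group.

```python
from statistics import mean

# Hypothetical 1-to-5 ratings of "did this meet your expectations?".
# Both the group labels and the numbers are invented for illustration.
responses = {
    "expected very little": [5, 5, 4, 5],
    "expected a great deal": [2, 1, 2, 2],
}

# Pooling ignores each respondent's anchor (context).
pooled = [rating for ratings in responses.values() for rating in ratings]
pooled_mean = mean(pooled)

# Keeping the contexts separate tells a different story.
by_context = {context: mean(ratings) for context, ratings in responses.items()}

print("pooled mean:", pooled_mean)
for context, value in by_context.items():
    print(context, "->", value)
```

Under these made-up numbers the pooled mean sits near the middle of the scale, although no respondent in either group answered anywhere near the middle. The arithmetic is correct; it is the aggregation across anchors that lacks meaning.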


Again Lord Kelvin offers us thoughtful advice: “the more you understand what is wrong with a figure, the more valuable that figure becomes.”  This doesn’t mean that we can justify using whatever data we have in whatever way we wish by simply stating that it isn’t perfect, but nothing is perfect, so we can use what we have. Such justifications can’t make what is invalid valid!


When we apply data beyond their appropriate and useful limits we are in effect misusing data; we are doing something that is very harmful and destructive.  So it is imperative that we understand (and stay within) the limitations of the data we collect.


Apart from the above factors affecting the validity of the data one collects, there is also the fact that quantitatively derived knowledge is not the only way of gaining knowledge about things, particularly those things that involve human experience. That is to say, we must temper this enthusiasm for metric-based management with the wisdom inherent in the following (attributed to sociologist William Bruce Cameron), “not everything that can be counted counts and not everything that counts can be counted.”


More to the point, the most important aspects of what we are managing are not always knowable by measurement. To avoid being deceived by your thinking, think critically about what you are setting out to do.


The answers one gets are significantly influenced by the questions one asks.  Knowing the questions to ask is thus critical to having valid data.

3 thoughts on “Not All Data Are Valid”

  1. Reblogged this on quantum shifting and commented:
    Reminds me of the move to apply “evidence-based practice” in social services some years ago, which always took me to the scene in Monty Python’s Holy Grail: “If she floats, she must be made of wood and if she is made of wood….a witch!” The message that “not everything that can be counted counts and not everything that counts can be counted” is so crucial in today’s world.

  2. Pingback: Management Improvement Blog Carnival #167 » Curious Cat Management Improvement Blog

  3. Pingback: Stupid Is, As Stupid Does | For Progress, Not Growth
