Following up on exploring the differences between data, evidence, and knowing, and that understanding only occurs as you engage all three, I want to look at the considerations needed when designers interact with data. I believe these apply to data sets of any size really, though the stakes go up as the amount of data and people involved grow.
In general, having more data and more access to it is a good thing. Much has been written about the possibilities and promises. We may be on the verge of discoveries and solutions that will amaze us. Cures we've been imagining for intractable problems may start to appear as we learn how to farm "the new soil." There's no question that what we have is astounding and that we will learn amazing things. There's no question that this technological development will yield benefits to humanity beyond winning baseball and political campaigns. (Though not everyone believes that.)
There is a lot to learn, though, and a lot of issues to address. Failing to do that is going to be the source of many problems. Many of us will cause or run into those, ignorantly or inadvertently, but some of us can avoid them. And avoidance can start with what UX researchers and designers are good at: asking questions.
Many questions are already being raised about what, how, and why we're beginning to use large data sets. 'We The Data' wonders how data that is generated by a private individual gets handled. Steven Sinofsky asks how we view and validate business decisions based on massive data. The answers to those questions will profoundly influence our collective and individual well-being on personal and professional levels.
Other, less obvious, questions are the reasons that I'm no fan of the phrase "data-driven" that's being used in front of "design" and "decision-making." Those emerging business and engineering philosophies grant data a primacy and authority that is inappropriate. We have to face it. Data is stupid. Because it can be wrong, and even more because it is value-free.
To go back to the soil metaphor, the same rich dirt grows plants that sustain and plants that kill. The dirt doesn't care. Our data is the same. We will be able to see and justify the good and the bad from data. It will lead us to what is false as easily as it does to truth. The numbers will be there, uncaring and aloof. We have to treat them accordingly and use judgment, context, and empathy to apply the right values.
What that means is that the data, the sources of it, and the questions we are trying to answer must be clearly thought through and assessed before we look for the answers. In the Sinofsky article above, he includes a section of questions that are a good start toward the evaluations needed. Steve Lohr of the NY Times notes that we need intuition and other judgment skills. Others argue for imagination.
To get at the right uses of aggregated data, I believe we need to list and answer questions like these when data becomes part of the UX decision-making process:
- Who are the people represented in this data and in what ways are they involved and affected?
- Are they aware of the data and what is being done with it?
- What say should they have and is it being given to them?
- What circumstances were the people in when the data was produced?
- What was true then that is not now and vice versa?
- What do we think matters that might not? What seems to not matter but might?
- Why are we asking the questions we are trying to answer?
- What other questions are behind those? Are there different perspectives about the data?
- Why are certain data sets being chosen to examine and are they the right ones?
- How do we know we have all the right data? Are our queries comprehensive enough in terms of sources and time?
- What are our plans for the data possibly becoming wrong or irrelevant in a short time frame?
- What do we do when the data points in a direction that seems wrong or disagreeable?
The right available data should absolutely be part of a UX decision-making framework. But not the decision-maker. Good decision-making includes answering questions, considering assumptions and possible outcomes, making sure we have the right information involved, etc. And most of all it centers around caring about the people involved.
It is all too easy to let the data bear the responsibility of our decisions. (Great Recession, anyone?) It is easy to optimize for data we prefer to see, for what we measure. In other words, when we focus on certain data, especially when we label them "results", we tend to steer our thinking, processes, and practices toward producing those results. If that focus is improper or off-target, we can create a flawed but self-reinforcing system that will be difficult and expensive to change or dismantle, and could be extremely harmful if left unexamined or unchallenged. This very human and universal tendency must be consciously and conscientiously acted against to ensure we do good for design and beyond with our growing abilities to gather, analyze, and synthesize data into ever more useful knowledge.