Big Digital Data
Let’s start by defining what I mean by digital data. In the context of a website or web application, data is created by every page view, every click, every search, every share, every like, every purchase and every video viewed. In fact, every single tiny activity on the web generates data. And those are just the human activities; data is also generated when one computer or device talks to another. Multiply all these activities by the 3 billion+ internet users and the amount of data being generated is unfathomable.
Most of us are only concerned with the data generated by our own websites. But even for a simple website with modest visitor numbers, there are still huge volumes of data generated. Fortunately, as the amount of data being generated has multiplied, so too has our ability to collect, organize and analyze it.
The amount of data available can quickly become overwhelming and it’s easy to get lost in it. But there are real insights to be discovered within, if you can just work out how to get at them. In their excellent book Lean Analytics, Alistair Croll and Benjamin Yoskovitz explain how to focus on the few valuable metrics that matter. They explain that some metrics really matter whilst others are simply vanity metrics. A vanity metric, such as total sales, might make us feel good, but on its own it offers little or no insight into how we might improve. The metrics that matter are those that provide insights that incite change. And by change I mean changes designed to improve performance. No matter what our digital ambitions are, I’m sure we’re all interested in improving our performance.
Changing How Decisions are Made
Big data, as it’s often referred to, and our ability to read and interpret it, is fundamentally changing the way that business decisions are made. We can now rely less on pure intuition and past experience and more on empirical, data-driven decision making. A website’s performance can now be monitored with great precision. But more importantly, we can determine which aspects of the website are working well and which are working less well. For example, not only is it now possible to accurately calculate the return on investment of a particular marketing campaign but, if the tracking is set up correctly, we may also be able to identify from the data any common attributes shared by the visitors who responded to it. If we discover that a particular demographic responded well, we can do more of what worked and less of what didn’t.
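To make that concrete, here is a minimal sketch in Python of the two calculations described above: the return on investment of a single campaign, and the response rate broken down by a visitor attribute. The figures, the field names (age_group, converted, revenue) and both function names are invented for illustration; a real analytics setup will expose this data differently.

```python
from collections import defaultdict

# Hypothetical figures for one campaign, invented for illustration only.
CAMPAIGN_COST = 100.0
VISITS = [
    {"age_group": "18-24", "converted": True, "revenue": 40.0},
    {"age_group": "18-24", "converted": False, "revenue": 0.0},
    {"age_group": "25-34", "converted": True, "revenue": 60.0},
    {"age_group": "25-34", "converted": True, "revenue": 50.0},
]


def campaign_roi(cost, visits):
    """Return on investment: (revenue - cost) / cost."""
    revenue = sum(v["revenue"] for v in visits)
    return (revenue - cost) / cost


def conversion_rate_by(visits, attribute):
    """Share of visits that converted, grouped by a visitor attribute."""
    totals = defaultdict(int)
    conversions = defaultdict(int)
    for v in visits:
        key = v[attribute]
        totals[key] += 1
        if v["converted"]:
            conversions[key] += 1
    return {key: conversions[key] / totals[key] for key in totals}


if __name__ == "__main__":
    print("ROI:", campaign_roi(CAMPAIGN_COST, VISITS))
    print("By age group:", conversion_rate_by(VISITS, "age_group"))
```

With this made-up data the 25-34 group converts more often than the 18-24 group, which is exactly the kind of pattern that would suggest where to spend the next campaign budget.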
The Rise of the Data Scientist
The data scientist is a relatively new role within organisations but one that is set to become more commonplace as data-driven decision making becomes the norm. Essentially, there are two skills that a good data scientist needs to possess. The first is the technical skill to harvest, structure and analyze the data in a way that can produce meaningful insights. There is a huge number and variety of tools at their disposal nowadays: tools to extract, process and summarize the data before they even begin their analysis. And with such vast amounts of data to deal with, one of the best ways to start making sense of it is to use data visualization techniques. Visualizing the data is one of the emerging techniques that can not only cope with the massive data volumes but can also help us identify patterns and trends that would otherwise be lost deep in the detail.
As important as the technical know-how and analytical skills are, the true value of a data scientist lies in their ability to present their results effectively. It’s not only their talent for identifying meaningful insights but also their ability to present their findings back to the business in a meaningful and digestible way that makes a good data scientist such a valuable asset. They can identify new business opportunities or save the business money by highlighting areas of inefficiency or waste. A data scientist can see things before anyone else can. They can see patterns in people’s behavior that can predict what’s going to happen next. In one extreme but well-reported case, an American retail store exposed a teenage girl as being pregnant, purely by analysing her purchasing habits. In this case, the store knew she was pregnant before the girl’s own family did.
Like all specialist fields, data science can be broken into several specialist areas such as analysis and engineering. It comes with its own peculiar language, with terms such as data wrangling or munging. This particular jargon describes the process of turning raw data into a format that’s more amenable to analysis and the identification of patterns. The world of data science is strange and peculiar and can be seen as highly complex and technical. It IS highly complex and technical, but it’s also the only way we can begin to make sense of the ever-growing amount of data. Online, it’s the ability of a data scientist to track user behaviors and glean patterns and meaningful business insights that makes it such an important role.
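For anyone wondering what data wrangling looks like in practice, here is a small, hedged sketch in Python. It takes some messy, made-up click log lines and turns them into consistent records that are ready for analysis. The log format, the sample lines and the wrangle function are all assumptions for the sake of the example, not the output of any particular analytics tool.

```python
from datetime import datetime

# Made-up raw log lines: inconsistent spacing, casing and trailing slashes,
# plus a blank line and a malformed record.
RAW_EVENTS = [
    " 2015-03-01 09:15:00 , /Home , CLICK ",
    "2015-03-01 09:16:30,/products,view",
    "",
    "not a valid record",
    "2015-03-01 09:17:05,/Products/,Click",
]


def wrangle(lines):
    """Turn raw 'timestamp,page,action' lines into clean, uniform records."""
    cleaned = []
    for line in lines:
        parts = [part.strip() for part in line.split(",")]
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        timestamp, page, action = parts
        try:
            when = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # skip records with an unreadable timestamp
        cleaned.append({
            "time": when,
            # Normalize casing and trailing slashes so pages compare equal.
            "page": page.lower().rstrip("/") or "/",
            "action": action.lower(),
        })
    return cleaned


if __name__ == "__main__":
    for record in wrangle(RAW_EVENTS):
        print(record)
```

Once the records share one format, questions like "which page gets the most clicks?" become simple counting exercises rather than detective work.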
I was once working closely with a data analyst who was charged with explaining why a large and high-profile sales promotion had failed to achieve anywhere near the results that had been expected. At the time there was an ongoing political disagreement and people were pointing fingers. Blame for the failure was already being liberally apportioned between everyone involved. What stuck with me was what the data analyst said as they were methodically crunching through the data. They said, “The data doesn’t lie, people do.” In that one quick, throwaway comment, I think they summed up the power of data science.
Do you have any stories of great insights being discovered deep within big data?