Entering the analytics labyrinth
By Senior Product Manager Tolga Palaogullarindan
Analytics’ launch into the business world dates back to the late 19th century, long before computer systems were conceived. But the internet’s commercialization, which started in about 1995, and the subsequent introduction of web-based analytical applications have undoubtedly secured analytics in our lives for decades to come.
However, traditional web analytics is a thing of the past. It’s an ill-fated term everyone is trying to avoid by using cooler-sounding derivatives – ones which make sense when coupled with a descriptor other than “web”, for example: customer analytics, predictive analytics, marketing analytics, product analytics and embedded analytics.
Now the industry is racing towards inclusive, and often vague, umbrella terms. The primary reason is the increasing complexity of user and business demands; the secondary reason is the range of methods and technologies built to meet them.
One thing is for certain, though. All of these definitions have two goals in common for the analytics market: understanding the business, and understanding its (internal or external) customers.
Web analytics is the forefather of what we “industry professionals” are currently occupied with. It served its purpose but eventually wore thin as the web went from being a company’s sole digital channel to just one of many.
So, the concept evolved. We needed inclusive terms to market our output, and that’s when we named the combined analyses of web, mobile and app data digital intelligence. Then that hit its limit, too.
The market is becoming customer-centric, thanks to the added data sources and connections which are integral parts of customer relationship management (CRM) systems. Designed to achieve a deeper understanding of the customers’ details and activities, CRM systems inspired yet another concept – customer intelligence.
Then there’s the business intelligence monster, with an even broader domain encapsulating the majority of aforementioned terms, and then some. It has a variety of unstructured, semi-structured and structured data sources, from enterprise resource planning (ERP) to customer relationship management (CRM).
Over time the industry established a plethora of analytic strands. But are they strictly separate? Or are they intertwined and entangled?
Lost in the maze of fuzzy definitions? You're not alone...
You are probably familiar with Gartner’s Data Analytics Maturity Model:
While this is certainly not a new model, it is nonetheless a good visualization of the evolution of analytics, plotting value against difficulty. Of course you can argue that the relationship between these stages is not strictly linear (in the sense of one emerging from the other). But let’s roll with it for now.
Hindsight and insight (What happened, and why did it happen?) can be of higher value than predictions (What will happen?). Case in point: Rolls-Royce’s Engine Health Management (EHM) system for their fleet of engines operating on commercial airliners worldwide.
It's not customer intelligence or BI, but how they describe it is a very good example of the Why did it happen? mindset:
“EHM covers the assessment of an engine’s state of health in real time or post-flight and how the data is used reflects the nature of the relevant service contracts.
"Essentially, EHM is about making more informed decisions regarding operating an engine fleet through acting on the best information available.”
I mention this to exhibit the difficulty of identifying terms that everyone can agree upon. There is no “one size fits all” in the analytics market – every organization has a unique approach to analytics and data science.
So how are we going to shape our product vision and stay ahead of the game without jeopardizing our product’s integrity? In other words, how can we avoid adding complex feature sets just for a handful of customers without turning our solutions into a graveyard of outdated features and capabilities?
Quite frankly, it’s a challenge for every analytics player in the market to avoid this trap, and one that will keep Product Management teams scratching their heads for the foreseeable future. It is now essential to differentiate between “cool” and technically practical.
The era of a hyper-connected society
We are in the era of the hyper-connected customer. I would even say we are in the era of the hyper-connected society.
According to a 2016 article by McKinsey & Company, “By 2020, some 50 billion smart devices will be connected, along with additional billions of smart sensors, ensuring that the global supply of data will continue to more than double every two years.”
Handling this tidal wave of data is a gargantuan task. According to GSMA Intelligence, there are currently more connected devices on Earth than people – around 8.2 billion and counting. We do have a very serious challenge up ahead, and in a way, we brought it on ourselves.
For the sake of an easy calculation, let’s assume on average these devices generate about 1 kilobyte of data (structure is irrelevant for this example) every hour (realizing full well this is quite an oversimplification). That would be equivalent to 8.2 terabytes (TB) of data every hour; ~197 TB of data each and every day; about 5,904 TB of data every month; and roughly 71,800 TB of data every year, which is around 72 petabytes (PB) of data.
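The figures above can be checked with a quick sketch. It assumes, as stated, 8.2 billion devices each producing 1 KB per hour, and uses decimal units (1 TB = 10⁹ KB, 1 PB = 1,000 TB) to keep the arithmetic simple:

```python
# Back-of-the-envelope check of the global device data-volume estimate.
# Assumptions: 8.2 billion connected devices (GSMA Intelligence figure),
# each generating an average of 1 KB of data per hour.

DEVICES = 8.2e9          # connected devices
KB_PER_DEVICE_HOUR = 1   # assumed average output per device per hour

tb_per_hour = DEVICES * KB_PER_DEVICE_HOUR / 1e9  # KB -> TB (decimal units)
tb_per_day = tb_per_hour * 24
tb_per_month = tb_per_day * 30                    # 30-day month
tb_per_year = tb_per_day * 365
pb_per_year = tb_per_year / 1e3                   # TB -> PB

print(f"{tb_per_hour:.1f} TB/hour")    # 8.2 TB/hour
print(f"{tb_per_day:.0f} TB/day")      # ~197 TB/day
print(f"{tb_per_year:,.0f} TB/year")   # ~71,832 TB/year, i.e. ~72 PB
```

The point is not the exact numbers – the per-device rate is a deliberate oversimplification – but the order of magnitude they reveal.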
The Large Hadron Collider (LHC) at CERN – the world’s biggest and most powerful particle accelerator – produced around 500PB in 2015 alone. That is something for tech giants like Microsoft, Google or Amazon to chew on, right? Maybe not. This bottomless pile of data is only the beginning of our problem – how we make use of that data is the real question.
The inevitable result of a hyper-connected society
What exactly is Big Data? Even though it's nothing new, there is no clear definition.
Here are my interpretations:
1. An inevitable result of the technological advances in the digital age.
2. A deep technical problem that needs to be tackled by new methods (Machine Learning, Deep Learning, Neural Nets and AI) because, due to its sheer volume, no human can analyze it.
3. A tool with incredible potential to enable breakthroughs in scientific research and move society forward.
4. A dangerous vessel getting closer to stripping us of our freedom and privacy, enslaving humanity in a Minority Report-esque dystopia.
All are applicable to Big Data.
The adjective “big” creates debate around the whole term, leaning discussions towards quantifying it. However, it’s how we approach Big Data that is more important.
Big Data can be distributed, unstructured or semi-structured, and characterized as a cost-free by-product of digital interaction – in other words, it corresponds to every one of our daily digital interactions.
For example, swiping your badge at your office entrance, or the data collected via your credit card on how much you spent on groceries this month. It’s not only personal interactions, but interactions with billions of devices (sensors, etc.) gathering data without us even noticing.
Generally, Big Data is used to detect existing and emerging patterns, making it the ideal candidate for predictive analytics. However, the sheer size of Big Data makes running analytics a herculean task due to the extensive computational power required.
Then again, don’t we humans excel with our creativity to rectify problems we created?
Webtrekk founder and our CTO Norman Wahnschaff says, “We do not need Big Data, but we certainly need Smart Data.”
At this point, some readers may have the following question in mind: “Data is data – how could it be defined as smart?” The answer lies not in Gartner’s commonly accepted definition of Big Data, the 3Vs (Volume, Variety and Velocity), but in the two Vs added later: Variability and Veracity. These describe the consistency and quality of the collected data set respectively, and their potential impact on the accuracy of processing and analysis.
Why should we care?
So how does everything add up? Are the buzzwords flying around hindering our ability to see beyond the hype? Or are they all intertwined? Should we be poised for the possibilities? Should we think about the real-life impact and implications? Should we be more grounded in our decision making by focusing on the key aspects rather than “cool” solutions?
I think the answer to all these questions is yes. As analytics vendors, our success and our products’ success rely on differentiating between “cool” and technically elegant – on our ability to see beyond the buzzwords and hype.