Why Big Data isn’t 'Big' yet

21 Feb 2017

There is a growing belief that big data, namely the increasingly sophisticated algorithms that sort, analyse and find relationships in massive databases, is a new kind of competitive advantage. There is truth in this: look at Amazon, which makes individual product suggestions, or Netflix, which learns the tastes of its users and recommends content. Big data offers a revolutionary way to look into a business, and hence it is portrayed as a universal problem-solver and insight-provider. And while big data has certainly already proved useful, it still does not live up to its promise.

Only 27% of C-level executives believe that data mining has helped them make effective business decisions. By contrast, more than a third of executives reported that applying big data research findings had only adverse repercussions. The issue may lie in the fact that although big businesses have taken on innovative tech, they have not adopted the innovative thinking that should go with it.

One of the beliefs about big data is that it can provide insights without a preconceived hypothesis or a specific data-mining agenda. Although this could be true, such a mind-set can lead to bias in interpreting the data. For example, big data research identified a negative correlation between countries’ economic growth and their amount of national debt. However, high debt does not necessarily lead to slower economic growth, which means that a mere correlation in the data can be misread as causation. Moreover, no advanced AI at present is capable of understanding the reasoning behind correlations, which could potentially help to eliminate this bias. As a result, when it overrides traditional, time-tested strategic practices, faulty data analysis may encourage poor decision-making.
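To make the correlation-versus-causation point concrete, here is a minimal sketch on purely synthetic numbers. It is not real economic data, and the variable names `shock`, `debt` and `growth` are assumptions chosen only to echo the example above. A hidden factor drives both series, so they correlate strongly even though neither causes the other; once the hidden factor is regressed out, the correlation all but disappears.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, zs):
    """What is left of ys after removing a least-squares linear fit on zs."""
    n = len(ys)
    mz = sum(zs) / n
    my = sum(ys) / n
    slope = (sum((z - mz) * (y - my) for z, y in zip(zs, ys))
             / sum((z - mz) ** 2 for z in zs))
    intercept = my - slope * mz
    return [y - (intercept + slope * z) for y, z in zip(ys, zs)]


random.seed(0)  # fixed seed so the sketch is repeatable
n = 2000

# Hypothetical hidden factor (an assumption for illustration only).
shock = [random.gauss(0, 1) for _ in range(n)]

# Both series depend on the hidden factor, but not on each other.
debt = [s + random.gauss(0, 0.5) for s in shock]
growth = [-s + random.gauss(0, 0.5) for s in shock]

# Raw correlation: strongly negative, although debt never enters growth.
raw_corr = pearson(debt, growth)

# Correlation after controlling for the hidden factor: close to zero.
partial_corr = pearson(residuals(debt, shock), residuals(growth, shock))

print(f"raw correlation:     {raw_corr:.2f}")
print(f"controlling for it:  {partial_corr:.2f}")
```

An analyst who sees only `debt` and `growth` would find a strong negative relationship and might conclude that one drives the other. In practice the hidden factor is rarely known or measured, which is why a hypothesis and domain knowledge still matter.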

Moreover, apart from considerable volume and velocity, big data by definition must also be ‘varied’ in order to gather the most objective accounts and properly identify cause-effect relationships. This means that data must be gathered from a variety of sources and in differing formats. Nevertheless, IDC suggests that enterprises driven by big data typically mine only 10% of all the information that they could have acquired. That information comes predominantly from quantitative sources, such as web analytics and CMS systems.

This means that businesses are missing out on the remaining, qualitative data, such as social media, forums and news, which gives context to numerical information and makes it more valuable. Currently, the majority of analytical tools are quantitative. Some qualitative software exists, but there is not much of it and it is still difficult to implement, because the subtleties and intricacies of language have to be examined manually. Natural-language processing software may be the answer, but the technology is simply not there yet. As a result, many big data initiatives are constrained by the knowledge and skills of the companies’ employees.

This takes us to the final point. While big data cannot be interpreted without specialised skill and expertise, there is a considerable lack of qualified personnel. A study by McKinsey & Co suggests that in the US alone there will be a shortage of 190,000 workers with analytical skills. Similarly, there is a lack of executives who understand big data well enough to make lucrative decisions. Moreover, bureaucracy and a high power distance between data researchers and management impede swift information transfer and hence companies' decision-making capacity. Overcoming this would require a cultural and professional shift: new academic pathways, professional training programs and an overall increase in the public’s understanding of big data processing. Arguably, this would overcome the mistrust of new, technology-driven ways of working and achieve truly prodigious results.

Nevertheless, it is also expected that through 2020, spending on big data technology will grow 4.5x faster. For this to happen, a paradigm shift will definitely be required in the coming years.

We will see a shift from the glamorous, idealized notion of big data to more practical and effective use cases. I expect that semi-structured data and machine learning will continue to drive the need for big data, and that having expertise in these areas will be critical. To ultimately be successful, companies need a clear business challenge to solve, they must start small and fail early, and they should explore the cloud before over-investing in unnecessary architecture.