Big Data #11: What Big Data Doesn’t Do


Big Data was one of the biggest hypes of 2012 and is more or less continuing its glory today. Many corporations rushed to embrace it, hoping that it would solve their long-standing business problems and make them more competitive. This enthusiasm inevitably brought heightened expectations, which will never materialize without the correct effort.

First and foremost, big data won’t solve your business problems. In fact, no computer system will solve them. If the right questions are asked and the right amount of thought, planning and execution goes into answering them, you can see what your problems are. It is your skilled employees who will use the data, analyze it, discuss it, come up with solutions, iterate, repeat and then solve these problems, not your big data analytics software.

Big data will not improve the quality of your data. If you do not have quality data, you cannot expect the system to bring you quality analyses. Carefully designed forms, with predefined input lengths and validation rules, give you pretty much clean data; that is not the case with big data. When you have data coming from various sources such as, say, social media and sales figures, you will definitely have a headache if you do not clean it up.
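To make the clean-up point concrete, here is a minimal sketch of applying form-style validation rules to records that arrive from mixed sources. Everything in it is an assumption made for illustration: the `Record` fields, the source labels, the 10-character ID limit and the non-negative amount rule are hypothetical, not taken from any particular product.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Record:
    source: str                 # hypothetical label, e.g. "sales" or "social"
    customer_id: Optional[str]  # may be missing or padded with whitespace
    amount: Optional[str]       # raw text exactly as it arrived


def clean(record: Record) -> Optional[dict]:
    """Return a normalized dict, or None if the record fails validation."""
    cid = (record.customer_id or "").strip()
    # Assumed rule: IDs are 1-10 characters, like a form's maximum input length.
    if not cid or len(cid) > 10:
        return None
    # Assumed rule: the amount must parse as a finite, non-negative number.
    try:
        amount = float((record.amount or "").replace(",", ""))
    except ValueError:
        return None
    if not math.isfinite(amount) or amount < 0:
        return None
    return {"source": record.source, "customer_id": cid.upper(), "amount": amount}


def clean_all(records: List[Record]) -> Tuple[List[dict], int]:
    """Clean every record and report how many were rejected."""
    cleaned = [c for c in (clean(r) for r in records) if c is not None]
    return cleaned, len(records) - len(cleaned)
```

Calling `clean_all` on a batch returns the usable rows together with a rejection count, which gives a rough sense of how dirty each feed is before any analysis starts.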

Similarly, it will not organize or structure your existing data, nor will it help you manage it. According to IDG figures, about 90% of enterprise data is unstructured, and it is growing exponentially. If your company does not have policies for organizing, using and retaining its data, introducing big data on top will be a mess at best.

Furthermore, introducing big data will add more noise to your existing data; in my experience, as much as 90-95% of what comes in is noise. It is the remaining 5-10% of the data that you should expect to make a real difference in your analytics. Do not assume that with big data you will have less to deal with.

Without doubt, big data will have an impact on your data center. Big data systems differ from traditional transaction and data warehousing systems. Given the bandwidth required to receive the data and the large compute clusters needed to process it, you will at the very least have to rethink storage, cooling and hardware in your data center.

These changes in the data center will not solve your IT department’s problems either. Since the data center is changing, you cannot expect your system and network administrators to have data management and analysis skills without any training. Nor can you expect them to work business-as-usual with specialized big data hardware. Infrastructure monitoring, configuration and server management skills will have to be rethought, and proper training has to be organized for your existing IT team.

Big data will not eliminate your legacy systems, nor will it decrease their value. In fact, it will make them even more valuable, because they are the systems that provide your critical business data. A state-of-the-art big data system will not replace, say, your existing CRM system. Given the right thought and methods, your big data system will use the values from your CRM system.

And finally, your big data systems will not give you the correct answer in every analysis. In fact, many of the results you receive will be inconclusive. Think about genome sequencing and weather forecasting, two fields where years of stored (and presumably clean) data are constantly analyzed, rethought and reanalyzed, and there is still no exact answer. Your big data investment may not be that “big”, but it still has to tolerate inconclusive results. This will also have an impact on ROI considerations (personally, I recommend using big data processing utilization as an ROI metric).

As I have always emphasized, management has to communicate clearly. In the case of big data, that means being clear about the investment: what is expected of it, what it can do and what it cannot. Otherwise the expectations and the deliverables are not likely to match.
