Big Data Analytics: Let’s Peel the Layers!

When we talk about Big Data analytics, and in particular my focus on disruption in the world of large-enterprise analytics, there are many elements to consider. Big Data is not just about large amounts of data; it also includes the elements below:

I. The overall enterprise ‘data’ architecture. Enterprises are moving from old-school, SQL-based, static-query, back-end batch-processing, large-scale custom systems to an open ‘data’ architecture that uses new technologies, philosophies and processes. This is a major shift, and much more difficult than many anticipate: many legacy systems are not only cash cows but also proven systems, and there is a great deal of nervousness about moving them to a new architecture or methodology.

II. Distributed systems. Many large enterprises started adopting distributed data processing systems such as Hadoop several years ago. But unlike, say, a start-up that started and grew up with Hadoop, a large enterprise has to ‘adopt’ and ‘integrate’ with Hadoop. Very few large companies, if any, have yet standardized on Hadoop. So, this is another angle to consider.

III. SQL & NoSQL: Large enterprises were built on the relational database schema of SQL, upon which many legacy systems are built. The pressure of social media and the need to integrate unstructured data have forced enterprises (I will explain later why I am using a strong word such as ‘forced’) to integrate with NoSQL data structures. This is still very much an early-stage process. It is another area where not being a native NoSQL environment poses challenges for enterprises.

IV. Analytics: Today, everyone has access to data, and analytics is offered as a service. Long gone are the days when a line-of-business owner in an enterprise had to wait 3-5 days for the business analytics group to respond to a simple analytics query. But large-scale analytics is still done in batch and is not yet ready for the analytics engines du jour that start-ups are offering. There is still a gap: a start-up starts out native to the new analytics paradigm, whereas an enterprise has to evolve its old analytics engine to that paradigm.

V. Roles: An enterprise has decades of roles and politics supporting titles and responsibilities. Large enterprises have to evolve rapidly to keep up with the swift current of ever-growing Big Data.

VI. Real time: Big Data also has a real-time nuance. We expect real-time analytics responses on the fly as we work on our smart devices. Native applications have these things built in, but in an enterprise, when a real-time response depends on linking the CRM with logistics and the supply chain, and tying it all to data from a global market, it is quite different, right? We can slap real-time analytics front ends onto back-end data, but without changing the architecture and integrating core functions, what you get is what you put in! Remember the famous saying: garbage in, garbage out, and now even faster!

VII. Retraining: Newer companies such as Yahoo, Twitter or Google started with these new paradigms and are evolving rapidly every day. Enterprises need to retrain tens of thousands of employees to adopt the new paradigms of Big Data and yet continue to support all legacy systems (their cash cows). Hence, they have to support two or more parallel development tracks and skill sets. That is assuming the large company has the budget and plans for retraining, as opposed to relying on osmosis over time.

VIII. Acquisitions: Large enterprises often feel that the way to adopt new technologies is to acquire a start-up. We all know that the cultural integration often does not happen, and even if I hire people with newer skills, I still have to deal with what I have internally. This is a key element which we will look into.

Over the next few blog posts, we will uncover and discuss start-up players in the areas above, and why they are so critical in ‘driving’ innovation and change. We will explore how these players will help change the landscape at large enterprises. Stay tuned for exciting developments.