Today, it is hard to escape the chatter about Big Data. As Google chief Eric Schmidt said, "While it took from the dawn of civilization to 2003 to create five exabytes of information, we now create that same volume in just two days!" Although Big Data is all around us, the reality is that only a small fraction of CIOs are tackling it head-on. Our experience working with large organizations on Big Data projects suggests that many are frustrated as they try to decide on the best course of action in this brave new world. There is a lack of vision and a fear of making mistakes. To make the transition easier, we believe a few fundamental rules should govern Big Data plans:
1. Invest in the right skills before technology
More important than technology is having the right skills. Three in particular are required:
- The ability to frame and ask the right business questions, with a clear line of sight as to how the insights will be used. Big Data is noisy and plentiful. The ability to crystallize a business problem and not boil the ocean is critical to being able to generate rapid and relevant insights.
- The ability to use disparate open source software to integrate and analyze structured and unstructured data. There is no single Big Data tool that does it all, meaning one must bring together best-of-breed tools to get the job done. In addition, the landscape of tools and technologies is evolving rapidly, so locking oneself into a proprietary solution would be very risky. Open source software is the recommended approach, and the ability to understand a diverse set of technologies is important.
- The ability to bring the right statistical tools to bear on the data to perform predictive analytics and generate forward-looking insights. The holy grail of Big Data is to be able to predict the future with a high level of certainty. Given data (big or small), the art of reliably predicting the future requires a fundamental knowledge of disciplines like statistics and machine learning. One has to be able to separate the signal from the noise, even more so in the Big Data world, where noise is abundant.
These skills can be developed proactively through both training and hiring. For example, find those in your organization who are good at the first and third skills above (framing business questions and applying statistical tools) and have a penchant for the second (working with open source software). Give them the opportunity to play Big Data steward. Hire individuals who have strong training in the second and third skills and who show a penchant for business applications. It is also important to find senior leaders in the organization who not only believe in the power of Big Data, but are also willing to take risks and experiment. These leaders can play a big role in driving rapid adoption and success of data applications.
2. Experiment with focused Big Data pilots
Many of the Big Data conversations today originate from technology vendors, yet these conversations have little, if anything, to do with the business case and ROI of Big Data. Start by identifying the most critical business issues and determining how Big Data may contribute to solving them. Bring various sources of data into a Big Data lab where these pilots can be run before major investments in technology are made. Big Data labs offer a collection of Big Data tools (e.g., text and speech analytics software, Apache Hadoop, visualization software) along with expertise (predictive analytics, machine learning, vertical knowledge), allowing businesses to run pilots and prove value quickly without making significant investments in talent and IT. These efforts can be implemented at the grassroots level with minimal investments in technology.
3. Find the needle in the unstructured hay
Semi-structured and unstructured data is top of mind among organizations. As Gartner highlights, enterprise data will grow by 800 percent over the next five years and 80 percent of this information will be unstructured. There are three important principles to keep in mind with unstructured data:
- Ensure you have the appropriate technology to store and analyze unstructured data. Non-relational technologies (e.g., MongoDB) that are schema-less and scale horizontally are needed. Also, ensure you have access to, or are licensing, technology that can analyze unstructured data (e.g., NLP engines for text, voice analytics technology, speech-to-text transcribers, social graph analytics software, and machine data analysis tools).
- Prioritize the unstructured data that can be linked back to an individual and that is rich in sentiment and informational value. This is the data most likely to yield the richest insight. When it comes to text and speech (e.g., call center recordings, social conversations), the customer sentiment embedded in them can be highly predictive of future customer behavior, since it provides insight into the "why" beyond just the "what."
- Do not just analyze unstructured data in isolation. Extract relevant signals from it and combine them with structured data to turbo-charge business insight and prediction. For instance, knowing that a high-value, highly engaged customer just told a representative over the phone that they need to buy a new product demands a different action than knowing that a low-value, extremely price-sensitive, and disloyal customer expressed a need for the same product.
These three principles need to be the focus, NOT saving and storing petabytes of unstructured information forever.
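The third principle, combining a signal extracted from unstructured data with structured customer attributes, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the keyword-based `extract_signal` function is a stand-in for a real NLP engine, and the customer fields (`value_tier`, `price_sensitive`), the phrases, and the action names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical phrases standing in for a real NLP engine's intent detection.
PURCHASE_PHRASES = ("want to buy", "looking to buy", "need a new")


@dataclass
class Customer:
    """Structured attributes, e.g. from a CRM (field names are assumptions)."""
    customer_id: str
    value_tier: str        # "high" or "low"
    price_sensitive: bool


def extract_signal(transcript: str) -> bool:
    """Return True if the call transcript contains a purchase-intent phrase."""
    text = transcript.lower()
    return any(phrase in text for phrase in PURCHASE_PHRASES)


def next_action(customer: Customer, transcript: str) -> str:
    """Combine the unstructured signal with structured data to pick an action."""
    if not extract_signal(transcript):
        return "no_action"
    if customer.value_tier == "high" and not customer.price_sensitive:
        return "priority_follow_up"
    return "standard_offer"


transcript = "Hi, I need a new phone before my trip next month."
loyal = Customer("c-001", value_tier="high", price_sensitive=False)
bargain = Customer("c-002", value_tier="low", price_sensitive=True)

print(next_action(loyal, transcript))    # priority_follow_up
print(next_action(bargain, transcript))  # standard_offer
```

The same transcript leads to two different actions because the decision depends on both the extracted signal and the structured profile, not on either one alone.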
4. Data poor, insight rich is much better than data rich, insight poor
The risk of data and analysis overload without commensurate actionable insight is at its peak. Many organizations have never acted upon the information they already have, even before the world of Big Data. Those who have generated insights have barely scratched the surface of being able to implement and act on those insights at the frontline, where they really matter. The challenge has been that those who generate insights and knowledge in the organization are far removed from those responsible for operationalizing those insights at the frontline. Generating meaningful insights and acting on them should be the first order of business. It is important to think about Big Data projects holistically, all the way from collecting, aggregating, and mining the data to operationalizing the insights.
5. Think operational analytic engines, not just analytics
One of the potential benefits afforded by Big Data is the ability to tailor experiences to customers based on their most recent behavior, and therefore be more relevant. To make Big Data a competitive advantage, companies can no longer extract last month's data, analyze it offline for two months, and possibly act upon it three months later. Take the case of a high-value, loyal customer who enters a promotion code online at checkout, only to find that the discount is not applied, leaving her dissatisfied and likely to attrite. Being able to act on this insight within a few hours and get back to the customer with an apology and a credit will go a long way in retaining significant customer equity. Businesses need to shift their mindset from doing traditional offline analytics to building technology-powered analytic engines that enable near-real-time or real-time decision-making. We recommend that companies take a measured test-and-learn approach. Take 20 percent of your decisions and enable them with technology-powered analytic engines. Measure success and slowly increase the percentage of decisions enabled this way as the organization develops a greater level of comfort.
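The test-and-learn rollout described above might be sketched as follows. This is a hedged illustration, not a reference design: the `CheckoutEvent` fields, the action names, and the hash-based assignment of customers to the engine group are all assumptions introduced for the example. Only the 20 percent share and the failed-promotion-code scenario come from the text.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

ENGINE_SHARE = 0.20  # share of decisions routed to the engine (the 20 percent from the text)


@dataclass
class CheckoutEvent:
    customer_id: str
    value_tier: str          # "high" or "low" (hypothetical segmentation)
    promo_code_failed: bool


def in_engine_group(customer_id: str, share: float = ENGINE_SHARE) -> bool:
    """Deterministically assign a stable share of customers to the engine."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in [0, 1)
    return bucket < share


def decide(event: CheckoutEvent, share: float = ENGINE_SHARE) -> Optional[str]:
    """Return an immediate action, or None to leave the case to the offline process."""
    if not in_engine_group(event.customer_id, share):
        return None
    if event.promo_code_failed and event.value_tier == "high":
        return "send_apology_and_credit"
    return "no_action"


event = CheckoutEvent("c-001", value_tier="high", promo_code_failed=True)
print(decide(event, share=1.0))  # send_apology_and_credit
```

Hashing the customer ID keeps each customer in the same group across events, so outcomes for the engine-enabled group can be compared against the rest before `ENGINE_SHARE` is raised.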
6. Adapt organizational processes to take advantage of Big Data
The Big Data world enables one to act in near real time or real time. However, many organizational processes are not prepared for this shift. Taking advantage of Big Data is not just about people and technology, but also about the processes behind data collection, insight generation, business decision-making, and insight application. These Big Data rules can enable competitive advantage by generating more comprehensive customer insights faster and enabling real-time action based on those insights.