Remember back in the 1990s and 2000s when Data Warehouses were all the rage? The idea was to take the data from all the transactional databases behind a company's e-commerce, CRM, financials, lead generation, and ERP systems and merge it into one data platform. It was the dream: CIOs were ponying up big dollars because they believed it would solve finance, sales, and marketing's most significant problems. It was even termed the Enterprise Data Warehouse, or EDW. A new EDW would take 18 months to deploy, as ETLs had to be written from the various systems and the data had to be normalized to fit the EDW. In some cases, the team made bad decisions about how to normalize the data, causing all sorts of problems down the road. When the project finished, there would be this beautiful new data warehouse, and no one would be using it. The EDW needed a report writer to build fancy reports in a specialized tool like Cognos, Crystal Reports, Hyperion, or SAS. A meeting would be called to discuss the data with 12 people, and all 12 would show up with different reports and different numbers, depending on the formulas baked into each report. Eventually, someone from Finance in the analysis, budgeting, and forecasting group would learn the tool, become the go-to person, and work with the technology team assigned to create reports.
Then Big Data came along. "Big Data" even sounds better than "Enterprise Data Warehouse," and frankly, given the issues of the 1990s and 2000s, the Big Data brand doesn't carry the same negative connotations.
Big Data isn't a silver bullet, but it does a lot of things right. First and foremost, the data doesn't require normalization; in fact, normalization is discouraged. Big Data absorbs transactional database data, social feeds, e-commerce analytics, IoT sensor data, and a whole host of other data and puts it all in one repository. The person from Finance has been replaced by a team of highly trained data scientists who develop analysis models and extract information using statistical tools (such as the R programming language) and Natural Language Processing (NLP). The data scientists spend days poring over the data, extracting information, building models, rebuilding models, and looking for patterns. The data could be text, voice, video, images, social feeds, or transaction data, and the data scientist is looking for something interesting in it.
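To make the "no normalization required" point concrete, here is a minimal sketch of the schema-on-read idea: raw records from different sources are stored as-is, and structure is applied only at query time. All the sources, field names, and records below are hypothetical, invented purely for illustration.

```python
import json

# Hypothetical raw events from different systems, stored exactly as they
# arrived -- no upfront normalization into a shared schema.
raw_events = [
    '{"source": "ecommerce", "order_id": 101, "total": 59.99}',
    '{"source": "iot", "sensor": "temp-7", "reading": 21.4}',
    '{"source": "social", "user": "@alice", "text": "Great product!"}',
]

def read_orders(events):
    """Schema-on-read: pull out only e-commerce orders, ignoring the rest."""
    for line in events:
        record = json.loads(line)
        if record.get("source") == "ecommerce":
            yield record["order_id"], record["total"]

print(list(read_orders(raw_events)))  # [(101, 59.99)]
```

The contrast with the EDW approach is the interesting part: nothing breaks when the IoT or social records change shape, because only the code that reads a given source needs to know its schema.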
Big Data has a huge impact, and the benefits are immense. My favorite, however, is predictive analytics. Predictive analytics forecasts behavior based on historical and current data; in effect, it predicts the future. It's all over retail, where you see it on sites as "Other Customers Bought" widgets or purchase recommendations based on your history. Airlines use it to predict component failures on planes, investors use it to anticipate changes in stocks, and the list of industries using it goes on and on.
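A rough sketch of the idea behind "Other Customers Bought" is simple item co-occurrence: count how often pairs of items appear in the same purchase, then recommend the most frequent partners. The baskets below are made up for illustration, and real recommenders are far more sophisticated than this.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets; a real system would mine millions of these.
baskets = [
    {"laptop", "mouse"},
    {"laptop", "mouse", "keyboard"},
    {"laptop", "keyboard"},
    {"mouse", "pad"},
]

# Count how often each pair of items was bought together.
pair_counts = Counter()
for basket in baskets:
    for a, b in combinations(sorted(basket), 2):
        pair_counts[(a, b)] += 1

def also_bought(item, top_n=2):
    """Recommend the items most frequently co-purchased with `item`."""
    scores = Counter()
    for (a, b), count in pair_counts.items():
        if a == item:
            scores[b] += count
        elif b == item:
            scores[a] += count
    return [other for other, _ in scores.most_common(top_n)]

print(also_bought("laptop"))  # mouse and keyboard co-occur most with laptop
```

This is "predicting the future from history" in miniature: past baskets are the history, and the co-occurrence counts are the (very crude) model.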
The cloud is a huge player in the Big Data space: Amazon, Google, and Azure all offer Hadoop and Spark as services. The best part is that as the data grows into gigabytes or terabytes, the cloud provides the storage for all of it. And since it's in the cloud, it's relatively easy to deploy a Big Data cluster; hopefully, AI in the cloud will soon replace the data scientists as well.