Is Big Data going to replace Enterprise Data Warehousing?

In my blog entry yesterday I concluded that Big Data as a term is on the rise and that ISVs need to pay attention to it. The next question to pose is how Big Data differs from traditional enterprise data warehousing. I still vividly remember the arguments 15 years ago over whether the top-down approach of Bill Inmon (considered the father of data warehousing) should be replaced by Ralph Kimball’s bottom-up approach, where the enterprise data warehouse is built as a collection of data marts that together form the enterprise data warehouse. There are also concepts such as the operational data store, master data etc. The following link shows a couple of pictures that illustrate the difference between these approaches, along with a blog entry that explains the differences pretty well.

During my career, I have personally been involved with all of the above. The latest implementation was based on SQL Server 2008 R2, with not only ETL logic for the ERP applications but also a staging area, a relational data warehouse and then multi-dimensional OLAP cubes with SharePoint 2010. Needless to say, you need to understand multi-layer architecture and how all of this works together.
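For readers who have not worked with such a layered setup, here is a minimal sketch of the idea in Python. It is not the implementation described above: the standard-library sqlite3 module stands in for SQL Server, a grouped query stands in for an OLAP cube, and the table names (stg_sales, dim_product, fact_sales) and the sample rows are hypothetical, made up purely for illustration.

```python
import sqlite3

# Hypothetical rows as they might arrive from an ERP extract (illustrative data only).
ERP_ROWS = [
    ("2012-11-03", "Widget", "EMEA", 120.0),
    ("2012-11-03", "Gadget", "EMEA", 80.0),
    ("2012-12-15", "Widget", "APAC", 200.0),
]


def build_warehouse(rows):
    """Load rows into a staging table, then into a small star schema."""
    con = sqlite3.connect(":memory:")
    cur = con.cursor()

    # Layer 1: staging area, a raw copy of the source data.
    cur.execute(
        "CREATE TABLE stg_sales (sale_date TEXT, product TEXT, region TEXT, amount REAL)"
    )
    cur.executemany("INSERT INTO stg_sales VALUES (?, ?, ?, ?)", rows)

    # Layer 2: relational warehouse with one dimension table and one fact table.
    cur.execute(
        "CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
    )
    cur.execute(
        "CREATE TABLE fact_sales (sale_date TEXT, product_id INTEGER, region TEXT, amount REAL)"
    )
    cur.execute(
        "INSERT INTO dim_product (name) SELECT DISTINCT product FROM stg_sales ORDER BY product"
    )
    cur.execute(
        """INSERT INTO fact_sales
           SELECT s.sale_date, p.product_id, s.region, s.amount
           FROM stg_sales s JOIN dim_product p ON p.name = s.product"""
    )
    con.commit()
    return con


def sales_by_product_and_month(con):
    """Layer 3: a cube-like rollup of sales by product and month."""
    return con.execute(
        """SELECT p.name, substr(f.sale_date, 1, 7) AS month, SUM(f.amount)
           FROM fact_sales f JOIN dim_product p USING (product_id)
           GROUP BY p.name, month
           ORDER BY p.name, month"""
    ).fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse(ERP_ROWS)
    for row in sales_by_product_and_month(warehouse):
        print(row)
```

The point of the sketch is only the separation of layers: raw data lands in staging first, is reshaped into dimensions and facts, and only then is aggregated for presentation.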

The question is how Big Data relates to all of this. One view is that different market segments see it differently. Start-ups will see it more as a web-based approach, with cloud solutions supporting Big Data. The SMB market has invested in Business Intelligence solutions, and to get scale it is going to look at cloud solutions that can take its analytics to the next stage. And then the larger enterprises that have invested huge amounts in enterprise data warehousing, data marts, ETL processes etc. will probably keep these solutions but might supplement them with cloud-based solutions where appropriate.

The competition in the Big Data space will increase during 2013, and we have already seen this in new solutions being introduced to the market, such as Amazon Redshift and Windows Azure Big Data. What distinguishes many Big Data solutions is that they are typically based on NoSQL technology, often hold the data in computer memory (in-memory), and are particularly good for unstructured data. It is important to understand that there isn’t one “turn-key” solution, as these types of Big Data implementations are complex and require very distinctive skills, such as “programming, statistics and how to visualize and communicate data”.

What we also need to remember is that the need to integrate data from different sources still exists. The data will typically be very different from what we are used to (coming from digital sensors and cameras, for example), and when you add social media to all of this, you will have a mixture of data that has never existed before.

And finally, as you will know if you have been involved in Business Intelligence or Data Warehousing projects, the data/information still has to be presented in a format that makes sense for your audience, whether it be your management or other information junkies. What I do know is that analyzing the data won’t be easier than before, given how much statistics is involved, but the results could take you and your company to the next level if the information is used properly.

To answer the question I posed in my heading: no, I do not think one thing replaces another. What I would say is that you can expect to see many different variations of implementations. You can call them what you like, and the cloud will definitely be part of them.

Business Analytics is on the rise again with Big Data leading the way

It is fun to see how some things just continue to be relevant. Business Analytics, Data Warehousing and lately Big Analytics are topping the charts. In my own view, Big Data really took off in the second half of 2012, and we also included it in our business modeling workshops as an optional extension that software vendors (ISVs) should look at. Harvard Business Review brought Big Data to the forefront in its October 1, 2012 issue with the article “Big Data: The Management Revolution” by Andrew McAfee and Erik Brynjolfsson (a guru whose work I followed when I worked on my PhD). According to the authors, Big Data is far more powerful than the analytics of the past, specifically in making predictions.

One of the key reasons for the sudden explosion of Big Data is the urge to become more competitive by better understanding your customers and their behavior. The only way to do this is to enable massive analysis of data, and in the past this has not been possible in on-premise environments due to scalability issues. With new cloud technology such as Azure Big Data, ISVs and end-user organizations can scale up the analytics/calculations based on need (in bursts) and scale down when the calculation is done. There are quite a few interesting new startups in the Big-data-as-a-service domain (Zoomdata, Bidgely, Ginger.io, AgilOne, Continuuity). I expect this trend to continue, specifically because cloud platforms enable startups to innovate by using the elasticity of the cloud instead of investing huge amounts of capital in hardware.

What I expect to happen during 2013 is that you will hear more about real cases of Big Data use, and conferences such as BigData TECHCON will appear on your radar screen. Big Data is no longer about whether the technology exists; it is more about finding the people who understand it and know how to utilize it. According to McKinsey & Company, there will be a shortage of 140,000 to 190,000 people with deep analytical skills, as well as 1.5 million managers and analysts with the know-how to use the data to make effective decisions. The McKinsey article breaks down the importance of Big Data very nicely, including things such as dealing with policies around privacy, security, intellectual property and even liability. There is a full report that can be downloaded from the McKinsey website.

How does all this relate back to the software vendors that I work with on a daily basis? If you are an ISV that deals with lots of data, you have to have a game plan for Big Data. Even if you do not care about it, your customers will be asking for it going forward. It is the same as what happened with the cloud. Three years ago, the question about the cloud was almost non-existent in many domains, and today an ISV can’t really survive without it. How about that as a guiding factor for Big Data?

Personally, I find this very exciting, as Analytics, Data Warehousing and Business Intelligence have been my core domain for more than 20 years. Even my doctoral dissertation from 2004, Evaluation of a Product Platform Strategy for Analytical Application Software, is still relevant and explains the drivers that software vendors should be looking at from a software product platform and software product line perspective. The link will download the dissertation (in English) in PDF format.

Expect to hear more about Big Data from me during 2013 as it will be even more relevant than during 2012.