Sunday 25 November 2012

Data Warehouse in a Big Data World. What is the use case?

I like the analogy in the IBM Big Data Platform book titled "Harness the Power of Big Data".

It tells the story of days long gone, when miners could easily spot nuggets or veins of gold with the naked eye. This made investing easier: the value of the gold could be seen, so the resources required to extract it could be weighed against that perceived value. Using a Big Data analogy, we can consider this gold to be "high value-per-byte" data.

Assume for a moment that there is more gold nearby, but it is just not visible to the naked eye. Trying to find this gold is more of a gamble and potentially more expensive. This would be "low value-per-byte" data, because of the challenges associated with finding gold that cannot be seen.

With the right equipment, however, it might be possible to economically process lots of dirt and keep the flakes of gold found. These flakes can be taken for processing and combined to make gold bars.

Back to our Big Data analogy...

In this scenario, it would make sense to keep all the dirt we could find (in a Big Data system), so that as new, economical dirt-processing techniques emerge (Big Data analytics on commodity systems) we would have an opportunity to extract the flakes of gold (value / insights) and store them for processing into gold bars (in our Data Warehousing system).

Hadoop is a Big Data batch system that allows users to store all data in its native business object format and get value out of it through massively parallel processing on commodity components.

A Data Warehouse is characterized by "speed-of-thought response time" requirements, where sustainable data with proven value is stored and delivered interactively.
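To make the division of labour concrete, here is a minimal sketch of the idea in Python. It is only an illustration under stated assumptions: the raw log lines, the function names and the `item_sales` table are all made up for this example, the map and reduce steps run in a single process rather than on a Hadoop cluster, and an in-memory SQLite database stands in for a real Data Warehouse.

```python
import sqlite3
from collections import Counter
from functools import reduce

# Hypothetical raw "dirt": log lines kept in their native form, noise included.
RAW_LINES = [
    "2012-11-01 store=leeds item=kettle qty=2",
    "2012-11-01 store=leeds debug heartbeat ok",
    "2012-11-02 store=york item=kettle qty=1",
    "2012-11-02 store=york item=toaster qty=3",
    "malformed line with no fields",
]


def map_line(line):
    """Map step: return Counter({item: qty}) for a sale line, else an empty Counter."""
    fields = dict(part.split("=", 1) for part in line.split() if "=" in part)
    if "item" in fields and "qty" in fields:
        return Counter({fields["item"]: int(fields["qty"])})
    return Counter()


def reduce_counts(left, right):
    """Reduce step: merge two partial totals."""
    return left + right


def extract_flakes(lines):
    """Batch pass over all raw lines (a cluster would spread this over many nodes)."""
    return reduce(reduce_counts, map(map_line, lines), Counter())


def load_warehouse(totals):
    """Store the distilled totals in a small table for interactive queries."""
    conn = sqlite3.connect(":memory:")  # stand-in for a warehouse, an assumption
    conn.execute(
        "CREATE TABLE item_sales (item TEXT PRIMARY KEY, total_qty INTEGER)"
    )
    conn.executemany("INSERT INTO item_sales VALUES (?, ?)", sorted(totals.items()))
    conn.commit()
    return conn


totals = extract_flakes(RAW_LINES)
warehouse = load_warehouse(totals)
rows = warehouse.execute(
    "SELECT item, total_qty FROM item_sales ORDER BY item"
).fetchall()
print(rows)
```

The batch pass touches every raw line, most of which carry little or no value, while the final query only reads the small, proven-value table. That is the same split as keeping the dirt in one system and the gold bars in another.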

It is therefore clear that in a Big Data world, there is value in, and a place for, both Hadoop (Big Data) and Data Warehouse systems.

IBM's Hadoop system is InfoSphere BigInsights. For simplified Big Data analytics, look no further than IBM PureData System for Analytics, powered by Netezza, and InfoSphere Warehouse for your Data Warehousing needs.
