Sunday, March 25, 2012

OLAP-OLTP-InMemory Convergence

Many vendors started designing and selling OLAP (On-Line Analytical Processing) systems in the late 90s to provide better reporting and interpretation of data. Unlike a plain database or an OLTP (On-Line Transactional Processing) system, the OLAP tools stored pre-aggregated results. Tools like Hyperion Essbase and Cognos PowerPlay were some of the prominent players in the market during the 90s. In the late 1990s SAP introduced SAP Business Warehouse to compete in the OLAP and data warehousing space.

There were three types of OLAP systems:
  1. Multi-dimensional OLAP (M-OLAP): These systems had real multidimensional cubes defined within them, e.g. Cognos PowerPlay, Hyperion Essbase, Microsoft SSAS, SAS/MDDB, Oracle Express, etc.
  2. Relational OLAP (R-OLAP): These systems were mainly modeled on a relational star schema, e.g. SAP BW, MicroStrategy, etc.
  3. Hybrid OLAP (H-OLAP): Most of the M-OLAP systems support this configuration. It uses relational star schemas for the detailed level and multidimensional structures for the aggregated level.
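
To make the difference between the first two types concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the table names, the sample rows and the functions `rolap_total` and `build_cube` are hypothetical, and SQLite stands in for a real warehouse. The R-OLAP style joins and aggregates the star schema at query time, while the M-OLAP style pre-aggregates once and then answers by lookup.

```python
import sqlite3
from collections import defaultdict

# Hypothetical star schema: one fact table and two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, region   TEXT);
    CREATE TABLE fact_sales  (product_id INTEGER, region_id INTEGER, amount REAL);
    INSERT INTO dim_product VALUES (1, 'Drinks'), (2, 'Snacks');
    INSERT INTO dim_region  VALUES (1, 'EMEA'), (2, 'APAC');
    INSERT INTO fact_sales  VALUES (1, 1, 100.0), (1, 2, 50.0), (2, 1, 30.0), (1, 1, 20.0);
""")

JOINED = """
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_region  r ON r.region_id  = f.region_id
"""

def rolap_total(category, region):
    """R-OLAP style: join and aggregate the star schema at query time."""
    row = conn.execute(
        "SELECT COALESCE(SUM(f.amount), 0)" + JOINED
        + "WHERE p.category = ? AND r.region = ?",
        (category, region),
    ).fetchone()
    return row[0]

def build_cube():
    """M-OLAP style: pre-aggregate once, keyed by dimension members."""
    cube = defaultdict(float)
    for category, region, amount in conn.execute(
        "SELECT p.category, r.region, f.amount" + JOINED
    ):
        cube[(category, region)] += amount
        cube[(category, "ALL")] += amount  # roll-up over all regions
        cube[("ALL", region)] += amount    # roll-up over all categories
        cube[("ALL", "ALL")] += amount     # grand total
    return dict(cube)

cube = build_cube()
```

After the cube is built, `cube[("Drinks", "EMEA")]` is a plain dictionary lookup, whereas `rolap_total("Drinks", "EMEA")` scans and joins the tables on every call. A hybrid setup would keep both: the cube for aggregates and the star schema for the detail rows.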

OLAP tools were highly successful because they offered an easier way to implement business requirements, with better results, than writing complex SQL queries against a relational database system. Modelling a cube required defining the KPIs and dimensions up front, and the models were not standardized then. Over the years the data in OLAP systems kept growing, and there was a constant need to improve their performance. SAP introduced a hardware appliance called BIA (Business Intelligence Accelerator) to speed up OLAP processing on the SAP BW system.

However, with the recent explosion of data into terabytes and petabytes, most of the above-mentioned systems cannot scale to process the data and deliver the answers the business expects. This has prompted a few of the leaders in this space to work on in-memory databases, which would be a true convergence of the OLAP and OLTP systems.

SAP introduced SAP HANA in 2011, and Oracle tried to answer with Exadata. However, Exadata could not match the benchmarks published for HANA. This prompted Oracle to revive an acquired product named "TimesTen" as its answer to SAP's HANA. According to SAP, the plan is to move the processing out of the application layer in SAP BW and into the in-memory database layer, as in a conventional database, since the database can process data faster than the application layer. Many SAP customers were confused by this move, especially since SAP had heavily marketed BW's application-level processing capabilities and customers had spent a lot of time and energy building their models on the BW application layer.
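
The idea of moving processing from the application layer into the database layer can be sketched in a few lines of Python. This is my own toy illustration, not SAP's implementation: SQLite is only a stand-in for the database, and the `orders` table and both function names are hypothetical. The first function pulls every row into the application and aggregates there; the second pushes the aggregation down so that only the result rows leave the database.

```python
import sqlite3

# Hypothetical table and sample rows, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", 10.0), ("acme", 15.0), ("globex", 7.5)],
)

def totals_in_application():
    """Fetch every row, then aggregate in application code."""
    totals = {}
    for customer, amount in conn.execute("SELECT customer, amount FROM orders"):
        totals[customer] = totals.get(customer, 0.0) + amount
    return totals

def totals_in_database():
    """Push the aggregation down; only the grouped results are returned."""
    query = "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
    return dict(conn.execute(query).fetchall())
```

Both functions return the same totals. The difference is how much data crosses the boundary between database and application, and that gap grows with the size of the table.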

On the business side, it seems SAP has acquired nearly 200 customers, with an average deal size of USD 600-800K per license, of which 10-20 customers have gone into production in the last 8 months.

Monday, February 20, 2012

Future of Data Integration- Roadmap

Someone said that data in a data warehouse is like an old piece of clothing in your closet: you might not pick it up for the next two years. Many of us have at least a 1 TB pocket hard drive holding lots of data, including music, photos and videos. How often have you deleted old data from your personal hard drive or thrown away an old data CD/DVD? I believe most of us never throw data away, thinking that the old data will be useful some day. How often has that 'someday' come for you? Maybe never, so far.

The same is true of your business. In many cases you are forced to keep historical data for legal compliance and you never use it. What would you do if you wanted to use it? How well organized is your data for searching, filtering, sorting and presenting? Over the last few months I have been looking into the new demands on the data integration space and how they can be projected over this decade. The key areas that need a good data integration system are the following:
  • Big Data
  • Social Media
  • Mobile integration
  • Zero downtime
  • Demand based real time data synchronization
  • Cloud data integration
What does this mean? New and larger volumes of data are being produced for business analytics applications that help business managers make decisions or drive automated systems. Competition among IT products in this space has been severe, and a lot of consolidation happened during the 2000s. Many BI and analytics companies have been bought by software giants, which have been building mammoth products to help customers with business decisions. One of the most interesting advances in this space was the in-memory database launched by SAP AG called "HANA". In-memory systems are supposed to work a hundred times faster than conventional relational databases. The need for such a system arose because many SAP-based reports used to run for hours, which SAP calls "lunch time processing".

In short, data is now captured from every event and action, unlike the earlier manual systems where most of the data lived on the books. The volume of data is exploding everywhere while the time allowed for processing is shrinking (the time itself is constant, but the processing time the business will tolerate is lower). The data needs to be synchronized and made available to all the decision support systems in real time, which means real-time data streaming is required to meet the business demand. The data could reside on cloud systems such as Salesforce.com or Microsoft Dynamics CRM, which are hosted outside the enterprise. For this to be efficient, the amount of data filtered and interchanged should be kept to a minimum so that the interchange time is used as efficiently as possible. Enterprise IT integration software that selects, filters, sorts, compresses, encrypts and processes data faces new challenges in this era. In simple terms, integration software is expected to perform at a higher scale than it used to.
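
As a rough sketch of what "select, filter, sort, compress" looks like before data is shipped to a remote system, here is a small Python example. Everything in it is an assumption for illustration: the record layout, the `updated` field and the function names `prepare_payload` and `read_payload` are hypothetical. Encryption is left out because it needs a third-party library or a transport such as TLS.

```python
import json
import zlib

def prepare_payload(records, wanted_fields, since):
    """Filter, sort, select and compress records before sending them."""
    # Filter: keep only records changed at or after the 'since' marker.
    changed = [rec for rec in records if rec["updated"] >= since]
    # Sort: oldest change first, so the receiver can apply them in order.
    changed.sort(key=lambda rec: rec["updated"])
    # Select: send only the fields the receiver actually needs.
    projected = [{field: rec[field] for field in wanted_fields} for rec in changed]
    # Compress: shrink the serialized payload to reduce interchange time.
    raw = json.dumps(projected, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw)

def read_payload(payload):
    """Reverse of prepare_payload on the receiving side."""
    return json.loads(zlib.decompress(payload).decode("utf-8"))
```

The point of filtering and selecting before compressing is that the cheapest byte to transfer is the one that is never sent.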

Thursday, January 19, 2012

Technology Trends for 2012

I too believe that the top three things for 2012 are going to be the following:
  • Big Data
  • Social Media 
  • Agile and Pervasive BI solution
Of these three, I believe the last one (agile and pervasive BI solutions) has the biggest market share. Some of the revolutions will probably happen in combination with cloud solutions.

HANA has been the magic word from SAP for 2011. However, here is one of the quotes from the Red Bull implementation: "Redbull did not see a big improvement in query performance because they were already using BWA.  Moreover, they found that HANA does not have all the features that BWA gave them. But they think this is ok, because SAP is committed to working with them to put in features in future revisions of HANA." After all the talk about in-memory processing, I did not find this very impressive. (Ref: http://andvijaysays.wordpress.com/2011/11/09/redbull-migrates-bw-to-hana-i-am-suitably-impressed/)


Read more from the analysts here:
No SQL Databases
Business Intelligence 2012
Data Virtualization