Friday, September 27, 2013

Databases in the 21st Century: Can the CIO mandate a single database for the enterprise?

Can the CIO mandate and standardize on a single database vendor within an organization?

Historically, a database was a standard way of storing data in rows with a predefined schema. Most enterprises were tied to IBM DB2/UDB, Sybase, Informix, Oracle or Microsoft SQL Server for their database requirements, along with the other software they needed. In those days the requirements for databases were simple: the basic expectation was to store data and support a 2-tier or 3-tier business application, either for recording transactions or for reporting.

During our college days, most of us learned the lineage of DBMS, RDBMS and OODBMS, along with the databases built for really large volumes of data, the data warehouses. Since then we have seen other kinds of DBMS, such as document stores and other NoSQL databases. More recently, some of us will have heard about cloud databases, columnar databases, device databases and in-memory databases. So where does that leave us? Can we still depend on only one database vendor for the enterprise, say IBM, Microsoft, SAP (which acquired Sybase) or Oracle?

The high-level classification of NoSQL databases, with relational databases included for comparison, is as follows:

Data Model           Performance  Scalability      Flexibility  Complexity  Functionality
Relational Database  variable     variable         low          moderate    relational algebra
Key–value Stores     high         high             high         none        variable (none)
Graph Database       variable     variable         high         high        graph theory
Document Store       high         variable (high)  high         low         variable (low)
Column Store         high         high             moderate     low         minimal
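
To make the difference between these data models a little more concrete, here is a minimal sketch that stores the same record in three ways. It is only an illustration: the customer record is made up, SQLite stands in for a relational database, and plain Python dictionaries and lists stand in for a key-value store and a document store. No real NoSQL product is used.

```python
import json
import sqlite3

# Hypothetical record, used only for illustration.
customer = {
    "id": 1,
    "name": "Acme",
    "orders": [{"sku": "A-100", "qty": 2}, {"sku": "B-200", "qty": 1}],
}

# Relational: fixed schema, one table per entity, related rows combined with a join.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, sku TEXT, qty INTEGER)")
conn.execute("INSERT INTO customer VALUES (?, ?)", (customer["id"], customer["name"]))
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(customer["id"], o["sku"], o["qty"]) for o in customer["orders"]],
)
relational_total = conn.execute(
    "SELECT SUM(o.qty) FROM customer c JOIN orders o ON o.customer_id = c.id "
    "WHERE c.name = ?",
    ("Acme",),
).fetchone()[0]

# Key-value: the value is an opaque blob; it can only be fetched by its key
# and must be decoded by the application.
kv_store = {"customer:1": json.dumps(customer)}
kv_total = sum(o["qty"] for o in json.loads(kv_store["customer:1"])["orders"])

# Document: the whole nested record is kept together and can be filtered
# on fields inside the document, with no schema declared up front.
documents = [customer]
matches = [d for d in documents if d["name"] == "Acme"]
document_total = sum(o["qty"] for d in matches for o in d["orders"])

print(relational_total, kv_total, document_total)
```

All three approaches answer the same question (total quantity ordered by one customer); what differs is how much structure the store itself knows about, which is what the Flexibility and Functionality columns describe.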

For a list of the databases currently available on various hardware and systems from various vendors, see http://en.wikipedia.org/wiki/NoSQL


The answer is a simple 'NO'. IT systems have expanded in the 21st century and have created many more use cases for databases, data warehouses and data storage optimization.

Reference
http://www.computerworld.com/s/article/9246543/IBM_buys_NoSQL_cloud_provider_Cloudant

Data connectivity to third party applications

I often get this question from my sales team: "How can I connect to application XYZ?" Such requests fall into the long tail of the connectivity problem. There are several ways of connecting to third-party applications, and they fall into the following categories:


  1. Standard 2-tier or 3-tier application: The majority of custom business applications deployed at an enterprise customer site have a database behind them. For a custom application that does not expose any other standard application interface, connecting directly to the database using either our native or ODBC drivers is typically the easiest data integration point.
  2. Cloud-hosted application: Many of the new cloud-based application vendors provide a standard Web Services/REST interface for connecting to the application.
  3. On-premise or cloud application exposing CLIs to integrate: Use the standard CLI functions exposed by the application and write a custom program to integrate with the flat files generated as the CLI output.
  4. No database connection, web services or CLI interface, but a programming API is exposed: If none of the above is possible, a dedicated connector has to be built against the programming API (C, C++, Java) that the third-party application exposes.
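
As a rough illustration of the third category, the sketch below runs a CLI export and parses the delimited output into records. The function names `run_cli_export` and `parse_flat_file` are hypothetical, and because no real vendor CLI is assumed here, a Python one-liner that prints a small CSV stands in for the application's export command.

```python
import csv
import io
import subprocess
import sys


def run_cli_export(command):
    """Run a CLI export command and return its standard output as text."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


def parse_flat_file(text, delimiter=","):
    """Parse delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# Stand-in for a vendor CLI (assumption): prints a header row and two records.
FAKE_CLI = [
    sys.executable,
    "-c",
    "print('id,name'); print('1,Acme'); print('2,Globex')",
]

rows = parse_flat_file(run_cli_export(FAKE_CLI))
for row in rows:
    print(row["id"], row["name"])
```

With a real application, `FAKE_CLI` would be replaced by the vendor's documented export command, and the delimiter and header handling adjusted to match the flat-file layout it produces.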

In addition to the basic connectivity decisions described above, we need to look at the customer's data integration use cases in each of the following areas before defining the appropriate connectivity solution:
      a) Volume of data and scalability (partitioning, bulk interfaces, CDC, etc.)
      b) Velocity (performance: bulk or real time)
      c) Security (authentication, authorization, data staging concerns, etc.)
      d) Variety (mapping application data types to your data types, or transforming specially encoded data such as JSON or EDIFACT)
      e) Validity (history snapshots or real time)
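
The "Variety" point above can be sketched as a small mapping step that takes a JSON record from an application and converts each field to the data type the target expects. The field names, the target column names and the `FIELD_MAP` table are all made up for illustration; a real integration would take them from the application's documentation.

```python
import json
from datetime import date
from decimal import Decimal

# Hypothetical mapping: source field -> (target column, converter to target type).
FIELD_MAP = {
    "orderId": ("order_id", int),
    "amount": ("amount", Decimal),
    "orderDate": ("order_date", date.fromisoformat),
}


def map_record(raw_json):
    """Decode one JSON record and convert its fields to the target types."""
    source = json.loads(raw_json)
    return {
        target: convert(source[field])
        for field, (target, convert) in FIELD_MAP.items()
    }


row = map_record('{"orderId": "42", "amount": "19.99", "orderDate": "2013-09-27"}')
print(row)
```

Keeping the mapping in a table rather than in code makes it easier to review with the customer and to extend when the application adds new fields.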