Faster-than-the-hare Proliferation of Technologies Spawns Data Silos and Creates Headaches for IT and Business Alike. What’s the panacea?
Technology innovation is great: it has driven down costs, reduced the size and complexity of gadgets, increased our productivity, and made the way we work more flexible. But have you noticed that this innovation is caught in a vicious vortex? As soon as a new technology comes along, it solves an existing problem but, in the process, creates a new set of headaches. Take big data, for example. The ability to store, manage, and process huge volumes of data at lower cost and lightning speed is a godsend. Earlier, only intelligence agencies with a bottomless budget, justified in the name of stopping terrorist attacks, could afford enough hardware to fill a castle. Now, thanks to Hadoop, Spark, and other big data technologies, it has become economically feasible for commercial organizations to afford the same luxury! So what is the downside? Big data has become yet another siloed data repository, leaving business users scrambling to manually integrate their data across big data systems, on-premises databases, and cloud applications. Sounds like the life of Bill Murray in the movie Groundhog Day!
Borrowing a Page from Server Virtualization
Remember the days when we built software that had to work across multiple operating systems such as Windows, UNIX, Solaris, and Mac? To port and test the code on all of them, we used separate hardware: x86-based machines for Windows, Apple for Mac, Sun for Solaris, IBM mainframe for UNIX, and so on. Then along came VMware, which demonstrated how server virtualization lets us run different operating systems in different virtual partitions, all residing on the same hardware. Thanks to this technology, I can run Windows on my Mac!
Data today is where operating systems were 10 years ago: locked up in different silos across big data, cloud, and on-premises systems, and totally disconnected. Business users are burdened with reconciling the disparate data. IT has responded by manually integrating the data using ETL. But that takes time, usually months! That does not make the business happy. See the vortex?
Just as server virtualization did for operating systems, data virtualization liberates business users from the fetters of data silos. It abstracts data access across any number of sources, irrespective of the underlying data structures and formats. Best of all, it usually accomplishes this feat in days. That’s technology innovation!
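To make the idea concrete, here is a minimal sketch in Python. It is a toy analogy, not how any data virtualization product is implemented: two in-memory SQLite databases stand in for separate silos, and a temporary view plays the role of the virtual layer. The names `warehouse`, `lake`, `customers`, `click_events`, and `customer_activity` are all hypothetical. A real virtualization layer would federate queries across different engines, which a single SQLite connection does not do.

```python
import sqlite3


def build_virtual_layer():
    """Return a connection that exposes one view over two separate 'silos'."""
    # One connection plays the role of the virtualization layer.
    conn = sqlite3.connect(":memory:")

    # Each ATTACH creates an independent in-memory database (hypothetical silos).
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
    conn.execute("ATTACH DATABASE ':memory:' AS lake")

    conn.execute(
        "CREATE TABLE warehouse.customers (id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE lake.click_events (customer_id INTEGER, clicks INTEGER)"
    )
    conn.executemany(
        "INSERT INTO warehouse.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO lake.click_events VALUES (?, ?)",
        [(1, 10), (1, 5), (2, 7)],
    )

    # A TEMP view may reference tables in any attached database.
    # No rows are copied; the join runs when the view is queried.
    conn.execute(
        """
        CREATE TEMP VIEW customer_activity AS
        SELECT c.name AS name, SUM(e.clicks) AS total_clicks
        FROM warehouse.customers AS c
        JOIN lake.click_events AS e ON e.customer_id = c.id
        GROUP BY c.name
        """
    )
    return conn


conn = build_virtual_layer()
# The business user sees a single table-like object and plain SQL.
rows = conn.execute(
    "SELECT name, total_clicks FROM customer_activity ORDER BY name"
).fetchall()
print(rows)  # [('Acme', 15), ('Globex', 7)]
```

The point of the sketch is the last query: the person asking the question never needs to know that the customer names and the click counts live in different places.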
Taking it a Step Further – Big Data Virtualization
But data virtualization is not new; rather, it is becoming a trend with the evolution of big data. As companies use big data platforms for cold storage and data lakes, data virtualization has become the go-to technology for bridging data from the big world into the small world, figuratively speaking. It’s all about making business users productive, right? In fact, one of my customers, VHA, is doing exactly that. They adopted Hadoop for their data lake, but the business users complained that they had to learn Pig and Hive to access the data. VHA plans to use data virtualization to abstract Hadoop and let its business users access the data with familiar SQL.
In another use case, companies are using Hadoop for low-cost storage, while keeping hot data in data warehouses, and using data virtualization to bridge across the systems. Who is the best authority on this subject? Believe it or not, we actually wrote the (cook)book on this.
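The hot and cold pattern can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the cutoff date, the sample rows, and the names `HOT_CUTOFF`, `WAREHOUSE_ORDERS`, `HADOOP_ORDERS`, and `virtual_orders` are hypothetical, and plain Python lists stand in for the warehouse and for Hadoop.

```python
from datetime import date

# Hypothetical policy: rows on or after this date live in the warehouse (hot),
# older rows live in low-cost Hadoop storage (cold).
HOT_CUTOFF = date(2015, 7, 1)

# Stand-ins for the two systems; real sources would be queried remotely.
WAREHOUSE_ORDERS = [
    {"order_id": 3, "day": date(2015, 8, 14), "amount": 120.0},
    {"order_id": 4, "day": date(2015, 9, 2), "amount": 75.5},
]
HADOOP_ORDERS = [
    {"order_id": 1, "day": date(2014, 11, 30), "amount": 40.0},
    {"order_id": 2, "day": date(2015, 3, 9), "amount": 310.0},
]


def virtual_orders(start, end):
    """Yield orders with start <= day <= end, reading only the tiers needed."""
    tiers = []
    if start < HOT_CUTOFF:
        tiers.append(HADOOP_ORDERS)      # range reaches into cold data
    if end >= HOT_CUTOFF:
        tiers.append(WAREHOUSE_ORDERS)   # range reaches into hot data
    for tier in tiers:
        for row in tier:
            if start <= row["day"] <= end:
                yield row


# A recent-only query touches just the warehouse.
recent = [r["order_id"] for r in virtual_orders(date(2015, 8, 1), date(2015, 12, 31))]
# A query that spans the cutoff is answered from both tiers.
spanning = [r["order_id"] for r in virtual_orders(date(2015, 1, 1), date(2015, 8, 31))]
print(recent)    # [3, 4]
print(spanning)  # [2, 3]
```

The caller asks one question against one logical set of orders; deciding which system to read from is the bridging layer's job.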
Drum Roll Please!
It’s for reasons like these that we won not one but two awards. Last month, we were recognized by Database Trends and Applications as one of the Big Data 50 Companies Driving Innovation. Last week, we won the 2015 Ventana Research Technology Innovation Award for Information Management. How cool is that? In fact, Ventana will walk through why they selected us in a webinar on Friday, October 16, at 10:30 a.m. Pacific. Register for the webinar here to find out why we were chosen.