Data Virtualization and the Cloud
Cloud computing is a fact of life. Organizations like the agility, flexibility, and scalability that it provides. They can scale their data storage and compute capacity up and down as and when needed. This provides incredible flexibility for new or dynamic workloads and lets organizations take advantage of new data types or new business opportunities without making huge commitments on infrastructure. Need a 200-node Hadoop cluster for analytics on some new data? Sure, no problem. Need some extra capacity for end-of-year processing? Here you go…
It’s no wonder that the center of gravity for both data and compute capacity is moving from the traditional on-premises data center to the Cloud as companies take advantage of this inherent flexibility. However – and there is always a “however” when something seems this good – the applications and data sources that move to the Cloud are still part of a larger ecosystem and still need to be integrated with other systems and processes. We can’t create isolated, albeit incredibly agile and flexible, data and application silos – that is a retrograde step. Moving your analytics data and processing to the Cloud (e.g. Hadoop on Amazon AWS or Azure) runs the risk of creating a ‘data science’ silo, which doesn’t maximize the value of that data to the organization. The data and applications that are migrated to the Cloud must be integrated into a larger data ecosystem – which is typically both on-premises and in the Cloud – and this integration has to be a primary component of any Cloud modernization and migration strategy.
A data integration platform that can straddle the divide
Because an organization’s data and applications will straddle the divide between the Cloud, the Edge, and the on-premises data center, you need a data integration platform that similarly straddles that divide. That is, a data integration platform that supports a hybrid deployment, partly on-premises and partly in the Cloud. Additionally, this platform needs to be intelligent, so that it moves the processing to where the data resides and not the other way around. The days of copying masses of data from one system to another for processing are long gone: the volume of data involved in modern analytics is too large for this old style of thinking, and moving data in and out of the Cloud for processing just exacerbates the problem. The data integration platform – in its hybrid configuration – needs its on-premises and in-Cloud parts working in tandem, each processing the data where it resides.
This is where Denodo’s Data Virtualization Platform is in a class of its own. Obviously, it can be deployed on-premises – in the data center – or at the edge, but it can also be deployed in the Cloud to create a hybrid data integration platform which is intelligent enough to push the processing as close as possible to the data that is being integrated, and only move the minimum amount of data back to the consuming application or tool. Denodo’s unique capabilities make it ideal for both hybrid and Cloud deployments where moving large volumes of data can accumulate significant Cloud egress costs.
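The pushdown idea described above can be illustrated with a minimal sketch. This is not Denodo code or its API – just a toy federated query in Python, with hypothetical in-memory “sources” standing in for an on-premises system and a Cloud system. The point is that each source filters and aggregates locally, so only small partial results cross the network (and incur egress costs), never the raw rows:

```python
# Toy sketch of query pushdown in a federated query.
# The source names and functions here are illustrative assumptions, not Denodo APIs.

# Two "remote" sources: one on-premises, one in the Cloud.
ON_PREM_ORDERS = [
    {"region": "EMEA", "amount": 120},
    {"region": "APAC", "amount": 75},
    {"region": "EMEA", "amount": 200},
]
CLOUD_ORDERS = [
    {"region": "EMEA", "amount": 50},
    {"region": "AMER", "amount": 300},
]

def pushed_down_total(source, region):
    """Runs at the source: filter and aggregate locally.
    Only this single partial sum crosses the network, not the raw rows."""
    return sum(row["amount"] for row in source if row["region"] == region)

def federated_total(region):
    """The virtualization layer merges the partial aggregates from each source."""
    partials = [pushed_down_total(s, region) for s in (ON_PREM_ORDERS, CLOUD_ORDERS)]
    return sum(partials)

print(federated_total("EMEA"))  # 370 – two partial sums moved, five rows stayed put
```

Without pushdown, the integration layer would have to pull all five rows before filtering; with it, each source returns one number. At real data volumes, that difference is exactly where the egress-cost savings mentioned above come from.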
The Denodo Platform is available via subscription in the Amazon AWS Marketplace – with availability in the Azure Marketplace coming very soon. The subscription model (hourly or annual) gives users the flexibility to scale up and down as necessary in line with their data sources. You can try out the Denodo Platform for AWS with a 30-day free trial…it’s worth it to experience the flexibility and agility that it can add to your data integration infrastructure.
(Note that you can also use the Denodo Platform on any Cloud infrastructure with a ‘bring your own license’ (BYOL) model.)