Cloud Migration, Data Services, and Effective Data Virtualization
A plan for migrating organizational data to the cloud is often motivated by three key factors:
- Economics: In many cases, the cost of “renting” cloud storage and computing power is much less than the cost of purchasing (and then depreciating) systems on a recurring 3-year cycle. In addition to the tax benefits of converting capital expenditures into operating expenses, cloud service providers offer virtualized resources whose performance keeps pace with current technology.
- Reduced operational burden: Relying on cloud resources implies that the cloud provider will manage the operations and upkeep of the platforms, reducing your in-house operational burden.
- Broad scope of capabilities: Cloud host providers are rapidly expanding their palette of services beyond core capabilities such as computing power and storage to include database management, data streaming, data integration, data warehousing, and analytics, among many other tools and technologies.
Cloud migration offers another valuable capability that may not always be apparent from the get-go: with proper planning and implementation, it enables organizations to identify critical enterprise information resources and make them available for sharing and reuse.
Let’s consider a simple example: different business functions at a retail company manage their own copies of customer data. The marketing department maintains a prospect database, the sales group tracks existing customers and their purchases, the finance department oversees customer credit accounts, and the fulfillment department oversees the packaging and delivery of products to the customers’ doorsteps. While each group has its own reasons for maintaining its copy of the customer data, the entire organization would benefit if all that information could be shared across all of those groups. A plan for customer data integration, followed by migration of the consolidated customer data set to the cloud, would enable all of the groups to produce and consume customer data from the same, presumably synchronized, resource.
Conceptually this sounds simple. But to effect this transition to a unified customer view managed in the cloud, you need to accomplish three things.
- First, you need a unified information architecture for managing the unified view of customer data.
- Second, you need to have a program for customer data integration to populate your unified information architecture.
- Third, you need a framework for making the customer data available and accessible to the different downstream consumers.
None of these items is simple, but I am going to suggest that there are ways to evolve the cloud-based repository to incrementally enable a customer repository that makes shared customer data available for reuse.
This approach is driven by a data consumer perspective that considers what information would be valuable to those consumers and how the data can be properly packaged and presented to those users. This suggests that, contrary to your first intuition to focus on the data consolidation, the first step is to develop a data services architecture that defines application programming interfaces (APIs) for data consumers to make requests for data access as well as directives for creating new customer records and updating existing records. In turn, develop data services that implement the APIs and simplify data consumption.
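To make the idea of a consumer-facing API concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Customer` fields, the `CustomerDataService` contract, and the in-memory implementation are hypothetical names, not part of any particular product. The point is only that consumers program against the three operations mentioned above (request, create, update) and never against the underlying storage.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class Customer:
    """Hypothetical consumer-facing customer model (fields are assumptions)."""
    customer_id: str
    name: str
    email: Optional[str] = None


class CustomerDataService(Protocol):
    """The contract that data consumers code against."""

    def get_customer(self, customer_id: str) -> Optional[Customer]: ...
    def create_customer(self, customer: Customer) -> Customer: ...
    def update_customer(self, customer_id: str, **changes: object) -> Customer: ...


class InMemoryCustomerService:
    """Stand-in implementation; a real one would sit in front of cloud storage."""

    def __init__(self) -> None:
        self._records: Dict[str, Customer] = {}

    def get_customer(self, customer_id: str) -> Optional[Customer]:
        return self._records.get(customer_id)

    def create_customer(self, customer: Customer) -> Customer:
        if customer.customer_id in self._records:
            raise ValueError(f"customer {customer.customer_id} already exists")
        self._records[customer.customer_id] = customer
        return customer

    def update_customer(self, customer_id: str, **changes: object) -> Customer:
        current = self._records.get(customer_id)
        if current is None:
            raise KeyError(customer_id)
        # Build a new record with the changed fields; the old one is untouched.
        updated = replace(current, **changes)
        self._records[customer_id] = updated
        return updated
```

A consumer in marketing or finance would call `create_customer`, `update_customer`, and `get_customer` without knowing where the record lives, which is what leaves room to change the back end later.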
Once you think about it, the reason for doing this becomes obvious: facilitating access to what appears to be a single shared repository establishes the benefits of data sharing and reuse. That, in turn, will inspire increased demand and, consequently, increased investment in the longer-term strategy of customer data integration.
This is where the concept of data services dovetails with data virtualization. In essence, any data service “virtualizes” access to data, as it creates a standardized interface as a buffer to data access. However, using data virtualization technology allows you to temporarily bypass the data integration step while providing the perception of delivering consolidated customer data. You can select the collections of customer data sets from their corresponding sources and move them to a cloud storage environment. The data virtualization technology can be configured to transparently transform API/service requests into the corresponding access requests, invoke those requests, accumulate the partial result sets from the different source data sets, and render a packaged combined result set using the API’s data model.
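The request flow described above (translate the request, invoke it against each source, accumulate the partial results, render one combined result) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any specific data virtualization product works: the source data sets, their field names, the `SourceAdapter` and `VirtualCustomerView` classes, and the "first source wins" conflict rule are all hypothetical.

```python
from typing import Any, Dict, List, Optional

# Hypothetical source data sets, each keeping its own key and field names.
MARKETING = [{"cust_id": "c1", "full_name": "Ada Lopez", "segment": "prospect"}]
SALES = [{"customer": "c1", "last_purchase": "2018-04-30"}]
FINANCE = [{"acct_holder": "c1", "credit_limit": 5000}]


class SourceAdapter:
    """Translates an API-level lookup into an access request on one source."""

    def __init__(self, rows: List[Dict[str, Any]], key_field: str,
                 field_map: Dict[str, str]) -> None:
        self.rows = rows
        self.key_field = key_field
        self.field_map = field_map  # source field name -> API field name

    def fetch(self, customer_id: str) -> Dict[str, Any]:
        for row in self.rows:
            if row.get(self.key_field) == customer_id:
                # Rename source fields into the API's data model.
                return {api: row[src] for src, api in self.field_map.items()
                        if src in row}
        return {}


class VirtualCustomerView:
    """Presents several unconsolidated sources as one customer record."""

    def __init__(self, adapters: List[SourceAdapter]) -> None:
        self.adapters = adapters

    def get_customer(self, customer_id: str) -> Optional[Dict[str, Any]]:
        combined: Dict[str, Any] = {}
        for adapter in self.adapters:
            partial = adapter.fetch(customer_id)
            for field, value in partial.items():
                # Assumed conflict rule: the first source to supply a field wins.
                combined.setdefault(field, value)
        if not combined:
            return None
        combined["customer_id"] = customer_id
        return combined


view = VirtualCustomerView([
    SourceAdapter(MARKETING, "cust_id", {"full_name": "name", "segment": "segment"}),
    SourceAdapter(SALES, "customer", {"last_purchase": "last_purchase"}),
    SourceAdapter(FINANCE, "acct_holder", {"credit_limit": "credit_limit"}),
])
```

Calling `view.get_customer("c1")` returns a single dictionary that merges the marketing, sales, and finance fields, even though no consolidated customer table exists yet.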
Data virtualization provides a layer of abstraction that enables data consumer requests to be serviced before the underlying data has been consolidated. And because the service interfaces stay consistent while the customer data integration and consolidation effort is underway, the development team can use agile development methods to continuously improve the shared cloud-based customer data repository without significantly disrupting the users.
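One way to picture this incremental evolution is a service that answers from the consolidated repository when a record has already been integrated and falls back to the virtualized lookup when it has not. The sketch below is an assumption-laden illustration: `CustomerService`, `migrate`, and the dictionary-based stores are hypothetical names chosen for brevity.

```python
from typing import Callable, Dict, Optional

Record = Dict[str, object]
Lookup = Callable[[str], Optional[Record]]


class CustomerService:
    """Keeps one stable interface while records move into the consolidated store."""

    def __init__(self, consolidated: Dict[str, Record], virtual_lookup: Lookup) -> None:
        self._consolidated = consolidated
        self._virtual_lookup = virtual_lookup

    def get_customer(self, customer_id: str) -> Optional[Record]:
        record = self._consolidated.get(customer_id)
        if record is not None:
            return record
        # Not integrated yet: answer through the virtualized view instead.
        return self._virtual_lookup(customer_id)

    def migrate(self, customer_id: str) -> bool:
        """Copy one record from the virtualized view into the consolidated store."""
        record = self._virtual_lookup(customer_id)
        if record is None:
            return False
        self._consolidated[customer_id] = dict(record)
        return True
```

Consumers keep calling `get_customer` throughout; whether the answer comes from the virtualized sources or the consolidated repository is a back-end detail that can change one record, or one sprint, at a time.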
By David Loshin, May 16, 2018