Data Virtualization is a Revenue Generator
Many organizations have made it a high priority to implement a data virtualization solution. This makes sense, since traditional data warehouses can no longer keep pace with the explosive growth in data volumes and user queries. The logical data warehouse (LDW), which uses data virtualization to minimize data duplication, is the most obvious choice. By delivering data virtually instead of physically, companies can access new information much more quickly and can anticipate and respond immediately to changes. Only then can we truly speak of “agile BI.”
A Justifiable Question
Data virtualization platforms such as the Denodo Platform make it possible to integrate data sources into one uniform information model without duplicating data, regardless of location or format. The connectivity, ease of access, and performance optimization that such a platform offers enable companies to migrate to a logical data warehouse architecture with exceptional speed. In actual practice, data virtualization often exceeds expectations, but many organizations hesitate to invest in new data integration software. Such organizations usually ask me the same question:
How can I justify the investment?
In this post, I’ll explain how to put together a business case in four steps. This not only provides a framework for comparing data virtualization alternatives, but also enables you to demonstrate at an early stage the revenue that your organization stands to gain by implementing data virtualization.
A Business Case in Four Steps
Your case should demonstrate that the revenue will exceed the costs, and to do that, you must compare all applicable costs against the expected revenue over time. You will have to answer these four questions for each considered solution:
- What costs will we incur by implementing the new platform?
- What costs are incurred by the current process?
- What are the potential savings of the new platform, compared to the current process?
- What opportunities does the new platform offer, and of what value are they?
Step 1 – Implementation Costs
In this step, determine the costs of implementing the data virtualization platform and keeping it running. These costs will cover licenses, infrastructure, and personnel.
Licensing fees. The amount and type of licensing fees differ for each product and often depend on the number of servers (across the development, test, acceptance, and production environments, including production clusters) and CPU cores. Work out beforehand exactly which licenses your situation requires. A proof-of-concept (PoC) based on realistic use cases may help with this.
Costs related to infrastructure. Take all servers into account that must be acquired for the data virtualization platform, in addition to those in your existing infrastructure. Also remember to include database servers that might be required by the platform for data caching.
Personnel costs. The costs for personnel to set up the platform depend markedly on the complexity of the platform, the available internal knowledge, and the required external expertise. Best practices are available for many platforms, to give you a proper indication of the required effort.
Personnel costs vary widely between the various data virtualization products. Comprehensive and mature platforms with ample, standard functionality, like the Denodo Platform, might be more expensive to acquire, but they definitely pay for themselves in terms of the simplicity of installation and configuration. Experience with ETL and SQL is sufficient to get started with such a platform.
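To make Step 1 concrete, here is a minimal Python sketch of how the three cost categories could be added up over a planning horizon. The `CostItem` class, the `total_cost` function, and every figure in it are hypothetical placeholders of my own, not vendor prices; replace them with the numbers from your own quotes and PoC.

```python
from dataclasses import dataclass


@dataclass
class CostItem:
    name: str
    one_off: float  # paid once, at implementation
    annual: float   # recurring, per year


def total_cost(items, years):
    """Total cost over `years`: each one-off cost plus its recurring cost per year."""
    return sum(item.one_off + item.annual * years for item in items)


# Hypothetical placeholder figures, for illustration only.
implementation = [
    CostItem("licenses", one_off=0, annual=100_000),
    CostItem("infrastructure", one_off=40_000, annual=10_000),
    CostItem("personnel", one_off=60_000, annual=30_000),
]

three_year_cost = total_cost(implementation, years=3)
print(three_year_cost)
```

Splitting each item into a one-off and an annual part keeps the platform comparison fair, since products with a higher purchase price may have lower running costs, and vice versa.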
Step 2 – Costs Associated with the Current Process
In this step, identify the costs for the processes that you currently use to unlock data, which you will aim to replace with data virtualization. An obvious example is the traditional BI process, and the chain that leads from the source system to the report. However, you can also analyze the supply of information to external parties. What does it cost to manage your processes and implement changes?
It is important to gain insight into the total cost of ownership (TCO) for the entire process, from analysis, development, testing, and acceptance to deployment and management. This baseline measurement is an essential step, because it shows where you may be able to cut costs. It can also come as a shock, because you suddenly see what you currently spend to unlock information.
Step 3 – The Savings Achieved with Data Virtualization
This is where it gets interesting: What will your considered data virtualization platform actually provide to you? To find out, you’ll need to know how much more efficiently you’ll work with the new platform. But how do you find that out?
First, you must think in detail about what the future work processes will look like when you apply data virtualization. Which data streams have the potential to be delivered in real time? Also, it would be an illusion to think that you could soon replace everything with data virtualization. You must therefore get a clear picture of the near future.
- What percentage of the data streams can be virtualized?
- Of those streams that can be virtualized, what are the potential savings in terms of TCO?
- Of those streams that cannot be virtualized, are there sub-steps in some of the processes where data virtualization can, in fact, provide assistance and added speed? And if so, what does this mean for the savings on these streams in terms of TCO?
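The three questions above can be turned into a simple calculation. The sketch below is only an illustration under my own assumptions: the `DataStream` class, the stream names, the TCO figures, and the saving fractions are all hypothetical. It assumes you have an annual TCO per data stream from Step 2 and an estimated fraction of that TCO that the platform would save.

```python
from dataclasses import dataclass


@dataclass
class DataStream:
    name: str
    current_tco: float         # annual TCO today, from the Step 2 baseline
    fully_virtualizable: bool  # can the whole stream be virtualized?
    saving_fraction: float     # estimated share of TCO saved, 0.0 to 1.0


def savings_summary(streams):
    """Return the share of streams that can be virtualized and the total annual savings."""
    virtualizable = sum(1 for s in streams if s.fully_virtualizable)
    annual_savings = sum(s.current_tco * s.saving_fraction for s in streams)
    return {
        "share_virtualizable": virtualizable / len(streams),
        "annual_savings": annual_savings,
    }


# Hypothetical streams and estimates, for illustration only.
streams = [
    DataStream("sales reporting", 200_000, True, 0.5),
    DataStream("finance consolidation", 100_000, False, 0.25),  # sub-steps only
    DataStream("external data delivery", 100_000, True, 0.5),
    DataStream("regulatory archive", 100_000, False, 0.0),      # no benefit expected
]

summary = savings_summary(streams)
print(summary)
```

Streams that cannot be fully virtualized still carry a saving fraction, which covers the case where data virtualization only speeds up some sub-steps.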
Once you’ve conducted this analysis, you’ll have a nuanced picture of the savings that each considered data virtualization platform can deliver. This applies to analysis and development, but also to maintenance, management, testing, and acceptance. The savings can be substantial, since the process of unlocking data through data virtualization is a much shorter and simpler one than in a traditional architecture where data is duplicated several times.
The best way to conduct the analysis is to draw the processes and pinpoint the savings per process step. Think realistically about how you plan on working with the relevant platform, which functionality you’ll be using, and what knowledge you’ll need for it. Only then will you be able to properly estimate the actual savings that the platform will afford you. Employ best practices or inquire about the historical figures of companies that are already working with the data virtualization platform. A PoC is a very good way to substantiate and verify the results of your analysis.
If you do it correctly, you should notice major differences in the possible savings between platforms. The more functionality a platform offers for your specific situation, the greater the savings on your current processes. In my opinion, usability also plays a major role here, since it obviates the need for specialized knowledge, which is costly. The more accessible the platform and the broader the functionality, the more you stand to save.
Step 4 – New Opportunities with Data Virtualization
Often, a business case already comes out positive after simply looking at the potential savings. That is great, but it is obviously at least as interesting to look at new opportunities resulting from data virtualization. These are opportunities that you can’t capitalize on right now, but ones that your organization could benefit from if you had more information available, faster. Why look into these? Simply because some data sources (think big data), which you could not analyze in the past, will soon become accessible. Or because now you can quickly link external information to your own data. Or it might be because you can introduce changes to your data warehouse within a matter of hours instead of weeks or months.
This is where the greatest benefits can be reaped from data virtualization. If it becomes much simpler to unlock new information fast, perhaps in real time, it leads to an immediate competitive advantage or even to entirely new business models. We can’t rely on a ready-to-use recipe for this, because the opportunities differ per company. My only advice is to always draw up the business case with representatives from the business. Take the time to properly inform people about the data virtualization concept and to think creatively about the potential opportunities. Explore all the opportunities and think outside the box! Think about what an opportunity can mean for your organization and quantify it in monetary terms. Once you’ve done that, you’ll have all the building blocks for your business case.
Pros and Cons
The thinking work is done, and now it’s just a matter of adding and subtracting. The first version of your business case is ready.
In the example below, I compared the investment for the implementation of a ‘medium-scope’ data virtualization platform (Denodo, in this case) to a conservative estimate of the costs for a traditional data warehouse. Converted to a multiannual ROI overview, the comparison shows that, even with a conservative estimate, the investment has already been recovered in the second year. This holds even though only 25% of the savings and revenue are counted in the first year.
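For readers who want to run the adding and subtracting themselves, here is a minimal Python sketch of a multiannual overview with a first-year ramp-up. It is not a reproduction of my figures: the `payback_year` function and all amounts are hypothetical. The only element taken from the text is the assumption that just 25% of the savings and revenue count in the first year.

```python
def payback_year(initial_investment, annual_cost, annual_benefit, ramp_up, years):
    """Return (payback, history).

    payback is the first year in which the cumulative net result is
    non-negative, or None if that does not happen within `years`.
    history lists the cumulative net result at the end of each year.
    ramp_up maps a year number to the fraction of the benefit counted
    in that year; years that are not listed count in full.
    """
    cumulative = -initial_investment
    history = []
    payback = None
    for year in range(1, years + 1):
        factor = ramp_up.get(year, 1.0)
        cumulative += annual_benefit * factor - annual_cost
        history.append(cumulative)
        if payback is None and cumulative >= 0:
            payback = year
    return payback, history


# Hypothetical amounts, for illustration only.
payback, history = payback_year(
    initial_investment=100_000,  # one-off implementation costs (Step 1)
    annual_cost=140_000,         # recurring platform costs (Step 1)
    annual_benefit=400_000,      # savings (Step 3) plus new opportunities (Step 4)
    ramp_up={1: 0.25},           # only 25% of the benefit counts in year 1
    years=3,
)
print(payback, history)
```

Keeping the ramp-up as a separate parameter makes it easy to test how sensitive the payback period is to a slower start.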
This example is not a rare exception. Actual practice shows that organizations often recover the investments in data virtualization over a very short period.
That’s Only the Start
You’ve now drawn up the first version of your business case, or perhaps multiple versions for different alternatives. This gives you the information to choose one or more platforms for further investigation in a PoC. In this PoC, you can verify and substantiate the assumptions from your business case. You can supplement and fine-tune the business case based on the results of the PoC. This will give you all the required information to make a well-considered decision to invest in data virtualization.
The work doesn’t stop after this decision, either. After implementation, the business case remains a tool for monitoring and adjusting the process, so you can capitalize on the intended savings and revenue in actual practice.
This blog was penned by Mattijs Voet, Senior Consultant, Kadenza.
Published September 20, 2017.