How Green Is Your Data?

It seems like a strange question, but consider how much energy is consumed by blockchain, and in particular, Bitcoin. According to recent reports by the Cambridge Centre for Alternative Finance, Bitcoin consumed about 0.55% of the world’s electricity production in 2021. That’s approximately as much electricity as Malaysia consumed, and about seven times more than Google consumed in the same year.

IDC stated in its Revelations in the Global DataSphere report, July 2021, that “Only 10% of the Global DataSphere is unique in any given year, the rest is replicated/consumed.” A Carnegie Mellon University study put the energy cost of transferring and storing data in the cloud at about 7 kWh per gigabyte, a figure that can itself be significantly lower than that of a traditional on-premises data center. Taken together, these numbers suggest that every unnecessary copy of data carries a real energy cost.
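To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. It simply multiplies a data volume by the 7 kWh per gigabyte figure and splits the result using the 10% unique share quoted above. The function name and the idea of applying these global averages to a single organization's data are assumptions made for illustration; your own replication rate and energy cost per gigabyte may differ considerably.

```python
# Back-of-the-envelope estimate only. Both constants are the figures quoted
# in the text; applying them to one organization's data is an assumption.
KWH_PER_GB = 7.0      # energy cost of cloud transfer and storage per gigabyte
UNIQUE_SHARE = 0.10   # share of data that is unique in a given year


def replication_energy_kwh(total_gb: float,
                           unique_share: float = UNIQUE_SHARE,
                           kwh_per_gb: float = KWH_PER_GB) -> dict:
    """Split an energy estimate into unique and replicated portions.

    Hypothetical helper for illustration; not taken from any report.
    """
    if total_gb < 0:
        raise ValueError("total_gb must be non-negative")
    if not 0.0 <= unique_share <= 1.0:
        raise ValueError("unique_share must be between 0 and 1")

    total_kwh = total_gb * kwh_per_gb
    unique_kwh = total_kwh * unique_share
    return {
        "total_kwh": total_kwh,
        "unique_kwh": unique_kwh,
        "replicated_kwh": total_kwh - unique_kwh,
    }


if __name__ == "__main__":
    # Example: 1 TB (1,000 GB) of data under the assumptions above.
    estimate = replication_energy_kwh(1000)
    for name, value in estimate.items():
        print(f"{name}: {value:,.0f} kWh")
```

Under these assumptions, most of the estimated energy is attributable to copies rather than to unique data, which is the point the two statistics make when read side by side.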

The Quandary

But what can we do about this? After all, data is valuable, and being able to access, interpret, and measure it is essential to all organizations; we can’t just delete it.

Also, we are on an exponential curve in terms of data volumes, and much of that data is unstructured and thus harder to interpret. Another IDC statistic is that worldwide data volume will grow roughly fourfold between 2020 and 2025 (from 44 zettabytes to 175 zettabytes).

Clearly, we need to better address how and where data should be stored (or not) to avoid unnecessary copies.

A Logical Solution

Data virtualization is an ideal technology to assist here. Its aim is to reduce or eliminate the replication of data, along with the associated cost. It also aims to optimize query execution to reduce compute requirements. That might not sound important from a sustainability viewpoint, but optimized queries consume less compute, which in turn means that you are consuming less energy.

A logical data fabric, powered by data virtualization, also helps you to organize your data in less energy-consuming ways. One of the key features of a logical data fabric is query acceleration, which optimizes both the ingress and egress of data and the compute cycles a query requires. This brings two main benefits. The first is faster execution, so that results are returned in near real time. The second, less obvious benefit is lower energy use: an optimized query moves less data and needs less compute, which reduces the energy costs associated with data transfer.

The Way Forward

My recommendation would be to review your data and understand how much of it is unique. How much is replicated? Where is your data stored? Can you store it in a more environmentally friendly manner? If you leverage a more sustainable solution by adopting data virtualization, you can not only save energy, but also get more value from your data and improve the bottom line.
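As a starting point for the review suggested above, the following Python sketch estimates how much of the data under a directory is unique by hashing file contents and treating byte-identical files as copies. It is a minimal illustration under simple assumptions: the function names are hypothetical, it only sees files on a local or mounted file system, and it will not detect replicas that differ in format (for example, the same table exported as CSV and as Parquet) or copies held in databases and data warehouses.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def content_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def measure_unique_share(root: str) -> dict:
    """Estimate unique versus replicated bytes under `root`.

    Files with identical content are counted once as unique data;
    every additional byte-identical copy counts as replicated.
    """
    sizes_by_hash = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file():
            sizes_by_hash[content_hash(path)].append(path.stat().st_size)

    total_bytes = sum(sum(sizes) for sizes in sizes_by_hash.values())
    unique_bytes = sum(sizes[0] for sizes in sizes_by_hash.values())
    return {
        "total_bytes": total_bytes,
        "unique_bytes": unique_bytes,
        "replicated_bytes": total_bytes - unique_bytes,
        # An empty directory is reported as fully unique.
        "unique_share": unique_bytes / total_bytes if total_bytes else 1.0,
    }


if __name__ == "__main__":
    report = measure_unique_share(".")
    print(f"Unique share: {report['unique_share']:.1%}")
    print(f"Replicated bytes: {report['replicated_bytes']:,}")
```

Running it against a shared drive or an export area gives a first, conservative estimate of how much storage is taken up by exact copies, which you can then compare with the 10% unique figure cited earlier.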