The 3 Phases of Data Analysis: Raw Data, Information and Knowledge

Business competition is fiercer than ever, especially in the digital space. The only way to differentiate your business is by adding value through data analysis to better understand customers and adapt strategy for rapid success. This entry reviews the 3 phases of Data Analysis needed for success in your business.

Companies that leverage their data systematically outperform those that don’t. For this reason, it is critical to process raw data and extract the most relevant information for your business. So, let’s review these 3 phases of Data Analysis:

3 Phases of Data Analysis: Raw Data

Raw data is any data that is relevant and interesting for your business. For example, raw data can be a sales report from a recently launched product or all mentions of a product on social networks, forums or web reviews.

In the past, raw data was mainly stored in a company’s data warehouse; however, this method is no longer optimal because it doesn’t take into account external information (forums, social media or PR) and limits your company to internal resources.

Raw data also resides in other places: your own operational systems such as CRM or ERP, Big Data repositories (mainly crowded with unstructured data), social media, and even Open Data sources. As a result, it is very important to identify all of this data and connect to it, no matter where it is located.

3 Phases of Data Analysis: Information

Once you have the raw data in hand, it’s time to analyze it. This is when you separate the wheat from the chaff, creating a repository with the key data affecting your business. However, don’t start making any decisions just yet, because you’re not finished. In this phase you enrich the data: it becomes contextualized, categorized, calculated, corrected and simplified. This is why we say that this phase transforms raw data into information.

Building on the example from above, we can now sort the sales report by region, split all of the social network comments by sentiment (“neutral”, “positive” or “negative”), and classify those comments by region as well.
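
The grouping described above can be sketched in a few lines of Python. This is only an illustration: the records, the field names (region, units, sentiment) and the two helper functions are invented for the example, and the sentiment labels are assumed to be already attached to each comment.

```python
from collections import Counter, defaultdict

# Hypothetical raw data: records and field names are invented for illustration.
sales = [
    {"region": "California", "units": 120},
    {"region": "Florida", "units": 45},
    {"region": "California", "units": 80},
]
comments = [
    {"region": "California", "sentiment": "negative"},
    {"region": "California", "sentiment": "positive"},
    {"region": "Florida", "sentiment": "neutral"},
    {"region": "California", "sentiment": "negative"},
]


def units_by_region(records):
    """Sum the units sold in each region."""
    totals = defaultdict(int)
    for record in records:
        totals[record["region"]] += record["units"]
    return dict(totals)


def sentiment_by_region(records):
    """Count comments per sentiment label within each region."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record["region"]][record["sentiment"]] += 1
    return {region: dict(counter) for region, counter in counts.items()}


sales_info = units_by_region(sales)
sentiment_info = sentiment_by_region(comments)
```

The two resulting dictionaries are the "information" of this phase: the same raw records, now categorized and aggregated by region.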

3 Phases of Data Analysis: Knowledge

The last phase of Data Analysis is knowledge, which means making sense of the gathered information. This phase includes more complex tasks, like comparing elements and identifying connections and patterns between them. This is where you prepare the information to help you start making decisions.

To further build on our example, in this phase we can analyze the performance of every region by combining the sales information with the local social network comments from users. At this point, we are able to identify critical issues, such as a high number of negative comments in California or an unusually low number of comments in Florida. When we share this information with the decision makers, they may discover that we have a local competitor in California, so we had better create a specific strategy there, and that we didn’t do enough marketing in Florida, so many people there don’t know about our product.
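
As a rough sketch of this step, the snippet below combines per-region sales and comment counts and flags the two kinds of issue mentioned above. All figures are made up, and the two thresholds are assumptions chosen only to make the example work; real cutoffs would depend on your own data.

```python
# Hypothetical per-region information; every number here is invented.
region_info = {
    "California": {"units": 200, "positive": 10, "neutral": 15, "negative": 40},
    "Florida": {"units": 45, "positive": 2, "neutral": 1, "negative": 1},
    "Texas": {"units": 180, "positive": 30, "neutral": 20, "negative": 5},
}

NEGATIVE_SHARE_LIMIT = 0.5  # assumed cutoff for "too many negative comments"
MIN_COMMENTS = 10           # assumed cutoff for "unusually few comments"


def find_issues(info):
    """Return the regions that need attention, with sales and reasons."""
    issues = {}
    for region, row in info.items():
        total = row["positive"] + row["neutral"] + row["negative"]
        flags = []
        if total < MIN_COMMENTS:
            # Too few comments to judge sentiment; the low volume is the issue.
            flags.append("low comment volume")
        elif row["negative"] / total > NEGATIVE_SHARE_LIMIT:
            flags.append("high negative share")
        if flags:
            issues[region] = {"units": row["units"], "flags": flags}
    return issues


issues = find_issues(region_info)
```

The code only surfaces the pattern. Explaining it (a local competitor, too little marketing) is still the job of the people who receive the result.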

Data Virtualization: 3 Steps to Success

In order to be successful in the 3 phases of Data Analysis, you will need a platform that extracts knowledge from raw data, and this is where data virtualization comes in.

Data virtualization provides 3 simple steps to sort and organize your data: connect, combine and publish. In essence, data virtualization provides an abstraction layer that allows you to connect to disparate data sources, collect data, filter it, create a canonical view containing only what is relevant for your business (information) and add value by transforming it into knowledge. After this, data virtualization allows you to provide that information to the decision makers within your organization so that they can drive the business accordingly.
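
To make the connect, combine and publish steps more concrete, here is a toy sketch in plain Python. It is not the API of Denodo or any other data virtualization product: the source functions, field names and the customer join key are all hypothetical, and the sketch fetches everything eagerly, whereas a real virtual layer would typically query the sources on demand.

```python
import json


# Hypothetical sources: each "connector" is just a function returning rows.
def crm_source():
    return [{"customer": "a1", "region": "California", "internal_note": "vip"}]


def social_source():
    return [{"customer": "a1", "sentiment": "negative", "raw_html": "<p>...</p>"}]


# Assumed canonical view: only the fields relevant to the business.
CANONICAL_FIELDS = ("customer", "region", "sentiment")


def connect(sources):
    """Connect: pull rows from every registered source."""
    return {name: fetch() for name, fetch in sources.items()}


def combine(datasets, key="customer"):
    """Combine: merge rows sharing a key, then keep only canonical fields."""
    merged = {}
    for rows in datasets.values():
        for row in rows:
            merged.setdefault(row[key], {}).update(row)
    return [
        {field: record.get(field) for field in CANONICAL_FIELDS}
        for record in merged.values()
    ]


def publish(view):
    """Publish: expose the canonical view in a format consumers can read."""
    return json.dumps(view, sort_keys=True)


view = combine(connect({"crm": crm_source, "social": social_source}))
published = publish(view)
```

Source-specific fields such as internal_note and raw_html never reach the published view, which is the filtering role the abstraction layer plays.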

You can get more information about data virtualization and how it works from this interactive diagram from Denodo.

Daniel Comino


  • Data visualization is a major component of a successful business intelligence platform. Numbers and data points alone can be difficult to decipher. Having a visualization of the data helps to form better decisions, and also reduces the risk of missing out on important data as visualization “paints a picture” of the data as a whole.

  • Exactly Pat, totally agree with you. Thanks for your recommendation. The main idea behind my entry is that BI users need to play with the Big Data information fast, and working with BI tools today is very complex because it requires the support of many people with specific skillsets. It also forces you to replicate data within the different required steps. All of this ends up in a rigid schema where any change, update or new report requires a lot of effort to create and adapt.

    On the other hand, if you have a data prep strategy, such as a virtual data layer provided by a data virtualization tool, you can easily change your views to create new reports in hours instead of days or weeks. In this case, data virtualization provides you with flexibility, dynamism and faster time to market. However, I agree with you that final data visualization is also very important. In fact, the Denodo Data Virtualization Platform allows the user to easily navigate through the data by simply following web links, jumping from one business entity to another with a single click, which gives visualization tools a clean representation of the data and an easy way to navigate it.

  • Data Analysis helps organizations gain insight into how much their performance is improving or regressing. It also helps them better understand the customer’s needs and specifications.
