From traditional data warehouses to modern data architecture, here's what you really need to know

20 April 2021

Every (about to be) data-driven organization needs a modern data platform. A company like this wants to do more with its data than just create efficient reporting. Data should become the heart of the organization: a source from which value is created, efficiency is improved, and even new services are offered to customers. In addition to a streamlined culture and processes, this requires a smartly designed technology platform to facilitate everything.

Why a traditional data warehouse is no longer enough

By today's standards, a traditional data warehouse is no longer enough to organize data. With the rise of the Internet of Things, for example, a deluge of (potential) data has become available. These huge data sets are characterized by four dimensions, also known as the 4 V's of big data.

  • Volume
    The datasets to be analyzed and processed have become much larger. This volume alone demands processing technologies that differ from traditional storage and processing capabilities.

  • Velocity
    The speed at which information arrives has also increased enormously. With the arrival of 5G, data can be collected so quickly that we will continue to evolve towards real-time processing.

  • Variety
    Many sources all provide different types of data. Previously, only structured data arrived at the data warehouse. Now there is also semi-structured and unstructured data, which makes an adjustment to the architecture necessary. You may even make use of third-party data. Creating the right links between all the available data provides context and makes more complex forms of value creation possible.

  • Veracity
    This is about the quality of data. Quality and correctness need greater attention, because data must represent reality. If the source data contains a particular perception or bias, you absorb that bias and mistake it for the truth.

 

A data platform to suit your needs

A data platform that encompasses all these dimensions plays a central role in a data-driven organization. Many companies currently work with a very simple version, or have data spread across different silos. To exploit the full potential of data, it needs to be centralized. The other specifications for the platform should be determined on the basis of your needs.

The easiest solution is to choose a platform from a cloud provider. The most commonly heard "yes, but" arguments are about security, with topics such as privacy and GDPR. If such matters are indeed a factor, you can opt for a "yes, and" strategy in the form of a hybrid solution. For example, you can store the most sensitive data locally.

A split like this can also be financially attractive. After all, cloud-based computing power only costs money when you use it. A hybrid model like this can be a first step towards your own modern data platform.

A hybrid setup can also mean a split between storage and processing power, which makes it easier to implement far-reaching security measures. It is best to think about this split beforehand, so that you can implement it as efficiently and effectively as possible. Bear in mind that a platform is not an end in itself; you must be constantly focused on creating value through your data.

Here is what you need to think about when setting up a modern data architecture

Data moves through four phases in a data-driven organization: it is collected and stored within the architecture, processed, consumed and used. Each phase involves different elements to keep everything moving in the right direction, so consider each one carefully. What exactly do you need, what will it be used for, and what should it enable?

Firstly, it is important for the data to be stored in a usable form. The groundwork must already have been done so that a data engineer can use their time effectively. Beware of applying too much processing or too many correlations up front; the data must remain usable across the board.

A strategy also needs to be drawn up to deal with data security. When classifying datasets, you determine which security measures are required at each stage of the platform. Assign data a sensitivity category, and decide which items can be shared. In the area of data security, confidence is very quickly lost and difficult to recover.

It is also important to know the origin of data in order to remain compliant with the GDPR. A data catalog will support you here. This is where you keep track of what data exists within the company. When someone asks for their data to be deleted, you can immediately see everything that you have about them.

Many other components are important for good data quality and governance. Also build in the ability to analyze data: make sure that there is a way to browse the data to search for possibilities.
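To make the ideas of sensitivity categories and a data catalog a little more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the category names, the `CatalogEntry` fields, the `DataCatalog` class and the example datasets are hypothetical and do not refer to any particular product. A real catalog would also track things like owners, retention periods and lineage.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List, Optional


class Sensitivity(IntEnum):
    """Hypothetical sensitivity categories, ordered from least to most strict."""
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    PERSONAL = 4  # data about an identifiable person


@dataclass(frozen=True)
class CatalogEntry:
    """One dataset registered in the (hypothetical) catalog."""
    name: str
    source: str                        # where the data originally came from
    sensitivity: Sensitivity
    subject_key: Optional[str] = None  # field that identifies a person, if any


class DataCatalog:
    """Keeps track of which datasets exist and how sensitive they are."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def shareable(self, max_level: Sensitivity = Sensitivity.INTERNAL) -> List[str]:
        """Names of datasets whose category is at or below max_level."""
        return sorted(
            e.name for e in self._entries.values() if e.sensitivity <= max_level
        )

    def datasets_about_people(self) -> List[CatalogEntry]:
        """Datasets to check when someone asks for their data to be deleted."""
        return sorted(
            (e for e in self._entries.values() if e.subject_key is not None),
            key=lambda e: e.name,
        )


# Example usage with made-up datasets.
catalog = DataCatalog()
catalog.register(CatalogEntry("web_traffic", "analytics tool", Sensitivity.INTERNAL))
catalog.register(
    CatalogEntry("crm_contacts", "CRM export", Sensitivity.PERSONAL, subject_key="email")
)
catalog.register(CatalogEntry("product_list", "ERP", Sensitivity.PUBLIC))

print(catalog.shareable())
print([e.name for e in catalog.datasets_about_people()])
```

The point of the sketch is the design choice rather than the code itself: because each dataset records its origin, its sensitivity category and the field that identifies a person, questions such as "what can we share?" and "where might this person's data live?" become simple lookups instead of a search through silos.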

Be sure to think about what you ultimately want to do with the data. Do you want to build a dashboard to monitor what is happening in a specific area of your organization and/or do you want to set up predictive or intelligent models? In addition to building such visualizations or models, the data can also be integrated into existing business applications or even made available to third parties.

Ultimately, after coming a long way, the data can be used in all kinds of applications. Nevertheless, many companies still take this final step as their starting point ("we want to do something with our data") and forget about the rest of the puzzle, so they don't get the results or the value they had in mind. Building a report or AI model is not enough on its own. The insights must be used and deployed to create added value in the organization. Unless you let the data flow through to the heart of your organization, you are left with interesting but separate models that have no impact.

Do you know yet how to start becoming data-driven?

This is the third blog in a series about anchoring data at the heart of your organization. Be sure to read the other blogs too.
