Why a traditional data warehouse is no longer enough
A traditional data warehouse can no longer organize data to today's standards. With the rise of the Internet of Things, for example, a deluge of (potential) data has become available. These huge data sets are characterized by four dimensions, also known as the four V's of big data.
- Volume
The datasets to be analyzed and processed have become much larger. The sheer volume alone demands storage and processing technologies that differ from those of a traditional warehouse.
- Velocity
The speed at which information arrives has also increased enormously. With the arrival of 5G, data can be collected so quickly that we will keep moving towards real-time processing.
- Variety
There are many sources, and they all provide different types of data. Previously, only structured data arrived at the data warehouse. Now there is also semi-structured and unstructured data, which requires an adjustment to the architecture. You may even make use of third-party data. By creating the right links between all the available data, you create context, and more complex forms of value creation become possible.
- Veracity
This is about the quality of data. Quality and correctness need greater attention, because data must represent reality. If the source data carries a subjective view or bias, you absorb it and mistake it for the truth.
A data platform to suit your needs
A data platform that encompasses all these dimensions plays a central role in a data-driven organization. Many companies currently work with a very simple version, or keep their data in separate silos. To exploit the full potential of your data, you need to centralize it. Other specifications for the platform should be determined on the basis of your needs.
The easiest solution is to choose a platform from a cloud provider. The most commonly heard "yes, but" arguments are about security, with topics such as privacy and GDPR. If such matters are indeed a factor, you can opt for a "yes, and" strategy in the form of a hybrid solution. For example, you can store the most sensitive data locally.
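As a rough illustration of such a hybrid split, the sketch below keeps sensitive fields in a local store and sends everything else to a cloud store. It is a minimal sketch only: the class name, the field names and the two in-memory dictionaries are hypothetical stand-ins, not the API of any particular cloud provider.

```python
from dataclasses import dataclass, field

# Hypothetical list of fields considered sensitive (for example under GDPR).
# Replace it with the classification that applies to your own data.
SENSITIVE_FIELDS = {"name", "email", "national_id"}


@dataclass
class HybridStore:
    """Toy stand-in for a local store combined with a cloud store."""
    local: dict = field(default_factory=dict)
    cloud: dict = field(default_factory=dict)

    def save(self, record_id: str, record: dict) -> None:
        # Split the record: sensitive fields stay local, the rest goes to the cloud.
        sensitive = {k: v for k, v in record.items() if k in SENSITIVE_FIELDS}
        other = {k: v for k, v in record.items() if k not in SENSITIVE_FIELDS}
        if sensitive:
            self.local[record_id] = sensitive
        self.cloud[record_id] = other


store = HybridStore()
store.save("c-1", {"email": "a@example.com", "country": "NL", "orders": 3})
print(store.local)  # {'c-1': {'email': 'a@example.com'}}
print(store.cloud)  # {'c-1': {'country': 'NL', 'orders': 3}}
```

Both halves share the same record id, so they can be joined again when an analysis genuinely needs the sensitive part.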
A split like this can also be financially attractive. After all, cloud-based computing power only costs money when you use it. A hybrid model like this can be a first step towards your own modern data platform.
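The pay-per-use argument can be made concrete with a small calculation. The figures below are invented purely for illustration and are not real provider prices. The sketch also ignores storage, data transfer and staffing costs.

```python
def monthly_cloud_cost(hours_used: float, rate_per_hour: float) -> float:
    """Pay-per-use: you are billed only for the hours you actually compute."""
    return hours_used * rate_per_hour


def break_even_hours(fixed_monthly_cost: float, rate_per_hour: float) -> float:
    """Monthly usage above which a fixed-cost setup becomes the cheaper option."""
    return fixed_monthly_cost / rate_per_hour


# Hypothetical figures: 2.00 per compute hour in the cloud,
# versus 1000.00 per month for comparable always-on hardware.
RATE_PER_HOUR = 2.0
FIXED_MONTHLY_COST = 1000.0

print(monthly_cloud_cost(120, RATE_PER_HOUR))               # 240.0
print(break_even_hours(FIXED_MONTHLY_COST, RATE_PER_HOUR))  # 500.0
```

Under these assumed numbers, a workload that runs 120 hours a month is far cheaper in the cloud, and the balance only tips at around 500 hours a month.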
A hybrid model can also mean a split between storage and processing power. Separating the two makes it easier to implement far-reaching security measures. It is best to plan this split in advance, so that you can carry it out as efficiently and effectively as possible. Bear in mind that a platform is not an end in itself; you must stay focused on creating value through your data.