Managing Information in a Data-Driven Company with DMPs
Data is often called the oil of the future, and like oil it must be refined before it can be used. To make a company more competitive, its information system must clearly separate the world of data from the world of applications.
Data is everywhere, and there is no single way to treat it: some data must be analyzed in real time, while other data can be processed at a slower pace. Compare, for example, an alarm system, which must react immediately, with Google’s web page indexing system, which can work through its input over time.
Data often needs to be cleaned, transformed, and loaded into intermediate storage before it can be analyzed. A data warehouse alone is no longer sufficient: data lakes are now needed to handle structured and unstructured data from various sources and to integrate open data. Vast amounts of data must also be aggregated, using big data technologies such as those provided by Apache Hadoop.
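The clean, transform, load, and aggregate steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field names, and the in-memory SQLite table standing in for the intermediate storage are all hypothetical, and a real platform would run the same steps on a data lake or a Hadoop-style cluster.

```python
import sqlite3

# Hypothetical raw records, as they might arrive from different sources.
RAW_ROWS = [
    {"customer": " Alice ", "amount": "10.50", "country": "it"},
    {"customer": "BOB", "amount": "n/a", "country": "IT"},
    {"customer": "alice", "amount": "4.50", "country": "It"},
    {"customer": "Carol", "amount": "7", "country": "fr"},
]


def clean(row):
    """Return a normalized row, or None when the amount is not a number."""
    try:
        amount = float(row["amount"])
    except ValueError:
        return None
    return {
        "customer": row["customer"].strip().lower(),
        "amount": amount,
        "country": row["country"].strip().upper(),
    }


def load(rows):
    """Load cleaned rows into an in-memory SQLite table (the staging area)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL, country TEXT)")
    conn.executemany(
        "INSERT INTO sales VALUES (:customer, :amount, :country)", rows
    )
    return conn


def totals_by_country(conn):
    """Aggregate the staged data: total sales per country."""
    query = (
        "SELECT country, SUM(amount) FROM sales "
        "GROUP BY country ORDER BY country"
    )
    return dict(conn.execute(query).fetchall())


# Clean and transform, dropping rows that cannot be repaired.
cleaned = [r for r in (clean(row) for row in RAW_ROWS) if r is not None]
conn = load(cleaned)
print(totals_by_country(conn))
```

The row with the unparseable amount is discarded during cleaning, so only three rows reach the staging table before aggregation.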
Increasingly, thanks to new machine learning algorithms, the relationships between data become so complex and deep that they need to be organized into a knowledge base, for example a knowledge graph. The knowledge obtained in this way can then be navigated by artificial intelligence algorithms based on neural networks and logical rules (neuro-symbolic AI).
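To make the idea of a knowledge graph navigated by logical rules more concrete, here is a minimal sketch in plain Python. It assumes the graph is stored as a set of (subject, predicate, object) triples and applies a single transitivity rule; the entity and predicate names are hypothetical, and production systems would typically rely on a dedicated graph store and reasoner instead.

```python
# A tiny knowledge graph stored as (subject, predicate, object) triples.
# All names below are hypothetical and chosen only for illustration.
TRIPLES = {
    ("acme", "is_a", "supplier"),
    ("acme", "located_in", "milan"),
    ("milan", "located_in", "italy"),
    ("italy", "located_in", "europe"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return {
        (ts, tp, to)
        for (ts, tp, to) in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    }


def infer_transitive(triples, predicate):
    """Apply one logical rule until nothing new appears:
    if (a predicate b) and (b predicate c) then (a predicate c)."""
    known = set(triples)
    while True:
        new = {
            (a, predicate, c)
            for (a, p1, b) in known if p1 == predicate
            for (b2, p2, c) in known if p2 == predicate and b2 == b
        } - known
        if not new:
            return known
        known |= new


graph = infer_transitive(TRIPLES, "located_in")
places = sorted(o for (_, _, o) in match(graph, s="acme", p="located_in"))
print(places)
```

The original triples only state that acme is located in milan; the rule derives that it is also located in italy and europe, which is the kind of implicit knowledge a rule-based component can surface.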
A Data Management Platform (DMP) is precisely the set of tools that make it possible to clean, federate, integrate, and analyze data from many sources, both internal and external to the company. Thanks to a DMP, applications access all available data in a unified way, overcoming the limitations of architectures in which data sits in silos that do not communicate with each other.
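The unified access that a DMP offers to applications can be illustrated with a small Python sketch. It assumes two hypothetical silos (a CRM and a billing system) with different record shapes, and a hypothetical UnifiedCustomerView class that hides those differences; it is not the interface of any specific product.

```python
# Two hypothetical silos holding customer data in different shapes.
CRM_SILO = [
    {"id": 1, "full_name": "Alice Rossi", "email": "alice@example.com"},
]
BILLING_SILO = [
    {"customer_id": 1, "total_spent": 120.0},
    {"customer_id": 2, "total_spent": 35.0},
]


class UnifiedCustomerView:
    """Expose one record shape, whatever silo the data comes from."""

    def __init__(self, crm_rows, billing_rows):
        # Index each silo by its own customer key.
        self._crm = {row["id"]: row for row in crm_rows}
        self._billing = {row["customer_id"]: row for row in billing_rows}

    def customer_ids(self):
        """All customers known to at least one silo."""
        return sorted(set(self._crm) | set(self._billing))

    def get(self, customer_id):
        """Merge what each silo knows about one customer."""
        crm = self._crm.get(customer_id, {})
        billing = self._billing.get(customer_id, {})
        return {
            "id": customer_id,
            "name": crm.get("full_name"),
            "email": crm.get("email"),
            "total_spent": billing.get("total_spent", 0.0),
        }


view = UnifiedCustomerView(CRM_SILO, BILLING_SILO)
for cid in view.customer_ids():
    print(view.get(cid))
```

An application using the view never needs to know that customer 2 exists only in the billing silo; missing fields simply come back empty.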
The platform becomes smart when it can understand the meaning of the data. Only then is it possible to connect heterogeneous data while ensuring the orderly growth of information to support decisions. In the following figure, typical components of a Smart Data Management Platform are highlighted in orange:
The meaning of data also changes over time; fortunately, semantic technologies developed in the last decade within the Semantic Web have proven capable of supporting even the most radical changes in business models.
Data with a formally defined meaning has come to be known in the field as smart data, and it is now mature enough for use in business, as Google, Facebook, IBM, governments, and many startups already demonstrate.
The major cloud platforms (Amazon AWS, Google Cloud, Microsoft Azure, etc.) offer a rich set of tools, delivered as a service, to manage big data, data lakes, and many other components of a data management platform. LinkedData.Center specializes in tools for ingesting meanings and interfacing with smart data through its product Smart Data as a Service (SDaaS).
An example of a completely “as a Service” implementation based on tools available on the Amazon AWS and LinkedData.Center platforms is described in the following figure:
With a smart data management platform, it is possible to analyze the tastes and behaviors of customers in order to offer them the best product, identify new business areas and trends, optimize costs, and do anything else the entrepreneur’s imagination suggests.
The only essential element is data, whether it be company-owned (first-party data), from partners (second-party data), or from sources external to the company (third-party data), such as open data provided free of charge by governments or paid data provided by data providers.
A good smart data management platform can use all of these, even when the data it processes is inconsistent or varies in quality.
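One possible way to cope with inconsistent values from first-, second-, and third-party sources is to keep the provenance of every value and prefer the most trusted source. The sketch below assumes a hypothetical trust ranking in which first-party data wins; both the ranking and the sample observations are illustrative assumptions, not a rule stated by any particular platform.

```python
# Hypothetical trust ranking: a lower number means a more trusted source.
TRUST = {"first_party": 0, "second_party": 1, "third_party": 2}

# Hypothetical observations about the same entity from different sources.
OBSERVATIONS = [
    {"entity": "acme", "field": "city", "value": "Milano", "source": "third_party"},
    {"entity": "acme", "field": "city", "value": "Milan", "source": "first_party"},
    {"entity": "acme", "field": "employees", "value": 250, "source": "second_party"},
]


def reconcile(observations):
    """For each (entity, field), keep the value from the most trusted source."""
    best = {}
    for obs in observations:
        key = (obs["entity"], obs["field"])
        current = best.get(key)
        if current is None or TRUST[obs["source"]] < TRUST[current["source"]]:
            best[key] = obs
    # Return the chosen value together with its provenance.
    return {key: (obs["value"], obs["source"]) for key, obs in best.items()}


resolved = reconcile(OBSERVATIONS)
for key, (value, source) in sorted(resolved.items()):
    print(key, value, source)
```

Keeping the source next to each chosen value means a later, better observation can replace it without losing track of where the data came from.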