The results of the recent poll conducted on LinkedIn motivated me to write this article and share my practical experience and observations.
The poll’s question was, “What has been your company’s first data management initiative?” 130 data management (DM) professionals from different countries shared their experiences. Figure 1 demonstrates the results.
Figure 1: Choices in prioritizing data management initiatives.
You can see that almost 50% of companies identified data governance as the first capability to implement. Data quality and data analytics follow. Metadata management closes the list. Unfortunately, LinkedIn has a limited poll capability, so I could not include other capabilities like data and information architecture, security, etc.
In any case, these results help demonstrate one principle that not all companies recognize.
Regardless of the chosen title or priority, all companies should implement the same core DM capabilities, because these capabilities strongly depend on each other.
In this article, I will:
Explain this principle using the capabilities included in the poll as examples
Demonstrate the key relationships between core data management (DM) capabilities
Metamodel of data management
Leading industry guidelines have different viewpoints on data management and relationships between its components.
The reality is that many companies choose one DM capability and start implementing it without realizing its dependencies on other capabilities.
I use the “Orange” data management framework (DMF) in my practice. I present data management as a set of capabilities that play different roles in delivering the data management business value: ensuring a data (assets) lifecycle.
Figure 2 demonstrates this model. For this article, I linked the “Orange” model with the DAMA-DMBOK2 model to make it easier to understand.
Figure 2: The DAMA-DMBOK2 Wheel model linked to the “Orange” DM model.
The core capability of data management is data lifecycle management.
This capability transforms raw data into meaningful information and, by that, delivers value to data management stakeholders.
Directional capabilities define a direction and create a framework for data management. Data governance and business architecture are examples.
Supporting capabilities enable core and directional capabilities.
When you map the DAMA-DMBOK2 to this model, some questions arise regarding the Knowledge Areas of the DAMA-DMBOK2 model.
For example, according to the TOGAF® Standard, a framework for Enterprise Architecture, the DAMA-DMBOK2 Knowledge Areas of data architecture and data modeling both belong to data architecture. IS (information systems) architecture consists of data and application architecture. For digital data, it is hardly possible to separate data from applications. However, DAMA-DMBOK2 does not include application architecture in its scope. Instead, it introduces Data Integration and Interoperability and Data Warehousing and Business Intelligence (DWH&BI), which can be viewed as parts of the IS architecture. BI can also be mapped to the data analytics capability. I did not map the Reference & Master Data Knowledge Area; in my view, processing any data type requires the same set of capabilities.
So, even this high-level mapping confirms the key principle: data management capabilities have dependencies that cannot be avoided. Let’s take four capabilities from the poll and consider their dependencies with other DM capabilities.
According to the DAMA-DMBOK2 view on “data governance,” this capability should develop and control the implementation of the DM operating model, organizational structure, processes, roles, and policies for other data management capabilities. In other words, data governance establishes a data management function by implementing a data management framework.
So, when a company starts its data management with a data governance initiative, it means the following. The company must already have some data management capabilities in place. Data governance will help transform these capabilities into business functions. Data governance can also initiate the implementation of other capabilities required to meet business goals.
Information systems architecture, security, IT infrastructure, and metadata management are capabilities that must exist, having a formal or informal status, to ensure data lifecycle management.
Some companies start their data management journey with a data quality capability. Inefficient business decision-making due to poor data quality is a strong “stick” for this initiative. However, many companies fail in this initiative. One of the biggest reasons is that the successful implementation of a data quality capability requires several other capabilities.
Two core data quality activities are investigating and resolving data quality issues and building data quality checks to prevent these issues. Impact analysis and root-cause analysis are the two methods that make these activities possible. You need to perform these analyses by investigating data movements and transformations at the physical level. A set of capabilities such as information systems architecture, data modeling, and metadata management enables data lineage. In turn, data lineage at the physical level is necessary to enable data quality activities.
So, again, we come to the same conclusion: a data quality initiative is hardly possible without other data management capabilities.
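To make the role of lineage concrete, here is a minimal Python sketch of impact and root-cause analysis over physical-level data lineage. It is an illustration under stated assumptions, not part of the “Orange” DMF: the data set names, the edge list, and the functions `impact_analysis` and `root_cause_candidates` are all hypothetical.

```python
from collections import deque

# Hypothetical physical-level lineage: each pair reads "data flows from A to B".
LINEAGE_EDGES = [
    ("crm.customers", "staging.customers"),
    ("erp.orders", "staging.orders"),
    ("staging.customers", "dwh.sales_fact"),
    ("staging.orders", "dwh.sales_fact"),
    ("dwh.sales_fact", "report.monthly_revenue"),
]


def _reachable(start, edges):
    """Return every node reachable from `start` by following `edges` (breadth-first)."""
    neighbours = {}
    for src, dst in edges:
        neighbours.setdefault(src, []).append(dst)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def impact_analysis(dataset, edges=LINEAGE_EDGES):
    """Downstream view: which data sets are affected by an issue in `dataset`?"""
    return _reachable(dataset, edges)


def root_cause_candidates(dataset, edges=LINEAGE_EDGES):
    """Upstream view: which data sets could have caused an issue seen in `dataset`?"""
    reversed_edges = [(dst, src) for src, dst in edges]
    return _reachable(dataset, reversed_edges)
```

For example, `impact_analysis("staging.orders")` lists the downstream fact table and report, while `root_cause_candidates("report.monthly_revenue")` lists every upstream data set worth inspecting. Neither question can be answered without the lineage edges, which is why data quality depends on the capabilities that produce them.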
Using ML and AI has become a mantra for many companies. However, without a solid data management foundation, attempts to get value from data by using ML and AI and by establishing self-service analytics can quickly fail. Describing the data you use in the models requires data modeling. The “garbage in, garbage out” principle shows that an analytics initiative is useless without a data quality capability in place. But we've just discussed that data quality needs multiple other capabilities.
So, to succeed with the data analytics initiative, a company must create a solid “data management foundation.”
The poll’s result for metadata management is the most challenging to explain. I was surprised that only 11% of respondents indicated it as the first-priority initiative.
I assume the key reason is a misunderstanding of the concepts of metadata and metadata management. Every company has established data pipelines. Along these pipelines, data is transformed and integrated. This can’t be done without managing metadata. I think many companies have this capability in an ad hoc form. The poll question meant establishing metadata management as a business function. That may explain the low response.
Another reason for underestimating the role of metadata is narrowing its scope to technical and, maybe, operational metadata. Many professionals don't realize that data models, data lineage, business terms, and definitions are metadata. The majority of data management artifacts are metadata. A company must have a clear strategy to store and map the data management deliverables.
When I developed the metamodel of data lineage, I realized that the data lineage metamodel, in a broad sense, represents the knowledge graph of data management outcomes. In this respect, metadata management unites all other data management capabilities.
The analysis summary
I hope that the relations shown between the four core data management capabilities and other capabilities have convinced you of the statement made earlier in this article. All data management capabilities are linked with each other. Implementation of one capability requires the implementation of others.
The key challenge is to identify the links between these capabilities and implement them in the appropriate order. The order matters because the outcomes of one capability serve as inputs for others.
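One way to reason about the implementation order is to treat the capabilities as a dependency graph and sort it topologically. The following Python sketch assumes a small set of dependencies paraphrased from the examples in this article; the `DEPENDS_ON` mapping and the `implementation_order` function are illustrative names, and a real company would have a larger and different graph.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each capability maps to the capabilities whose outcomes it needs as input.
# The edges are illustrative and paraphrase dependencies named in this article.
DEPENDS_ON = {
    "data lineage": {
        "information systems architecture",
        "data modeling",
        "metadata management",
    },
    "data quality": {"data lineage"},
    "data analytics": {"data modeling", "data quality"},
}


def implementation_order(depends_on):
    """Return capabilities ordered so that every prerequisite comes first."""
    return list(TopologicalSorter(depends_on).static_order())


order = implementation_order(DEPENDS_ON)
```

In any order this produces, data lineage precedes data quality, which in turn precedes data analytics. If two capabilities were declared to depend on each other, `static_order()` would raise `graphlib.CycleError`, a signal that they need to be developed together rather than one after the other.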
An integrated approach to implementing core data management capabilities
This integrated approach is a distinguishing feature of the “Orange” DMF. In this short article, I can share only a high-level example. Figure 3 demonstrates one of the methods that constitute this framework.
Figure 3: Consecutive steps in implementing core DM capabilities.
This figure represents the high-level top-down approach to delivering artifacts related to five core data management capabilities: data modeling, information systems architecture, etc.
Usually, I recommend starting by analyzing information requirements expressed in various reports and dashboards. Report analysis and governance are complex, considering the hundreds and thousands of reports produced within a company. Developing business models leads to developing data models. Documenting business processes and application flows is the starting point in data lineage development.