In the previous articles of this series, we discussed how to build a company-specific data management maturity assessment and how to benchmark the results for Data Quality. Now it is time to look at the maturity of the data chain sub-capability.
In this article, I will share an in-depth approach to measuring and benchmarking the maturity level of the data chain sub-capability. The benchmark results used in this article are based on the ‘Data Management Maturity Review 2019.’
We will cover the following four topics:
- Definition of the ‘data chain’ sub-capability and its dimensions
- Specification of indicators (KPIs) for measuring the performance
- Benchmarking results based on a set of indicators
- Development tips
Data and information value chain (data chain) sub-capability and its dimensions
The data and information value chain (data chain) is one of the five sub-capabilities of the ‘Orange’ data management model. Therefore, the maturity of the data chain sub-capability is essential for overall maturity. You can find an explanation of the model in ‘Data Management Maturity 101: What is a data management maturity assessment, and why does a company need it?’ An overview of the model is shown in Figure 1.
‘Data and information value chain is a set of actions that transform raw data into meaningful Information’ as stated in ‘Data Management Maturity Review 2019’.
The following dimensions enable a (sub-)capability: role, process, data (input and output), and tools.
In Figure 2, each dimension of the data chain sub-capability is described in detail.
Figure 2. A detailed description of the data chain sub-capability dimensions.
DATA
In our context, ‘data’ stands for formal deliverables/artifacts of the data management sub-capability. The key deliverables of this sub-capability are related to the rules and roles that ensure the operation of the data management function.
There are several key deliverables of this sub-capability.
As I have discussed in my series of articles about data lineage, the concept of the data chain is not clearly defined, and its meaning is not aligned within the data management community. At least six concepts overlap: data chain, data value chain, data flow, data lineage, integration architecture, and information value chain. Each company should specify its own understanding of what the data chain is, its components, the level(s) of the data model used to document it, and the way it is documented. Therefore, the first deliverable is a metamodel of the data chain that is applicable and feasible for the company.
Documentation of the data chain can start as soon as the metamodel is specified. It is advisable to create a catalog of data chains as a first step. The concept of critical data elements helps to prioritize which data chains to document. Once the initiative’s scope is limited to a reasonable level, the documentation of the data chain(s) can take place. Data lineage can be documented manually, automatically, or through a combination of both. The deliverables of all other sub-capabilities are required, because the description of the data chain connects them.
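The idea of a metamodel, a catalog of data chains, and prioritization by critical data elements can be illustrated with a minimal sketch. The class names, fields, and sample chains below are hypothetical and only show one possible shape; each company would define its own metamodel.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataElement:
    name: str
    critical: bool = False  # True for a critical data element (CDE)


@dataclass
class DataChain:
    # Hypothetical metamodel: a chain links a source system to a report
    # and carries a set of data elements.
    name: str
    source_system: str
    target_report: str
    elements: list[DataElement] = field(default_factory=list)
    documented: bool = False

    def critical_count(self) -> int:
        """Number of critical data elements that flow through this chain."""
        return sum(1 for element in self.elements if element.critical)


def prioritize(catalog: list[DataChain]) -> list[DataChain]:
    """Return undocumented chains, those with the most CDEs first."""
    backlog = [chain for chain in catalog if not chain.documented]
    return sorted(backlog, key=lambda chain: chain.critical_count(), reverse=True)


# Illustrative catalog; all names are made up.
catalog = [
    DataChain("customer onboarding", "CRM", "KYC report",
              [DataElement("customer_id", critical=True), DataElement("segment")]),
    DataChain("loan exposure", "loan system", "credit risk report",
              [DataElement("exposure_amount", critical=True),
               DataElement("counterparty_id", critical=True)]),
    DataChain("marketing leads", "web forms", "campaign dashboard",
              [DataElement("lead_source")], documented=True),
]

for chain in prioritize(catalog):
    print(chain.name, chain.critical_count())
```

Counting critical data elements is only one possible prioritization rule; regulatory deadlines or business impact could serve equally well.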
PROCESS
‘Process’ signifies a data management-related business process at different levels of abstraction.
The documentation of the data chain is a process that involves professionals from multiple disciplines. It consists of several tasks, such as designing, analyzing, optimizing, and documenting the data chain. An important component of the process is coordinating the activities of the multi-disciplinary teams.
ROLE
‘Role’ describes the participation of people in business operations. It can represent business units, functional jobs, a set of data management-related accountabilities and responsibilities (in the RACI context), etc.
The roles required to perform this capability depend on the definition and components chosen to document the data chain. Furthermore, most artifacts produced by other sub-capabilities will be needed to assemble the data chain. Therefore, all data management roles involved in delivering these artifacts will also be involved in the data chain-related activities.
The following data management professionals might be involved in the documentation of the data chain: data and application architects, data modelers, and data analysts. Subject matter experts (SMEs) from the business will also be involved. IT professionals will be represented by database and solution architects, designers, and engineers. If an automated solution is chosen, IT consultants and developers with skills in that solution will also be required.
TOOLS
‘Tools’ include information technology systems, applications, and resources required to perform the data management function, e.g., budget.
All tools involved in documenting the artifacts of all other data management sub-capabilities are also relevant for the data chain capability.
Data models are kept in data modeling tools. Business processes are documented in BPM tools. Business rules and ETLs should be documented in repositories. The metadata repository is the key source of information for the data chain. The choice of data chain tools depends on the chosen way of documenting the data chain. For a manual solution, a company has a wide range of choices, from MS applications (Excel, Visio, PowerPoint) to Axon by Informatica, Collibra, and Solidatus. Companies that choose automated documentation also have a wide choice among ERWIN, Solidatus, Octopai, Collibra, SAS, and Informatica solutions.
Specification of indicators (KPIs) to measure the performance
Each sub-capability dimension described above can serve as a specific indicator (KPI) to measure performance.
Each company can create its maturity assessment by assigning maturity levels to chosen indicators.
I will demonstrate four indicators as examples. These indicators have been used as the foundation of our Data Management Maturity Scan:
Indicator 1 (Tools): The availability of an integrated tool to document data lineage
Data flows through the whole company and touches different departments. Documenting the data chain requires the effort of multi-disciplinary stakeholders from various departments across the whole enterprise. Therefore, the ability to share information is of great importance.
Indicator 2 (Process): Ability to deliver new data
Due to new regulations, updated information requirements emerge quickly. The discovery and delivery of the corresponding new data remain a big issue for many companies. The more data sources a company has, the bigger the issue. The ability of a company to react quickly to new information and data requirements is one of the indicators of its maturity.
Indicator 3 (Process): Ability to explain data transformation
Regulatory bodies and audit functions frequently require companies to explain the origin of data and the transformations it has undergone. This is one of the most challenging tasks for many companies. The documentation of the data chain proves its business value best when information about data transformations is accessible and transparent.
Indicator 4 (Process): Level of coordination between different stakeholders
As stated above, the documentation of the data chain requires a coordinated effort of different professionals across the organization. The data management function is accountable for the coordination of the activities of all data stakeholders. Therefore, the level of coordination demonstrates the level of maturity of the whole data management function.
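As an illustration of how a company might turn the four indicators above into a simple self-assessment, here is a minimal sketch. It assumes a 1 to 5 maturity scale and a plain average per dimension; the indicator keys, the `assess` function, and the averaging rule are assumptions for illustration, not the scoring method of the Data Management Maturity Scan.

```python
from statistics import mean

# The four example indicators and the dimension each belongs to.
INDICATORS = {
    "integrated lineage tool": "Tools",
    "ability to deliver new data": "Process",
    "ability to explain data transformation": "Process",
    "coordination between stakeholders": "Process",
}


def assess(scores: dict[str, int]) -> dict[str, float]:
    """Average the 1-5 maturity scores per dimension and overall.

    Plain averaging is an assumption; weighted schemes are equally possible.
    """
    missing = set(INDICATORS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    for name in INDICATORS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"score for {name!r} must be between 1 and 5")

    by_dimension: dict[str, list[int]] = {}
    for name, dimension in INDICATORS.items():
        by_dimension.setdefault(dimension, []).append(scores[name])

    result = {dim: mean(levels) for dim, levels in by_dimension.items()}
    result["overall"] = mean(scores[name] for name in INDICATORS)
    return result


# Hypothetical self-assessment of one company.
own_scores = {
    "integrated lineage tool": 3,
    "ability to deliver new data": 2,
    "ability to explain data transformation": 1,
    "coordination between stakeholders": 2,
}
result = assess(own_scores)
print(result)
```

Keeping the per-dimension averages next to the overall figure makes it easier to see whether a low score comes from tooling or from process.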
Benchmarking results
Below are the benchmarking results for the four indicators (KPIs) described above. You can use these four indicators to benchmark your company’s situation quickly.
Each indicator has been evaluated at one of five maturity levels demonstrating the development level.
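To show how such a benchmark can be read, the sketch below computes the share of respondents at each of the five levels for one indicator and positions a company against that distribution. The response list is made up purely for illustration and is not the data of the ‘Data Management Maturity Review 2019’; the function names are hypothetical.

```python
from collections import Counter


def level_distribution(responses: list[int]) -> dict[int, float]:
    """Percentage of respondents at each maturity level from 1 to 5."""
    if not responses:
        raise ValueError("at least one response is required")
    counts = Counter(responses)
    total = len(responses)
    return {level: 100 * counts[level] / total for level in range(1, 6)}


def share_at_or_below(responses: list[int], own_level: int) -> float:
    """Percentage of respondents whose level is not higher than own_level."""
    if not responses:
        raise ValueError("at least one response is required")
    return 100 * sum(1 for r in responses if r <= own_level) / len(responses)


# Made-up responses for a single indicator; NOT real survey results.
responses = [1, 1, 2, 2, 2, 3, 3, 4, 4, 5]

distribution = level_distribution(responses)
for level, share in distribution.items():
    print(f"level {level}: {share:.0f}% of respondents")

# Position of a company that scored itself at level 2 on this indicator.
print(f"{share_at_or_below(responses, 2):.0f}% of respondents are at level 2 or below")
```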
Conclusions
The results presented in Figure 3 have demonstrated the data management maturity for the data chain sub-capability. These results have led us to the following conclusions:
- Most companies put a lot of effort into documenting the data chain. More than 70% of respondents have recognized the necessity of this activity.
- Yet, the ability of companies to discover and deliver new data remains a challenge for almost 40% of respondents.
- The situation with the ability to explain data transformation seems even worse. Almost 50% of respondents can hardly do it. Only 25% of respondents have been creating a foundation for doing it.
- Day-to-day coordination of the activities of different stakeholders remains a challenge for almost 50% of respondents. Only 17% of respondents are confident that stakeholder coordination is at a good level.
Development tips
To improve the data chain sub-capability, companies should:
- align the processes of documenting information requirements and finding relevant data sources at the most granular levels;
- investigate and document application and data flows;
- apply the data lineage methodology to document the critical data chains.
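Once data flows are documented, explaining where a report's data comes from becomes a matter of walking those flows backwards. The sketch below assumes the flows are recorded as simple (source, target, transformation) entries; the system names, the `FLOWS` list, and the `trace_upstream` function are hypothetical and do not represent the data model of any particular lineage tool.

```python
from collections import deque

# Each documented flow: (source, target, transformation description).
# All names are made up for illustration.
FLOWS = [
    ("crm.customer", "dwh.customer_dim", "deduplicate on customer_id"),
    ("loans.contract", "dwh.exposure_fact", "convert amounts to EUR"),
    ("dwh.customer_dim", "report.credit_risk", "join on customer_id"),
    ("dwh.exposure_fact", "report.credit_risk", "sum exposure per customer"),
]


def trace_upstream(target: str, flows: list[tuple[str, str, str]]) -> list[tuple[str, str, str]]:
    """Return every documented flow that feeds `target`, directly or indirectly."""
    incoming: dict[str, list[tuple[str, str, str]]] = {}
    for flow in flows:
        incoming.setdefault(flow[1], []).append(flow)

    seen: set[str] = set()
    result: list[tuple[str, str, str]] = []
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if node in seen:  # guards against cycles in the documented flows
            continue
        seen.add(node)
        for flow in incoming.get(node, []):
            result.append(flow)
            queue.append(flow[0])
    return result


lineage = trace_upstream("report.credit_risk", FLOWS)
for source, target, rule in lineage:
    print(f"{source} -> {target}: {rule}")
```

A breadth-first walk lists the flows closest to the report first, which tends to match the order in which an auditor asks questions.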