Finding a pragmatic approach to implementing data management that delivers quick wins is one of the key challenges for any data management professional. Sooner or later, you will face it in your career.

In the series of presentations Practical implementation or optimization of data management with the “Orange” model, I share my practical experience of the past 10 years. This experience has led me to develop a new model and a practical method for implementing and optimizing data management. The method is a collection of techniques and templates that can be used to perform various tasks related to the development and optimization of data management in your company.

Using the “Orange” model for developing and performing a data management maturity assessment

My experience with data management/governance (DM/DG) roles started over ten years ago when I designed and implemented a data management framework. At the time, the topic seemed pretty straightforward. But the more experience I gained in data management, the more mysterious and complicated the topic of data management/governance roles became. I wrote an article on this topic a year ago, expressing my concerns regarding the standard approach and offering some solutions. During the past year, I have discovered some new challenges to consider while designing roles.

In this article, I would like to share my vision of the challenges with the current common approach, discuss key factors that should be considered, and share practices in developing the set of roles.

Challenges with common approaches

In my opinion, there are several challenges associated with the existing approaches to role design which I have listed in Figure 1:

Figure 1. Challenges with common approaches to designing DM/DG roles.

The number of roles

Different publications about DM/DG roles introduce a big ‘zoo’ of roles. Sadly enough, even DAMA International publications present a huge number of roles: 120, to be precise. The biggest challenge is aligning these roles with processes, the tasks to be delivered, and the artifacts to be produced. A related challenge is the unclear relationship between roles and enterprise size.

No clear factors of influence

When talking to data management professionals who have implemented DM/DG roles, I often get the impression that they copy roles from well-known sources without analyzing the factors that may influence the design pattern of roles. This approach causes the next challenge: unaligned names and accountabilities of roles.

Unaligned terms

Once, I heard a colleague proudly say: ‘we have implemented roles of stewards and custodians.’ Linguistically, the words ‘steward’ and ‘custodian’ are synonymous. This is only one of the examples of blindly copying the sources. It happens because there are no clear guidelines on how to design roles that match the needs and reality of the company.

No clear guidelines to match the roles and companies’ reality

Take, for example, different steward-related roles introduced in DAMA-DMBOK2: ‘data steward,’ ‘data custodian,’ ‘chief data steward,’ ‘business data steward,’ ‘coordinating data steward,’ ‘executive data steward,’ ‘data steward facilitator,’ ‘technical data steward.’ What rules can the company follow to choose the ‘just enough’ roles of stewards, and what is the proper context for this ‘zoo’ of stewards?

Let’s look at factors that a data management professional should consider when designing data management/governance (DM/DG) roles.

Key factors that influence the design of data management/governance (DM/DG) roles

I will discuss seven key factors that influence the DM/DG roles shown in Figure 2. If I explain these factors in-depth, this will become a book rather than an article. You can find more information in my video presentation on this subject or contact me for one-on-one advice.

Figure 2. Key factors that influence DM/DG role design.

Let’s take a brief look at each of these factors.

Types of data stewards

Figure 3. Types of data stewards.

The idea of data stewardship derives from the concept of data ownership. The company, as a whole, owns data and delegates data-related tasks to different types of data stewards. I want to stress that DAMA clarifies that steward and custodian are synonymous. DAMA defines data stewards as people or groups who ‘represent the interests of all stakeholders and must take an enterprise perspective to ensure enterprise data is of high quality and can be used effectively.’ DAMA also specifies different types of data stewards. Data stewards can be split into three categories depending on their professional background: business, data management, and technical. Using this approach, you may assign these roles to every employee who deals with data. Data steward roles can be either formal or virtual. A formal role means that you create a new functional role within your organizational structure. A virtual role is assigned to an already existing functional role.
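As an illustration only, the classification above (three categories, each either formal or virtual) can be sketched as a small data structure. The class names, the host_role field, and the sample roles are hypothetical and are not taken from DAMA-DMBOK2 or the “Orange” model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class StewardCategory(Enum):
    """The three professional backgrounds named in the text."""
    BUSINESS = "business"
    DATA_MANAGEMENT = "data management"
    TECHNICAL = "technical"


class Assignment(Enum):
    FORMAL = "formal"    # a new functional role in the organizational structure
    VIRTUAL = "virtual"  # attached to an already existing functional role


@dataclass(frozen=True)
class StewardRole:
    name: str
    category: StewardCategory
    assignment: Assignment
    host_role: Optional[str] = None  # hypothetical field: the existing role that carries a virtual role

    def __post_init__(self) -> None:
        # A virtual role only makes sense when an existing role carries it.
        if self.assignment is Assignment.VIRTUAL and not self.host_role:
            raise ValueError("a virtual steward role needs an existing host role")
        # A formal role is a new position, so no host role is expected.
        if self.assignment is Assignment.FORMAL and self.host_role:
            raise ValueError("a formal steward role should not name a host role")


def by_category(roles: List[StewardRole]) -> Dict[StewardCategory, List[str]]:
    """Group role names by steward category."""
    grouped: Dict[StewardCategory, List[str]] = {c: [] for c in StewardCategory}
    for role in roles:
        grouped[role.category].append(role.name)
    return grouped


# Sample roles, invented for this sketch.
ROLES = [
    StewardRole("Business data steward", StewardCategory.BUSINESS,
                Assignment.VIRTUAL, host_role="Finance controller"),
    StewardRole("Data quality analyst", StewardCategory.DATA_MANAGEMENT,
                Assignment.FORMAL),
    StewardRole("Technical data steward", StewardCategory.TECHNICAL,
                Assignment.VIRTUAL, host_role="Database administrator"),
]
```

Calling by_category(ROLES) gives a quick overview of how many roles of each type a company plans to introduce, which may help keep the set of stewards ‘just enough.’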

Structure of data management (DM) capabilities

In the “Orange” Model of DM 101 series, I have discussed a set of key data management (DM) capabilities: data chain, data management framework, data quality, data modeling, and information systems architecture. Four dimensions enable each of these capabilities: processes, roles, data, and tools.

Figure 4. The influence of the DM capability structure on DM/DG roles.

The “Orange” model offers to split DM/DG roles along these dimensions.

For example, ‘Business process owner’ and ‘System owner’ relate to the ‘processes’ and ‘tools’ dimensions, respectively. The ‘data’ dimension describes the accountabilities of the data owner and data user roles. The ‘roles’ dimension distributes roles clearly within the organizational hierarchies. The accountabilities of these roles also depend on their location along data chains.

The location along data chains

The data chain describes the path of transforming raw data into meaningful information.

In Figure 5, you can see the relationships between the roles, as we have just discussed, and their location along data chains.

Figure 5. The distribution of roles along data chains.

Data chains are associated with one or more business processes. Therefore, each business process owner is accountable for the part of the data chain covered by their business process. One or more systems and applications could be involved in the data processing. Each of these applications will have one application owner. Data owners and data users will be accountable for data. We will discuss their accountabilities later. All the roles mentioned above will be assigned to business data stewards. Different data management capabilities enable data chains. All types of data stewards will perform processes related to these capabilities and deliver corresponding artifacts. The data architecture of data chains will vary and impact the design of roles.
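To make the distribution of roles along a data chain more concrete, here is a minimal sketch under simplifying assumptions: one data owner per chain, one owner per business process, and one owner per application. All class, field, and sample names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Application:
    name: str
    owner: str  # each application has exactly one application owner


@dataclass
class BusinessProcess:
    name: str
    owner: str  # the business process owner
    applications: List[Application] = field(default_factory=list)


@dataclass
class DataChain:
    name: str
    data_owner: str
    data_users: List[str]
    processes: List[BusinessProcess]

    def accountabilities(self) -> Dict[str, List[str]]:
        """List, per person or function, the roles they hold along this chain."""
        result: Dict[str, List[str]] = {}
        result.setdefault(self.data_owner, []).append(f"data owner: {self.name}")
        for user in self.data_users:
            result.setdefault(user, []).append(f"data user: {self.name}")
        for process in self.processes:
            result.setdefault(process.owner, []).append(
                f"business process owner: {process.name}")
            for app in process.applications:
                result.setdefault(app.owner, []).append(
                    f"application owner: {app.name}")
        return result


# Sample chain, invented for this sketch.
chain = DataChain(
    name="Customer reporting",
    data_owner="Head of Sales",
    data_users=["Finance reporting team"],
    processes=[
        BusinessProcess("Customer onboarding", "Head of Sales",
                        [Application("CRM", "CRM product manager")]),
        BusinessProcess("Regulatory reporting", "Head of Finance",
                        [Application("Data warehouse", "BI manager")]),
    ],
)

for holder, roles in chain.accountabilities().items():
    print(holder, "->", roles)
```

In this sample, ‘Head of Sales’ ends up holding both the data owner role and a business process owner role, which shows how one function can combine several roles along the same chain.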

Data architecture style

Data architecture will influence data-related roles, business processes, and systems. There is a big difference between the canonical and the big data platform architectures, as shown in Figure 6.

Canonical architecture

Many companies still have this form of architecture. There are far too many relationships between different sourcing and consuming applications. In professional jargon, this type of architecture is often called ‘spaghetti architecture.’

Figure 6. Different data architecture styles.

Big data platform

Data from different source systems enter the central big data platform. This platform has different data domains. Data is processed within the platform and then distributed to different users. The key question with such platforms is the location where data will be integrated and transformed. Will it occur within the platform itself or on its way to consumers? The answer to this question will also influence the specification of roles.

Data mesh platform

The data mesh platform is also related to big data architecture. In this case, two different data domain types are organized: sourcing and consuming. A sourcing domain combines data in the sourcing system and the big data platform, while a consuming domain combines data in the big data platform and the consuming system. Within each domain, data is processed according to the business requirements of this domain.

Different architecture styles will significantly influence the distribution of accountabilities of data owner and data user roles. The architectural style will also affect data modeling and solution design patterns.

Data modeling and solution design

In the canonical approach, data model design and solution design belong to different continuums according to TOGAF 9.2, the leading Enterprise Architecture guide. Enterprise architecture includes four interrelated architectures: business, data, application, and technology. Data architecture delivers conceptual, logical, and physical data models. Solution architecture should put physical data models into practice. The new approach, on the contrary, unites data model design and solution design. Conceptual, semantic, and solution data models should be designed simultaneously in one process. Data management and technical data stewards will have different accountabilities and deliverables depending on the approach. The approach will also affect data management-related processes. This leads to another challenge: defining business and data domains.

Business and data domain definition

DAMA-DMBOK2 assigns the approval authority of a data steward to their domain. The challenge is to define what a ‘domain’ is.

There are at least three possible approaches to specify domains, as shown in Figure 7.

Figure 7. Different approaches to specify the term ‘business/data domain.’

The first approach relates to the concept of new data creation. This is a complex topic that I will cover in one of my master classes in the future. The key idea is as follows. Data flows along data chains. On its way, data can either change or not. The first challenge is to specify the conditions under which data changes. Usually, master and reference data stay unchanged, while transactional data is changed. Metadata will ensure the changes in data. Depending on the data type, the accountability of data owners may be specified differently.

The second approach focuses on data content. For example, customer data is a subject area usually associated with conceptual models. A company may assign data ownership based on the data subject area. Alternatively, following a business architecture approach, it may assign data ownership based on business capability domains.

The third approach is less common. It allows specifying data ownership based on organizational structures.

In reality, a combination of these approaches can be used within one enterprise. The scope of the enterprise will also affect the design of roles.

Enterprise scope and/or company size

When a company designs a set of roles, it should consider the scope of the data management initiative and the company’s size. Both will influence the complexity of the roles in the data management organizational structures. Assume a business unit becomes a data owner for specific data sets. Then, within this business unit, the ultimate accountability and corresponding responsibilities for data ownership will be split between the business unit manager and staff.

After you have analyzed all of the relevant factors for your company, the final step is to design roles.

Design the set of data management/governance roles

My practical advice is not to copy existing solutions, but to keep the set of roles as simple as possible while still matching your company’s reality.

The “Orange” model considers data management as a business capability. Four dimensions enable this capability: processes, roles, data, and tools. At the final stage, the model recommends linking roles to data management processes and deliverables (data). An example of such a mapping is shown in Figure 8.

Figure 8. An example of the mapping between roles, processes, and deliverables.
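A mapping of this kind can also be kept as a simple table and checked automatically. The sketch below is not the content of Figure 8: the roles, processes, and deliverables are hypothetical examples, and the check simply flags candidate roles that are not linked to any process or deliverable.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class RoleMapping(NamedTuple):
    role: str
    process: str
    deliverable: str


# Hypothetical mapping rows, invented for this sketch.
MAPPINGS = [
    RoleMapping("Data owner", "Define data requirements", "Data requirements document"),
    RoleMapping("Data management steward", "Data modeling", "Conceptual data model"),
    RoleMapping("Technical data steward", "Data modeling", "Physical data model"),
    RoleMapping("Data management steward", "Data quality control", "Data quality report"),
]

# Roles the company is considering, including one copied from a reference source.
CANDIDATE_ROLES = [
    "Data owner",
    "Data management steward",
    "Technical data steward",
    "Coordinating data steward",
]


def deliverables_by_role(mappings: List[RoleMapping]) -> Dict[str, List[str]]:
    """Collect the deliverables each role is expected to produce."""
    result: Dict[str, List[str]] = defaultdict(list)
    for m in mappings:
        result[m.role].append(m.deliverable)
    return dict(result)


def unlinked_roles(candidates: List[str], mappings: List[RoleMapping]) -> List[str]:
    """Return candidate roles that have no process or deliverable attached."""
    linked = {m.role for m in mappings}
    return [role for role in candidates if role not in linked]


print(deliverables_by_role(MAPPINGS))
print(unlinked_roles(CANDIDATE_ROLES, MAPPINGS))
```

A role that appears in the unlinked list has no task or artifact assigned to it, so it is a candidate for removal or for a clearer definition.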

For more details, please consult my book ‘Data management toolkit.’

In the next and final article of the “Orange” Model 101 series, we will discuss how to specify KPIs and measure data management performance.

For more insights, visit the Data Crossroads Academy site: //academy.datacrossroads.nl.