If I had an hour to solve a problem and my life depended on the solution, I would spend the first 55 minutes determining the proper question to ask, for once I know the proper question, I could solve the problem in less than five minutes.

Albert Einstein

Several months ago, I joined the webinar ‘How to establish a sustainable solution for data lineage’ organized by AIM Software. The discussion centered on the Data Lineage (DL) requirements stipulated in BCBS 239 (Basel Committee on Banking Supervision's regulation number 239: Principles for effective risk data aggregation and risk reporting).

As a result of the discussion, I reached the following conclusions:

  • BCBS 239 does not strictly require Data Lineage.
  • There is no agreed vision in the professional community on what Data Lineage actually means.
  • There are no tools available that cover all of Data Lineage.

So the question ‘What does it mean to be BCBS 239 compliant?’ remained unanswered.

In this article, I would like to share some of my findings on the following questions:

  • How can you define BCBS 239 compliance?
  • What are the actual requirements?
  • Which components constitute the ‘Data Lineage in BCBS 239 context’?

How can you define BCBS 239 compliance?

My first finding was that the text of BCBS 239 is not the only source of requirements. You also have to take into account the self-assessment questionnaires and the content of the Thematic Reviews distributed by the ECB. If you compile all of this documentation, you face a wider range of requirements than those in the paper itself.

It is also unclear who is responsible for assessing the level of compliance. On the one hand, a bank performs the self-assessment itself, but the regulators still have to confirm the resulting level. So, which party takes the ultimate responsibility, and who is the ultimate decision-maker?

The next enigma is how the level is actually measured. There are no existing recommendations for any sort of measurement system, so the evaluation remains subjective and inconsistent. Creating a measurement system becomes an item on your to-do list.

In effect, to define BCBS 239 compliance, you need to take the following steps:

  1. Define what is really required.
  2. Define whether you are the ultimate decision-maker on your level of compliance.
  3. Define and implement a system to measure compliance.
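To make step 3 more concrete, here is a minimal sketch of what a measurement system could look like. Everything in it is an assumption made for illustration: the 1 to 4 rating scale, the `Assessment` record, the rule that a regulator's rating overrides the bank's self-rating, and the requirement names. BCBS 239 does not prescribe any of this.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_RATING = 4  # hypothetical scale: 1 (lowest) to 4 (highest)


@dataclass(frozen=True)
class Assessment:
    requirement: str                        # a requirement identified in step 1
    self_rating: int                        # the bank's own self-assessment
    regulator_rating: Optional[int] = None  # None until the regulator confirms


def effective_rating(a: Assessment) -> int:
    # Assumption for step 2: the regulator is the ultimate decision-maker,
    # so a confirmed rating replaces the self-assessment.
    return a.self_rating if a.regulator_rating is None else a.regulator_rating


def compliance_score(assessments: List[Assessment]) -> float:
    # Average effective rating, expressed as a fraction of the maximum.
    if not assessments:
        raise ValueError("at least one assessment is required")
    total = sum(effective_rating(a) for a in assessments)
    return total / (MAX_RATING * len(assessments))


# Hypothetical requirement names, for illustration only.
assessments = [
    Assessment("risk data accuracy", self_rating=4, regulator_rating=3),
    Assessment("report timeliness", self_rating=2),
]
score = compliance_score(assessments)
```

A single averaged score hides weak spots, so in practice you would probably also report the lowest-rated requirements separately.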

What are the actual requirements?

I have seen different attempts to transform BCBS 239 requirements into action. The issue is that BCBS 239 has a matrix structure: on one side are the 14 Principles, and on the other are 4 topics (Overarching Governance, Risk Data Aggregation Capabilities, Risk Reporting Practices, and Supervisory Review). You need to define what to do, where ‘what’ means the business area in which you have to act.

To make this clear, you need to analyze the matrix. As a result, you will have to act in the following business areas: Data / Information (including Reports as a container of Information), Systems and Technology, Processes, and Organization (including Governance).
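One way to run this analysis is to hold the matrix as a small data structure and tag each Principle with the business areas it touches. The sketch below assumes the grouping of Principles into topics used in the BCBS 239 paper (1 to 2, 3 to 6, 7 to 11, 12 to 14). The tags in `AREAS_BY_PRINCIPLE` are placeholders that show the mechanics, not the outcome of a real analysis.

```python
# Topics and the Principle numbers grouped under each of them.
TOPICS = {
    "Overarching Governance": range(1, 3),
    "Risk Data Aggregation Capabilities": range(3, 7),
    "Risk Reporting Practices": range(7, 12),
    "Supervisory Review": range(12, 15),
}

BUSINESS_AREAS = (
    "Data / Information",
    "Systems and Technology",
    "Processes",
    "Organization",
)

# Placeholder tagging: fill this in while reading each Principle.
AREAS_BY_PRINCIPLE = {
    1: {"Organization"},
    2: {"Data / Information", "Systems and Technology"},
    3: {"Data / Information", "Processes"},
}


def topic_of(principle):
    # Return the topic a Principle number belongs to.
    for topic, numbers in TOPICS.items():
        if principle in numbers:
            return topic
    raise ValueError(f"unknown principle: {principle}")


def principles_for_area(area):
    # List the Principles that require action in a given business area.
    return sorted(p for p, areas in AREAS_BY_PRINCIPLE.items() if area in areas)


def untagged_principles():
    # Principles that have not been analyzed yet.
    all_principles = [p for numbers in TOPICS.values() for p in numbers]
    return [p for p in all_principles if p not in AREAS_BY_PRINCIPLE]
```

Calling `principles_for_area("Data / Information")` then gives the list of Principles to read when planning work in that area, and `untagged_principles()` shows how much of the matrix is still to be analyzed.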

Which components constitute the ‘Data Lineage in BCBS 239 context’?

First, let’s take a look at some definitions taken from the DAMA-DMBOK Guide.

Data integration architecture defines how data flows through all systems from beginning to end. […] Data lineage and data flows are also names for this concept.


Information value-chain analysis defines the critical relationships between data, processes, roles and organizations, and other enterprise elements.

(par. 4.3.1)

Information value chain analysis artefacts: Mapping the relationships between data, process, business systems, and technology


I believe the answer now lies close to the surface. In order to be compliant with BCBS 239, you have to deliver an Information Value Chain with the following attributes: the flow of data, and of business and technical metadata, between sources and end reports, mapped to business processes and to data quality and other types of controls.
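As a rough illustration, such an Information Value Chain can be modeled as a directed graph whose nodes carry the business process and the controls attached to them. The sketch below is a toy model under my own assumptions: the class names, the node kinds, and the example systems are all hypothetical and do not come from BCBS 239 or the DAMA-DMBOK Guide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Node:
    name: str
    kind: str                   # "source", "transformation", or "report"
    business_process: str = ""  # process this node is mapped to
    controls: List[str] = field(default_factory=list)  # data quality and other controls


class ValueChain:
    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.upstream: Dict[str, Set[str]] = {}  # node name -> direct inputs

    def add_node(self, node: Node) -> None:
        self.nodes[node.name] = node
        self.upstream.setdefault(node.name, set())

    def add_flow(self, src: str, dst: str) -> None:
        # Record that data flows from src to dst.
        self.upstream[dst].add(src)

    def trace(self, name: str) -> List[str]:
        # Walk back from a node to everything that feeds it.
        seen: Set[str] = set()
        stack = list(self.upstream[name])
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.upstream[current])
        return sorted(seen)

    def uncontrolled(self, name: str) -> List[str]:
        # Upstream nodes that have no control attached.
        return [n for n in self.trace(name) if not self.nodes[n].controls]


# Hypothetical example: one source, one transformation, one end report.
chain = ValueChain()
chain.add_node(Node("loan_system", "source", "Loan origination", ["completeness check"]))
chain.add_node(Node("risk_aggregation", "transformation", "Risk data aggregation"))
chain.add_node(Node("credit_risk_report", "report", "Risk reporting", ["reconciliation"]))
chain.add_flow("loan_system", "risk_aggregation")
chain.add_flow("risk_aggregation", "credit_risk_report")

lineage = chain.trace("credit_risk_report")
gaps = chain.uncontrolled("credit_risk_report")
```

Here `lineage` lists everything that feeds the report, and `gaps` points at the aggregation step, which has no control attached in this example.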

So, the goal is clear, but execution is still held back by the lack of proper tools to combine and automate all Information Value Chain components.

For more insights, visit the Data Crossroads Academy site: //academy.datacrossroads.nl/courses/data-lineage-what-why-how/lesson/data-lineage-what-why-how/