The previous article reviewed a common approach to choosing an IT tool. In this article, we will discuss the following:

  • Data management definitions in the IT tools context
  • Business needs and requirements for a data management tool
  • The landscape of commercial off-the-shelf (COTS) data management tools (based on an analysis of 24 tools) in terms of:
    • IT tool types
    • Deployment options
    • High-level functionalities
    • Industry solutions

Data management definition

The previous article discussed how data management definitions depend on context. In the IT tools context, we will use two definitions of data management: “narrow” and “broad,” as shown in Figure 1.

Figure 1: The “narrow” and “broad” data management definitions.


The “narrow” definition of data management relates to the core capability of managing a data lifecycle. The “broad” definition adds multiple data management capabilities to data lifecycle management and corresponds to the DAMA-DMBOK2 view of data management.

Business needs and requirements

The first step in choosing any IT tool is to define business needs and requirements. A company as a whole rarely has unified needs regarding a data management tool, because various internal and external stakeholders might have entirely different needs and requirements. Therefore, it is essential to consider the needs of all relevant stakeholders. In this series, I will consider two generalized groups of internal stakeholders: business and data management.

The group of business stakeholders includes C-suite members and business professionals from different business units. Various data- and IT-related professionals represent the data management stakeholders. It is worth mentioning that even within these aggregated groups, the requirements of individual stakeholders may differ.

Let’s investigate the needs of these two stakeholder groups regarding data management tools.

Business stakeholders may have high-level needs. The only thing that matters to them is getting the required information at the right place, at the right time, and of the quality necessary to make correct decisions. The goal to “build a data-driven organization” summarizes these needs.

In contrast, data management and IT professionals may have multiple needs and requirements. These needs generally relate to a company’s ability to implement and optimize a data lifecycle in various data pipelines using multiple data management capabilities.

A data lifecycle is the set of processes that move and transform data from the moment of its creation to the moment of its archiving and/or destruction. A data lifecycle has no standard set of processes. Various sources define these steps differently.

Figure 2 demonstrates the key common steps in a data lifecycle.

Figure 2: Common data lifecycle steps.


Each of these steps can be broken down into some sub-steps. For example, data processing may include data extraction, validation, cleansing, transformation, integration, loading, etc. The requirements for the data lifecycle processes form the basis for functional requirements for a data management tool and correspond to the “narrow” data management definition.
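The data processing sub-steps listed above can be sketched as a small pipeline. This is only a minimal illustration in Python, not a description of how any vendor's tool works. The record fields (`customer_id`, `country`), the sample rows, and all function names are assumptions made up for the example.

```python
from typing import Iterable

# Hypothetical raw records; field names and values are illustrative only.
RAW_ROWS = [
    {"customer_id": "1", "country": " nl "},
    {"customer_id": "", "country": "DE"},   # fails validation: no id
    {"customer_id": "2", "country": "de"},
]


def extract() -> list[dict]:
    """Extraction: read records from a source (here, an in-memory list)."""
    return [dict(row) for row in RAW_ROWS]


def validate(rows: Iterable[dict]) -> list[dict]:
    """Validation: keep only records that have a customer id."""
    return [row for row in rows if row["customer_id"]]


def cleanse(rows: Iterable[dict]) -> list[dict]:
    """Cleansing: trim whitespace and upper-case the country code."""
    return [{**row, "country": row["country"].strip().upper()} for row in rows]


def transform(rows: Iterable[dict]) -> list[dict]:
    """Transformation: convert the id from text to an integer."""
    return [{**row, "customer_id": int(row["customer_id"])} for row in rows]


def load(rows: Iterable[dict], target: list[dict]) -> None:
    """Loading: write records to a target (here, another list)."""
    target.extend(rows)


def run_pipeline() -> list[dict]:
    """Chain the sub-steps in order and return the loaded records."""
    target: list[dict] = []
    load(transform(cleanse(validate(extract()))), target)
    return target


if __name__ == "__main__":
    print(run_pipeline())
```

Each function corresponds to one sub-step, so a functional requirement such as "records without an identifier must be rejected" can be traced to a single place in the pipeline.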

Functional requirements for managing a data lifecycle can be extended with the requirements for additional capabilities like data security, quality, data analytics, and so on. In this case, these requirements fit the “broad” definition of data management.

Non-functional requirements are company-specific and include IT tool performance, scalability, usability, maintainability, and reliability requirements.
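One way to keep these requirement types apart during tool selection is to record each requirement together with its type and, for functional requirements, the narrowest data management definition that covers it. The sketch below is an assumption about how such a register could look. The class names, fields, and sample requirements are hypothetical, and the set of lifecycle capabilities simply reuses the sub-steps named earlier.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    FUNCTIONAL = "functional"
    NON_FUNCTIONAL = "non-functional"


# Lifecycle sub-steps named in the text; they fall under the "narrow" definition.
LIFECYCLE_CAPABILITIES = {
    "extraction", "validation", "cleansing",
    "transformation", "integration", "loading",
}


@dataclass(frozen=True)
class Requirement:
    stakeholder: str  # e.g. "business" or "data management"
    capability: str   # what the requirement is about
    kind: Kind

    def definition_scope(self) -> str:
        """Return the narrowest data management definition covering this requirement."""
        if self.kind is Kind.NON_FUNCTIONAL:
            return "n/a"  # non-functional requirements are not tied to a definition
        if self.capability in LIFECYCLE_CAPABILITIES:
            return "narrow"
        return "broad"


# Hypothetical sample requirements.
requirements = [
    Requirement("data management", "integration", Kind.FUNCTIONAL),
    Requirement("data management", "data quality", Kind.FUNCTIONAL),
    Requirement("business", "scalability", Kind.NON_FUNCTIONAL),
]

if __name__ == "__main__":
    for req in requirements:
        print(f"{req.capability}: {req.kind.value}, scope={req.definition_scope()}")
```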

Overview of data management tools

Several sources provide lists of data management tools, including Solutions Review, Capterra, and Forbes Advisor. For this article, I will use the list of providers recommended by Solutions Review, which references the following tools and providers:

1010Data, Amazon Web Services, Ataccama, Cloudera, Collibra, Commvault, Druva, Google Cloud Platform, Hewlett Packard Enterprise, Hitachi Vantara, IBM, Immuta, Informatica, MarkLogic, Microsoft Cloud or Fabric, Oracle, Precisely, SAP HANA, SAS, SingleStore, Snowflake, Talend, Teradata, and TIBCO.

All challenges with choosing IT tools that I discussed in the first article of this series are also applicable to data management tools.

IT tool types

Vendors use a wide variety of names to describe and classify their tools. Many of them use the title “platform.” First, let’s define this term. In this article, I will use the following definition: “A platform is a group of technologies that are used as a base upon which other applications, processes, or technologies are developed.”

Figure 3 presents the titles used to describe the 24 data management tools mentioned above. However, it is impossible to find the underlying definitions of these titles on the vendors’ sites. So, it may happen that the same title, for example, “platform,” has multiple meanings depending on a vendor’s viewpoint.

Figure 3: The titles of data management tools.


We can see that 16 of 24 vendors use the term “platform.” However, some vendors use further clarification of this term. For example:

  • Ataccama is a “data management platform”
  • Collibra is a “data governance platform”
  • Immuta is a “data security platform”

“Data integrity suite” (Precisely) and “data fabric” (Talend) are examples of other approaches to describing data management tools.

This brief analysis demonstrates that vendors share neither aligned titles for data management tools nor corresponding definitions. As a result, it is impossible to understand and differentiate the underlying functionalities based on a title alone.
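The point can be made concrete by grouping the five title examples quoted above by their generic term. The sketch below uses only the vendor titles named in this section. Treating the last word of a title as its "generic term" is an assumption made for illustration, and the function name is hypothetical.

```python
from collections import defaultdict

# Title examples quoted in the text above (vendor -> title used by the vendor).
TITLES = {
    "Ataccama": "data management platform",
    "Collibra": "data governance platform",
    "Immuta": "data security platform",
    "Precisely": "data integrity suite",
    "Talend": "data fabric",
}


def group_by_generic_term(titles: dict[str, str]) -> dict[str, list[str]]:
    """Group vendors by the last word of their title (assumed generic term)."""
    groups: dict[str, list[str]] = defaultdict(list)
    for vendor, title in titles.items():
        groups[title.split()[-1]].append(vendor)
    return dict(groups)


if __name__ == "__main__":
    for term, vendors in group_by_generic_term(TITLES).items():
        print(f"{term}: {', '.join(vendors)}")
```

Three of the five vendors land in the same “platform” group even though their qualifiers (management, governance, security) point to different capabilities, which is why the generic term alone cannot serve as a selection criterion.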

Deployment options

A company can require various deployment possibilities: on-premise, cloud, or hybrid. Vendors of the tools mentioned above provide different deployment options. Figure 4 demonstrates the analysis results for “on-premise” and “cloud” solutions.

Figure 4: Data management tools deployment options.


I did not indicate which vendors provide hybrid solutions, as sometimes it is difficult to find the required information on vendors’ sites.


High-level functionalities

As discussed above, we consider two definitions of data management, “narrow” and “broad.” Various functionalities correspond to these definitions.

Let’s start with analyzing the “narrow” viewpoint functionalities.

Most of the tools mentioned above include functionalities that assist in managing all or some of the data lifecycle steps and building data pipelines. Data integration is the capability most commonly shared among providers. Multiple tools include data warehouse and data lake capabilities. Commvault, Druva, and Immuta tools are exceptions; their key capabilities are data protection and security.

Let’s proceed with the “broad” viewpoint functionalities.

All tools also offer a range of additional data management capabilities. Figure 5 demonstrates examples of these capabilities. Unfortunately, information about tool functionalities is not easy to find on the vendors’ sites, so the analysis provided below may be incomplete. Another challenge of this analysis is the unclear definitions of the data management capabilities; for example, “data governance” or “data quality” capabilities can include quite different functionalities.

Figure 5: Additional data management capabilities.


Industry solutions

Vendors provide information about the industries in which their tools have been implemented. They call it a “solution per industry.” However, the vendors’ sites do not explain how these solutions differ. Figure 6 shows the ten industries served by the largest number of vendors.

Figure 6: Ten leading common industries.


The most competitive industries, meaning those served by the most vendors, are financial services, government and the public sector, health and life sciences, and retail.


A company needs a thorough approach to choosing appropriate data management tools and must perform multiple steps, including the following:

  1. Identify the business needs and requirements of various data management stakeholder groups

Various stakeholders have quite different needs and requirements that must be aligned upfront.

  2. Align the internal definition of data management

Aligning the definition will allow a company to understand its requirements and define the initiative’s scope. Clarifying the definitions will also assist in overcoming the challenges associated with the unaligned terminology used by various vendors.

It will also help simplify the tool selection and make the implementation feasible. Usually, the implementation of a data management tool focuses on the core capability: establishing or optimizing a data lifecycle and corresponding data pipelines. Some additional capabilities like master data management or data quality can be realized by using some other tools.

However, we’ve seen that some tools identified as “data management” focus on specific capabilities like data security. So, to make the implementation feasible, a company must clarify and align the terminology it uses.

  3. Define the current and future states of the enterprise (data, application, technology) architecture and the scope

This action will assist in identifying the requirements for deployment options. Different vendors provide multiple opportunities for deployment. However, the chosen solution must fit the company’s resources and business plans.

It will also help clarify the “enterprise” scope for data management tools. A company can implement a data management tool for a part of an application landscape or the whole enterprise. Some providers position their solutions as “enterprise global” ones. Other solutions can be applied to a part of the application landscape. In any case, a company’s tool selection must align with the future state architecture.

  4. Perform a detailed investigation and comparison of data management tools’ functionalities

As I mentioned, the information on the providers’ sites is not always easy to discover and comprehend. It is almost “mission impossible” to compare the functionalities of various providers because of unaligned terms and different presentation approaches. Tool selection is a time- and resource-consuming exercise. A thorough investigation will assist in choosing appropriate solutions that fit a company’s needs and resources.
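Once the functionalities have been investigated, one common way to organize the comparison is a weighted scoring matrix. The sketch below is a minimal illustration under stated assumptions. The tool names (“Tool A”, “Tool B”), the criteria, the weights, and the scores are all hypothetical placeholders and do not assess any real product.

```python
# Hypothetical weights: how much each criterion matters to the company.
WEIGHTS = {
    "data integration": 0.4,
    "data quality": 0.3,
    "deployment fit": 0.2,
    "usability": 0.1,
}

# Hypothetical scores from 0 (absent) to 5 (fully meets the requirement).
SCORES = {
    "Tool A": {"data integration": 5, "data quality": 3, "deployment fit": 4, "usability": 3},
    "Tool B": {"data integration": 3, "data quality": 5, "deployment fit": 3, "usability": 4},
}


def weighted_score(scores: dict[str, int], weights: dict[str, float]) -> float:
    """Sum of weight * score over all criteria; every criterion must be scored."""
    missing = weights.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank_tools(
    all_scores: dict[str, dict[str, int]], weights: dict[str, float]
) -> list[tuple[str, float]]:
    """Return (tool, total) pairs sorted from highest to lowest total."""
    totals = {tool: weighted_score(s, weights) for tool, s in all_scores.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for tool, total in rank_tools(SCORES, WEIGHTS):
        print(f"{tool}: {total:.2f}")
```

The weights should come from the aligned stakeholder requirements of the first step, and the criteria should use the company’s own internal definitions rather than the vendors’ titles. Raising an error for an unscored criterion is a deliberate choice here, since gaps in vendor information would otherwise silently lower a tool’s total.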