Graham Pearman, Data Governance Lead
Billigence have seen an upward swing in demand for Data Governance solutions globally. The demand is primarily driven by the onset of tougher and more stringent regulations for data compliance. Coupled with this is a growing realisation that efficiency and profitability are increasingly dependent on the availability of high-quality data to drive business intelligence.
When choosing a solution for Data Governance, it is important to find the right fit based on an organisation’s requirements and needs. This article explains the foundational aspects of data governance, shows how software can help to execute your data governance program, and provides our top 5 tools to look out for in 2022.
Table of Contents
- What is Data Governance?
- Data Governance 2.0
- Data Governance Framework
- About Data Governance Tools
- Key Must-Have Features of a Data Governance Tool
- Top 5 Data Governance Tools for 2022
What is Data Governance?
Put simply, Data Governance is the combination of people, processes, and technology working together to ensure you know what data assets you have, where they are and who owns them. From a cultural perspective, an organisation is living the values of Data Governance when a critical mass of its employees and stakeholders realise that enterprise data is a corporate asset of extremely high strategic importance, if not the most important thing it owns.
In reality, Data Governance is about getting a lot of people to do something they probably don’t want to or don’t have time for. For instance, being the accountable owner for a technology platform, being responsible for ensuring the quality of metadata, configuring data quality rules, or becoming a newly identified data owner will all have an overhead on people’s time.
There’s also an inescapable up-front effort involved in establishing operating models, ingesting and categorising data, and adopting new ways of working. It’s important, therefore, to understand that whilst these activities can impact resources, especially early on, they also improve efficiency and decrease human error, so the upfront investment pays off in the long run.
Since these changes represent behavioural and cultural shifts, they cannot simply be foisted onto your employees. Before embarking on any data governance initiative, it is extremely important to understand that Data Governance is, in effect, a long-term programme of transformation and needs to be treated accordingly.
Data Governance 2.0
We often come across the perception that data governance is simply a bureaucratic exercise done to satisfy regulators. On the contrary, at its root, the concept of Data Governance 2.0 is that governance should be an enabler, not a bureaucratic roadblock, and it’s important for an organisation’s visionaries and change champions to believe in and continually stress this fact.
For data, that means using governance to improve data access, timeliness and relevance of information for everyone in the organisation. A properly executed data governance program leads to an abundance of new insights, capabilities, and ultimately, excellent outcomes for your customers. A report from Capgemini showed that compliance with the EU’s GDPR has yielded a range of significant and unanticipated benefits, from increased consumer trust to better customer engagement and revenue growth.
Despite the hurdles, it is important to remind ourselves of the positive and seemingly unlimited business benefits that effective Data Governance can bring. Executed correctly, with the appropriate support and expertise, data governance helps organisations to shore up their processes and adds value, efficiency and effectiveness.
Data Governance Framework
A Data Governance Framework (DGF) documents the data rules, organisational role delegations and processes aimed at bringing everyone in the organisation onto the same page. Stemming from your Data Governance Strategy, the DGF will become the blueprint for achieving your organisational objectives. Not every DGF is the same, and it’s important that the size and scale of each is appropriate for the organisation in which it is being implemented.
DGF Core Elements
1. Mission & Objectives: As with any change initiative, data governance should begin with a mission statement told by a visionary and in your organisation’s language. This is where you need your chief data officers or equivalent to be able to stand up and hold the mantle, appealing to the hearts and minds of your organisation.
You should be able to list clear objectives accompanied by relevant goals, metrics and sponsorship. It is important that Data Governance is not seen as a project but as a wider program of work encompassing related disciplines, for instance master data management, data accuracy, data processes, and data visualization.
2. Identifying and prioritising data assets: There is often a temptation to have an all-encompassing data collection strategy, ingesting everything from the outset and then working out what’s important. This can often be a costly mistake since there are direct and indirect costs associated with ingesting data. An important question to ask is which data is critical; start from there. Your journey should consist of small, fast, incremental wins until you build enough momentum to scale your data governance initiatives more widely.
3. People and organisational bodies: A Data Governance Office, Steering Committee or equivalent body should be established with the authority to drive change. From a change management perspective, there should be a clear commitment to data governance from each member. For instance, John Kotter’s change model refers to a ‘powerful coalition’ with an ‘emotional commitment to change’ as an important ingredient in driving this type of initiative.
You’ll also need to identify your data stakeholders (e.g. technical and business stewards, platform owners, data owners, data users), ensuring these roles have been identified and verified. These stakeholders will need to be organised into an ‘Operating Model’; a good place to start is with a simple RACI. Be prepared to understand the associated change impact and manage the messaging accordingly.
4. Processes and control mechanisms: These should map and define the triggers, rules and steps needed to create, maintain, and delete various assets across the data lifecycle. They should include policies for the management of personal data and data security. This will be important in ensuring you can execute standard business processes, policy management, and lifecycle management.
5. Creating knowledge transfer and training programs: Once your Operating Model is established, various teams and stakeholders will need to be trained in how to use your new data governance policies and technologies to ensure that consistent data standards and terminologies are used whilst improving data literacy along the way.
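To make the ‘simple RACI’ mentioned under people and organisational bodies more concrete, here is a minimal sketch of how an Operating Model could be captured and sanity-checked in code. The activities and role assignments are hypothetical, and the rule applied (exactly one Accountable and at least one Responsible per activity) is the common RACI convention rather than anything specific to a particular tool.

```python
from collections import Counter

# Hypothetical activities and role assignments, for illustration only.
RACI = {
    "Define data quality rules": {
        "Data Owner": "A",
        "Data Steward": "R",
        "Platform Owner": "C",
        "Data User": "I",
    },
    "Approve glossary terms": {
        "Data Owner": "A",
        "Data Steward": "R",
        "Data User": "I",
    },
    "Grant access to a dataset": {
        "Data Owner": "R",
        "Platform Owner": "R",
        "Data User": "I",
    },
}


def raci_problems(matrix):
    """Return a list of human-readable problems found in a RACI matrix."""
    problems = []
    for activity, assignments in matrix.items():
        counts = Counter(assignments.values())
        unknown = set(counts) - set("RACI")
        if unknown:
            problems.append(f"{activity}: unknown codes {sorted(unknown)}")
        if counts["A"] != 1:
            problems.append(
                f"{activity}: expected exactly one Accountable, found {counts['A']}"
            )
        if counts["R"] < 1:
            problems.append(f"{activity}: no Responsible role assigned")
    return problems


problems = raci_problems(RACI)
for line in problems:
    print(line)
```

In this made-up matrix, the access activity has nobody Accountable, which is exactly the kind of gap worth surfacing before the Operating Model is communicated more widely.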
About Data Governance Tools
Data Governance software enables data practitioners to engage more effectively in data governance by providing the guidance, guardrails, and interfaces for effective data and information management plus management of customer data.
Modern data governance platforms are becoming more sophisticated, particularly in their use of AI and Machine Learning algorithms to detect and categorise sensitive data, and in workflow-based automation, data operations, data quality integration and unified cloud platform technologies.
Tool Readiness Checklist
Choosing a data governance tool requires some groundwork and there are certain steps that need to be taken before even considering procurement of a solution:
- A clear plan. Your data governance strategy should outline the business priorities and overall requirements. Based on this, you can clarify which subsets of data governance are in scope. For example, the requirement might just be a data catalogue solution while data quality is already covered.
- Primary users identified. It’s important to identify the champions that will be able to drive your initiative by creating momentum early on. This could be just one business team looking to leverage data for decisions, or an IT team with a specific series of use cases.
- As-is workflows and data architectures understood. Based on this, plus an understanding of your IT, information and data security strategies, you can decide whether the data governance tool should be on-premise or cloud-based. The data governance team needs to build knowledge of its data models, where data resides, and take stock of existing integrations, the current volume of data, expectations of scalability, and existing connectors between various applications and services.
Key Must-Have Features of a Data Governance Tool
The key features you should be looking out for include:
1. Data Discovery and automatic classification
A data governance tool must be able to automatically discover and ingest data from your organisation’s data lakes and data stores. This involves the ability to scan and profile systems and their various data structures across different technologies, plus provide categorisation of this data. Leading solutions offer ‘guided stewardship’, which provides automatic classification of data, including personally identifiable information, as it lands in your data governance solution.
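As a rough illustration of what automatic classification involves, the sketch below profiles a sample of values from each column and labels the column when most values match a known pattern. The patterns, the 80% threshold and the column names are all assumptions made for this example; commercial tools rely on far richer detection (including machine learning), so treat this as a toy model of the idea rather than how any product works.

```python
import re

# Illustrative patterns only; real tools use far richer detection than this.
PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\s().-]{7,20}$"),
}


def classify_column(values, threshold=0.8):
    """Return the first label whose pattern matches at least `threshold`
    of the non-empty sample values, or None if nothing qualifies."""
    sample = [str(v).strip() for v in values if v is not None and str(v).strip()]
    if not sample:
        return None
    for label, pattern in PATTERNS.items():
        hits = sum(1 for v in sample if pattern.match(v))
        if hits / len(sample) >= threshold:
            return label
    return None


# Hypothetical column samples, as a scanner might collect them.
columns = {
    "contact": ["ana@example.com", "li@example.org", None, "sam@example.net"],
    "mobile": ["+44 20 7946 0958", "(02) 5550 1234", "0400 000 000"],
    "notes": ["called twice", "prefers email", ""],
}
classification = {name: classify_column(vals) for name, vals in columns.items()}
print(classification)
```

Columns that come back with a label can then be routed to a steward for confirmation, which is the ‘guided’ part of guided stewardship.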
2. Data and Metadata management (MDM)
If your organisation doesn’t already have MDM in place, the solution you’re looking for must be able to track and manage this. Since metadata is data in the context of “who, what, where, why, when, and how”, a good data governance tool must have the ability to track and govern this information. The tool should be able to comprehensively answer the following types of context questions:
- Who created the data and who owns it?
- What is the security or privacy level of the data?
- Where did the data come from?
- Why, and for which business purpose, are we storing this data?
- When was the data created and last updated?
- How is the data formatted (string, integer, etc.)?
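One way to picture the who, what, where, why, when and how of metadata is as a single record per data asset. The sketch below is a minimal illustration; the class name, field names and sample values are our own assumptions and do not reflect any vendor’s data model.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AssetMetadata:
    """Context for one data asset, organised by the six questions."""
    name: str
    created_by: str      # who created it
    owner: str           # who owns it
    sensitivity: str     # what security or privacy level applies
    source_system: str   # where it came from
    purpose: str         # why it is stored
    created_on: date     # when it was created
    updated_on: date     # when it was last updated
    data_type: str       # how it is formatted

    def unanswered(self):
        """Names of text fields left blank, i.e. questions still open."""
        return [
            field
            for field, value in vars(self).items()
            if isinstance(value, str) and not value.strip()
        ]


# Hypothetical asset with the business purpose not yet documented.
customer_email = AssetMetadata(
    name="customer.email",
    created_by="crm_loader",
    owner="Head of Sales Operations",
    sensitivity="personal data",
    source_system="CRM",
    purpose="",
    created_on=date(2021, 3, 1),
    updated_on=date(2022, 1, 15),
    data_type="string",
)
print(customer_email.unanswered())
```

A check like `unanswered()` hints at why tooling matters: gaps in context become visible and can be assigned to someone to close.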
3. Data ownership and stewardship capabilities
Your Data Governance solution must be able to clearly assign and manage data ownership across your organisation to promote data management activities. This should include the ability for your Data Owners to oversee those activities themselves. Data stewards are in charge of maintaining data quality, typically across the six dimensions of data quality: accuracy, completeness, consistency, timeliness, validity, and uniqueness. An effective data governance tool should enable your stewards to manage these components through the use of dashboards, metrics and a task-based issue management system.
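To show how some of these dimensions turn into the metrics a steward might see on a dashboard, here is a small sketch that scores completeness, uniqueness and validity for a handful of made-up records. The formulas are deliberately simple assumptions (for example, uniqueness is the number of distinct values divided by the number of rows); real data quality rules are usually agreed with the data owner.

```python
import re

# Hypothetical records with a blank email, a malformed email and a duplicate id.
records = [
    {"id": 1, "email": "ana@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "not-an-email"},
    {"id": 3, "email": "li@example.org"},
]

EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def completeness(rows, field):
    """Share of rows where the field is present and non-empty."""
    return sum(1 for r in rows if r.get(field)) / len(rows)


def uniqueness(rows, field):
    """Number of distinct values in the field divided by the row count."""
    return len({r[field] for r in rows}) / len(rows)


def validity(rows, field, pattern):
    """Share of non-empty values that match the expected pattern."""
    values = [r[field] for r in rows if r.get(field)]
    if not values:
        return 1.0
    return sum(1 for v in values if pattern.match(v)) / len(values)


scores = {
    "email completeness": completeness(records, "email"),
    "id uniqueness": uniqueness(records, "id"),
    "email validity": validity(records, "email", EMAIL),
}
for metric, score in scores.items():
    print(f"{metric}: {score:.0%}")
```

Scores that fall below an agreed threshold are natural candidates for the task-based issue management described above.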
4. Data lineage
Data lineage tracks the origin of each data entity and helps to provide a visual representation of the upstream and downstream dependencies related to your data, information about any transformations it has gone through, and its movement within your organisation’s systems. This provides organisations with the ability to better manage system- and configuration-based changes. Since manually maintaining data lineage is cumbersome and error-prone, data governance tools must provide the capability of automatically tracing lineage.
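Conceptually, lineage is a directed graph: systems and datasets are nodes, and each data flow is an edge. The sketch below uses hypothetical dataset names and a plain breadth-first search to answer the two questions lineage is most often used for, namely what depends on a dataset (downstream) and what feeds it (upstream). It illustrates the idea only and is not a description of how any product harvests lineage.

```python
from collections import defaultdict, deque

# Hypothetical flows: each pair means "data moves from source to target".
EDGES = [
    ("crm.customers", "staging.customers"),
    ("erp.orders", "staging.orders"),
    ("staging.customers", "warehouse.sales"),
    ("staging.orders", "warehouse.sales"),
    ("warehouse.sales", "report.revenue"),
]


def build_graph(edges, reverse=False):
    """Map each node to the set of nodes it points to (or is fed by, if reversed)."""
    graph = defaultdict(set)
    for source, target in edges:
        if reverse:
            source, target = target, source
        graph[source].add(target)
    return graph


def reachable(graph, start):
    """Breadth-first search returning every node reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


downstream = reachable(build_graph(EDGES), "staging.orders")
upstream = reachable(build_graph(EDGES, reverse=True), "warehouse.sales")
print("Impacted by a change to staging.orders:", sorted(downstream))
print("Sources feeding warehouse.sales:", sorted(upstream))
```

The downstream set is what impact analysis looks at before a system or configuration change; the upstream set is what you trace when a report figure looks wrong.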
5. Workflow Automation
Governance solutions should offer workflows that enable your organisation’s data custodians to orchestrate the entry, validation, and approval of data changes using repeatable business processes. This will help to optimise your processes, minimise inefficiencies, and bring governance to the day-to-day management and use of master data whilst ensuring agile approval and communication of governance policies, workflows, and articles.
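The entry, validation and approval steps described here can be thought of as a small state machine. The following sketch uses hypothetical state and action names to show how a repeatable workflow might restrict which step can follow which, while keeping an audit trail of who did what. Actual tools typically let you model this graphically, so this is only an illustration of the underlying idea.

```python
# Hypothetical states and actions for a change request on a governed asset.
TRANSITIONS = {
    ("draft", "submit"): "in_validation",
    ("in_validation", "pass"): "awaiting_approval",
    ("in_validation", "fail"): "draft",
    ("awaiting_approval", "approve"): "published",
    ("awaiting_approval", "reject"): "draft",
}


class ChangeRequest:
    """A proposed change that must move through the workflow in order."""

    def __init__(self, asset):
        self.asset = asset
        self.state = "draft"
        self.history = []  # audit trail of (actor, action, new_state)

    def apply(self, actor, action):
        """Perform an action if the workflow allows it from the current state."""
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"'{action}' is not allowed from state '{self.state}'")
        self.state = TRANSITIONS[key]
        self.history.append((actor, action, self.state))
        return self.state


request = ChangeRequest("glossary term: Active Customer")
request.apply("steward", "submit")
request.apply("validator", "pass")
request.apply("owner", "approve")
print(request.state, request.history)
```

Because every transition goes through `apply`, an approval cannot be recorded for something that was never validated, which is the kind of guardrail a workflow engine provides.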
6. Self-service tools
Self-service tools are essential for organisations whose data governance goals are aligned more toward business teams. These tools must provide clean, intuitive and clutter-free representations of your data and other assets, with the ability to easily search and navigate through your catalogues, plus the ability to create user-specific experiences with reporting and alerting capabilities built in.
7. Clean User Interfaces and CX
The driving force of any effective, enterprise-grade data governance tool is the graphical representation of its assets. This includes the ability to cleanly visualise complex data concepts such as data lineage, data relationships, and data quality. Data governance tools must also visually illustrate information about policies, issues, and data pipelines. Ideally your solution should be customisable to suit your organisation’s branding.
8. Business & Report glossaries
A foundational aspect of your data governance plan is the creation of common data definitions and formats. Creating a common glossary of business terms and acronyms helps maintain consistency, for example when there are numerous nuanced descriptions of the same term. Use of a business glossary helps to formally document these in a way that makes it easier to converge terminologies and come to an agreement about the adoption of one single definition that best meets business needs. Intelligent governance tools also provide the capability of easily importing business terms you may have already constructed, e.g. on your website or intranet. Subsequently, your solution must allow you to relate these terms to your data dictionary.
An effective Data Governance solution will have the ability to integrate with your BI visualisation tools, e.g. Power BI and Tableau, and be able to build out report artefacts into an easy-to-consume Report Catalogue.
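As a small illustration of the convergence problem, the sketch below takes term definitions submitted by different teams and reports which terms have competing definitions and which are not yet related to anything in the data dictionary. The terms, teams and column names are invented for the example.

```python
from collections import defaultdict

# Hypothetical definitions gathered from different teams.
submitted = [
    ("Active Customer", "Sales", "Customer with an order in the last 12 months"),
    ("Active Customer", "Finance", "Customer with an unpaid or recent invoice"),
    ("Churn", "Sales", "Customer with no order in the last 12 months"),
]

# Hypothetical links from approved terms to data dictionary columns.
term_to_columns = {
    "Churn": ["warehouse.customers.churn_flag"],
}


def conflicting_terms(entries):
    """Return terms that have more than one distinct definition."""
    definitions = defaultdict(set)
    for term, _team, definition in entries:
        definitions[term].add(definition)
    return {t: sorted(d) for t, d in definitions.items() if len(d) > 1}


def unlinked_terms(entries, links):
    """Return terms that are not yet related to any data dictionary column."""
    return sorted({term for term, _team, _definition in entries} - set(links))


conflicts = conflicting_terms(submitted)
unlinked = unlinked_terms(submitted, term_to_columns)
print("Needs one agreed definition:", list(conflicts))
print("Not yet linked to the data dictionary:", unlinked)
```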
9. Compatibility with existing systems
Data governance tools are often bundled with other tools. Before deciding on a data governance tool, it is advisable to take stock of your existing capabilities and toolsets, and pay only for those features that complete your data governance framework and complement your strategic objectives. This means that the tool you pick must be compatible with, and ideally complementary to, your existing suite of products.
10. Compliance and Policy Management
The data governance tool must provide for external and internal audits, especially if compliance is one of the key goals of governance. Data-related regulatory laws such as GDPR require secure storage and maintenance of data, especially data linked to Personal Information / Personally Identifiable Information. Documentation of the data dictionary, data rules, and access controls goes a long way towards proving compliance. This can be done by generating intuitive reports or by allowing for specialised roles for external auditors with limited view capabilities to explore the data.
Top 5 Data Governance Tools for 2022
Based on our industry insights, these are the top 5 Governance solutions to look out for in 2022.
1. Collibra
Collibra is an enterprise-focused data governance platform that is known for its automated data governance and management solutions plus promotion of data stewardship. Billigence partnered with Collibra as an approved vendor in 2019 because of its ability to meet all of the aforementioned aspects of Data Governance. Moreover, in our opinion, Collibra is the only vendor that truly understands and offers its customers support with the journey and transformational aspects of data governance.
- Data cataloging: Powered by machine learning, Collibra’s Edge Server crawls through registered data sources to profile data and create the data catalog quickly and efficiently.
- Metadata management: Collibra’s data catalog allows users to discover, extract, and deliver metadata in a way that allows effective management of all of the ‘who, what, where, why, when, and how’ questions of MDM.
- Data integration: The Collibra Marketplace provides access to hundreds of supported connectors for metadata, ETL and BI ingestion. This makes it a highly compatible solution to fit into your organisation’s ecosystem.
- Data ownership and stewardship capabilities: Collibra is best-in-class for its provision of workflow automation, governance and stewardship tasks.
- Visualisation: It provides end-to-end lineage visualisation for both business and technical lineage, providing assistance for IT-centric and Business users alike.
- Lineage Harvesting: Collibra Data Lineage automatically maps relationships between data to show how it flows from system to system and how datasets are built, aggregated, sourced, and used.
- Business glossary: Collibra includes a comprehensive business glossary capability which allows you to easily bulk import your organisation’s existing terms and link these to your data elements.
- Flexibility and compatibility: Collibra provides contextual search, intuitive workflows, and data dashboards. It provides several report templates as well as customizable ones.
- Compliance audit readiness: It supports BCBS239, GDPR, CCPA, and other compliance efforts by tracking data flows.
- Data policy management: It allows users to create, review, and update data policies, including the ability to add your own classification attributes.
- Data security: All data profiling and data quality analysis is performed on your organisation’s Edge Server which means data never leaves your network and sensitive data is flagged and managed accordingly. User access is managed via SSO.
- Data Usage: Provides a comprehensive data usage registry which allows you to audit who had access to which systems between which dates, plus integrates with solutions such as ServiceNow to create seamless end-to-end data access.
- Role-Based Access: Collibra uses a highly flexible Community and Domain structure, plus an RBAC model, to manage views and access for users.
Usability: Billigence’s Data Governance Community of Practice have accrued the necessary skills and expertise to deploy and execute Collibra quickly and efficiently in a way that allows our customers to adopt and sustain their investment.
Price: Contact Billigence today for more information about pricing and demonstration. Collibra offer a free 14-day trial.
2. Alteryx Connect
Billigence also holds a partnership with Alteryx and is vastly experienced with utilising Alteryx Server to establish Data Governance solutions for our customers.
Alteryx Connect also provides: glossary, metadata store, lineage tracking, certification, and the ability to permission access to data asset definitions.
It is lighter from a compliance perspective but arguably has greater usability and ease of use. Rather than being based on compliance and regulation, Alteryx Connect ‘allows an information-based organisation to crowdsource sharing of datastores, along with their business context and technical metadata’.
Alteryx Connect enables you and the teams you support to discover relevant data assets quickly and explore the data to gain context, so everyone clearly understands their quality, lineage, and certifications.
- Asset Catalog: Assemble all your information in one place by collecting metadata from your information systems
- Business Glossary: Define your standard business terms in a data dictionary and link them to assets in the catalog
- Data Discovery: Discover the information you need to drive better business outcomes through powerful search capabilities
- Data Enrichment and Collaboration: Annotate, discuss, and rate information assets to provide business context and enable your organization with the most relevant data
- Certification and Trust: Understand the trustworthiness of information assets through certification, lineage, and versioning
Usability: Alteryx Connect relies on utilisation of the Alteryx server. Contact Billigence for a demonstration of capabilities.
3. Ataccama
We see Ataccama as having a strong presence in the data management and data quality market, but currently as a niche player when it comes to Data Governance. Nonetheless, we anticipate that it will improve on that front soon because of its recent acquisition of the Tellstory platform. Ataccama touts itself as a self-driven data management and governance platform as a service (PaaS). It provides AI-driven automated capabilities.
- Data cataloging: It uses automated data discovery to create data catalogs.
- Data management: It provides MDM capabilities. It also supports reference data management and data integration.
- Metadata management: Ataccama has a data profiling feature.
- Data ownership and stewardship capabilities: It provides role-based security, and is tailored specifically for data stewards, analysts, data engineers, and data scientists.
- Self-service: It provides overviews of entire datasets with shareable progress reports.
- Business glossary: Ataccama provides a business glossary along with the data catalog.
- Flexibility and compatibility: It works with on-premise, cloud, and hybrid environments. Ataccama works with many types of data stores, including big data ones such as Spark, AWS, MapR, Google, and Azure.
- Compliance audit readiness: It provides the entire audit history.
- Data policy management: Ataccama provides automated policy enforcement and automated assignment of business rules.
Usability: Users report that the platform often requires a consultant from Ataccama for set up, updates, and deployment. While automation works strongly in its favour, it falls short on intuitive visualisation.
4. Alation
Alation has evolved in the last year to showcase the data governance and privacy solutions it powers, which can be achieved through third-party integrations. They have started highlighting their support for a broad range of data intelligence use cases by combining machine learning with human insight to tackle the most demanding challenges in data management.
- Self-service: Wiki-style articles, which help non-technical users
- Data management: Advanced query log analysis, tagging, profiling and masking
- Advanced collaboration features: Use of @mentions and threaded conversations to drive data democratisation
- Flexibility and compatibility: Users can query the underlying data with a built-in query editor
- Self-service: Popularity based on query logs provides visibility on what data assets are accessed frequently
Alation customer stories highlight how enterprises leverage the data catalog as a platform for data search & discovery, data governance, data stewardship, analytics, and digital transformation. On the downside, users report that Alation lacks some of the core workflow functionality to automate business processes such as asset certification, onboarding data assets, proposing new business terms, and QA activities.
There is also limited technical lineage, and integration functionality is based on partnerships (not native), which causes limitations within complex multi-system environments. Moreover, it lacks enterprise-grade security, native PI discovery and permission-based features, and customers bear the burden of hosting and managing the infrastructure, security, etc. of their Alation environment as an on-cost.
5. Informatica
Informatica’s product suite includes Axon Data Governance, Enterprise Data Catalog, Informatica Data Quality, Data Privacy Management, and Master Data Management. They promote the platform as an integrated suite of products across the full data lifecycle. Traditionally Informatica has focused on technical users, and they have very strong relationships with IT teams. Recently, however, Informatica has shifted its messaging to emphasise data democratisation and business value promoted via its use of Version 7.2.
- Data cataloging: Informatica automatically scans across multi-cloud platforms, BI tools, ETL, third-party metadata catalogs, and data types to create a data catalog.
- Data management: It encapsulates master data management and AI-based integration patterns.
- Metadata management: It scans and indexes metadata.
- Data lineage: It supports automated data lineage tracing. It tracks data movement, from high-level system views to granular column-level lineage, and gets detailed impact analysis.
- Business glossary: It supports a business glossary of data definitions.
- Compliance audit readiness: It breaks down the silos and engages IT, security, and business teams to ensure that the data meets compliance such as GDPR.
Usability: Informatica has a steep learning curve. Users report that the solution is not easy to deploy and integrate with other tools and, in our experience, user adoption and business uptake are limited precisely because of its IT-centric and unbalanced focus. Whilst Informatica offers its Axon Data Governance as a free add-on to its suite of products, use of the Axon Data Governance tool may turn out to be more expensive than other stand-alone tools.
With exponential growth in data worldwide, it is inevitable that the stakes for managing data effectively will grow ever higher. Data Governance solutions such as those listed here allow companies to maintain data integrity and consistency as they scale up and grow. Consistent with the principles of Data Governance 2.0, Billigence believe that data drives business intelligence and empowers decision making and self-service, which is why we believe investing in a data governance solution is essential for organisations of all shapes and sizes.
Need help with Data Governance?
Contact us today to discuss your Data Governance needs with one of our experts.