Data Stewards as Problem Solvers: A Practical Tool for the Job

A Tool for Data Stewards to Fix Common Data Problems

Data stewardship is an essential role in modern data governance. Every data-driven organization needs stewards who can quickly resolve data management problems and challenges. Data stewards facilitate consensus about data definition, quality, and usage. They guide activities to complete metadata, improve data quality, and ensure regulatory compliance. Stewards are also responsible for making recommendations about data access, security, distribution, retention, archiving, and disposal.

Unfortunately, typical data stewardship practices often don’t measure up to the importance of the role. All too frequently, data stewards are identified and assigned responsibility without the time and training to do the job well. When we designate busy people as data stewards without making time for them to do stewardship work, we should not expect high-impact results. Nor should we expect success without training stewards in the roles, relationships, and accountabilities related to data.

Along with time and training, data stewards need tools that help them do their work. This article offers a simple tool to help diagnose data problems, find the path from symptoms to causes, and then get from causes to solutions. The tables below identify common symptoms of data challenges that data stewards frequently encounter, grouped by ten core data management processes: naming data, defining data, designing data, managing quality, integrating data, accessing data, managing metadata, administering databases, managing systems, and governing data. Common causes and solutions are identified for each process.

To use the tool, begin by browsing the index of common symptoms to find those related to your data management issues. Then use the associated numbers to find each symptom in the process tables. Note that a single symptom is often listed in several of the process tables. Explore the processes, causes, and solutions to develop problem-solving ideas and plans.
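The lookup described above can also be sketched in code. The following is a minimal illustration, not part of the original tool: the names `ProcessTable`, `INDEX`, `TABLES`, and `diagnose` are hypothetical, and the data is an abridged excerpt of the index and of two process tables. It shows how a symptom leads to its numbers, how the numbers lead to process tables, and how each table supplies candidate causes and solutions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessTable:
    """One process table: the symptom numbers it covers, plus causes and solutions."""
    name: str
    numbers: range
    causes: tuple[str, ...]
    solutions: tuple[str, ...]


# Abridged excerpt of the symptom index: symptom -> symptom numbers.
INDEX = {
    "misunderstood data": (15, 69),
    "hard to find data definitions": (14,),
}

# Abridged excerpt of two process tables (only a few causes and solutions each).
TABLES = (
    ProcessTable(
        name="Defining Data",
        numbers=range(10, 19),  # symptoms 10 through 18
        causes=("lack of data definition standards", "lack of business participation"),
        solutions=("data definition standards", "business/tech collaboration"),
    ),
    ProcessTable(
        name="Managing Metadata",
        numbers=range(62, 72),  # symptoms 62 through 71
        causes=("casual metadata management", "undocumented changes"),
        solutions=("metadata templates & guidelines", "metadata accountability"),
    ),
)


def diagnose(symptom: str) -> list[ProcessTable]:
    """Return each process table that lists the symptom, without duplicates."""
    numbers = INDEX.get(symptom.strip().lower(), ())
    found: list[ProcessTable] = []
    for number in numbers:
        for table in TABLES:
            if number in table.numbers and table not in found:
                found.append(table)
    return found


if __name__ == "__main__":
    for table in diagnose("misunderstood data"):
        print(table.name)
        print("  causes:   ", ", ".join(table.causes))
        print("  solutions:", ", ".join(table.solutions))
```

Because one symptom can appear in several tables, the function returns a list of processes rather than a single answer. Each returned process is a starting point for exploring causes and solutions, not a verdict.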

Index: Common Symptoms of Data Problems

application integration difficulty: 47
business rule violations in data: 31, 40, 90
can’t access needed data: 52
complex system interfaces: 48, 81
conflicting documentation: 64
confusing abbreviations: 5
confusing documentation: 67
corrupted data can’t be repaired: 79
data consolidation difficulties: 98
data not available when needed: 58
data ownership conflicts: 99
data privacy compromised: 53, 55, 89, 93
data retention/disposal uncertainty: 97
data security compromised: 54, 88, 92
data-related compliance violations: 94
difficult-to-use data: 28, 37
disaster recovery uncertainties: 95
enterprise reporting difficulty: 46
excessive database downtime: 76
failure to meet business needs: 21
hard to find data definitions: 14
hard to find documentation: 65
hard to find needed data: 51
hard-to-identify data: 8
hard-to-navigate databases: 29, 59
high level of data disparity: 9, 17, 25, 43, 70, 85
high level of data redundancy: 18, 27, 71, 84
inadequate metadata: 87
inappropriate use of data: 16
incomplete data: 36
incomplete documentation: 63
incorrect data: 34
incorrect data definitions: 11
incorrect data names: 3
incorrect reporting: 24, 38
inefficient business analysis: 26, 39
insufficient data storage capacity: 72
lack of data definitions: 10
lack of trust in data: 32
large change request backlog: 20
limited data sharing: 49
lost data can’t be recovered: 80
meaningless data definitions: 12
meaningless data names: 1
missing documentation: 62
misunderstood data: 15, 69
multiple names & aliases: 6
need for data standardization: 100
needed access not authorized: 56
needed features not implemented: 78
non-unique data names: 2
obsolete data definitions: 13
obsolete permissions still active: 57
outdated documentation: 66
overlapping and conflicting data: 44
poor application performance: 83
poor data access performance: 60
poor data quality: 86, 91
poor database performance: 30
poor query performance: 74
poor structural integrity: 19, 33
poor update performance: 75
shadow databases: 41
shadow systems & databases: 23
spreadsheet proliferation: 22, 42, 50, 61
structureless data names: 4
territorialism inhibits data sharing: 96
unanticipated growth problems: 73
unnamed data components: 7
unreliable database connections: 77
untimely data: 35
untraceable data: 45, 68, 82


Naming Data


Symptoms

1. meaningless data names
2. non-unique data names
3. incorrect data names
4. structureless data names
5. confusing abbreviations
6. multiple names & aliases
7. unnamed data components
8. hard-to-identify data
9. high level of data disparity

Causes

  • informal naming practices
  • lack of naming standards

Solutions

  • standards
  • data naming taxonomy
  • data naming vocabulary
  • standard naming structure
  • standard abbreviations list
  • compliance incentives

Defining Data


Symptoms

10. lack of data definitions
11. incorrect data definitions
12. meaningless data definitions
13. obsolete data definitions
14. hard to find data definitions
15. misunderstood data
16. inappropriate use of data
17. high level of data disparity
18. high level of data redundancy

Causes

  • lack of data definition standards
  • poor data definition practices
  • lack of business participation
  • legacy databases
  • disparate metadata

Solutions

  • data definition standards
  • data definition templates
  • data definition wiki
  • business/tech collaboration
  • data definition review
  • metadata repository
  • definitions system-of-record

Designing Data


Symptoms

19. poor structural integrity
20. large change request backlog
21. failure to meet business needs
22. spreadsheet proliferation
23. shadow systems & databases
24. incorrect reporting
25. high level of data disparity
26. inefficient business analysis
27. high level of data redundancy
28. difficult-to-use data
29. hard-to-navigate databases
30. poor database performance
31. business rule violations in data

Causes

  • poor modeling techniques
  • wrong choice of model type
  • poor business representation
  • excessive detail
  • insufficient detail
  • process-oriented design
  • application-oriented design

Solutions

  • data model standards
  • E-R model guidelines
  • dimensional model guidelines
  • normalization guidelines
  • atomic data guidelines
  • aggregate data guidelines
  • subject-oriented design
  • consumer-oriented design

Managing Data Quality


Symptoms

32. lack of trust in data
33. poor structural integrity
34. incorrect data
35. untimely data
36. incomplete data
37. difficult-to-use data
38. incorrect reporting
39. inefficient business analysis
40. business rule violations in data
41. shadow databases
42. spreadsheet proliferation

Causes

  • poorly defined DQ rules
  • missing DQ rules
  • absence of quality measures
  • absence of quality reporting
  • lack of accountability
  • incomplete/incorrect edits

Solutions

  • DQ rules taxonomy
  • defined DQ rules
  • DQ metrics and measures
  • published DQ reports
  • regular DQ audits
  • designated DQ accountability
  • DQ tasks in project plans

Integrating Data


Symptoms

43. high level of data disparity
44. overlapping and conflicting data
45. untraceable data
46. enterprise reporting difficulty
47. application integration difficulty
48. complex system interfaces
49. limited data sharing
50. spreadsheet proliferation

Causes

  • lack of integration architecture
  • technology-driven integration
  • inadequate data warehouse
  • absence of data marts
  • unmanaged master data
  • poor integration practices
  • missing/wrong data sources

Solutions

  • sound integration architecture
  • business-driven integration
  • sound warehousing design
  • targeted data marts
  • master data management
  • integration best practices
  • defined data sourcing criteria

Accessing Data


Symptoms

51. hard to find needed data
52. can’t access needed data
53. data privacy compromised
54. data security compromised
55. data privacy compromised
56. needed access not authorized
57. obsolete permissions still active
58. data not available when needed
59. hard-to-navigate databases
60. poor data access performance
61. spreadsheet proliferation

Causes

  • missing metadata
  • inadequate data access tools
  • insufficient indexing
  • inadequate search capability
  • lack of content management
  • poor user interface
  • excessive downtime
  • database design not user-friendly
  • ineffective performance tuning
  • ineffective security processes

Solutions

  • robust metadata
  • user-friendly tools and interfaces
  • indexing and searching
  • data access portals
  • service level agreements
  • service level accountability
  • published service level metrics
  • security policies & procedures
  • periodic security/privacy audits
  • security/privacy accountability

Managing Metadata


Symptoms

62. missing documentation
63. incomplete documentation
64. conflicting documentation
65. hard to find documentation
66. outdated documentation
67. confusing documentation
68. untraceable data
69. misunderstood data
70. high level of data disparity
71. high level of data redundancy

Causes

  • casual metadata management
  • fragmented metadata tools
  • lack of documentation standards
  • lack of data modeling standards
  • undocumented changes
  • no documentation incentives
  • no documentation reviews
  • “rush to production” projects

Solutions

  • metadata templates & guidelines
  • project metadata standards
  • maintenance metadata standards
  • metadata registries & portals
  • metadata system-of-record
  • metadata accountability
  • metadata tasks in project plans
  • incentives and reviews

Administering Databases


Symptoms

72. insufficient data storage capacity
73. unanticipated growth problems
74. poor query performance
75. poor update performance
76. excessive database downtime
77. unreliable database connections
78. needed features not implemented
79. corrupted data can’t be repaired
80. lost data can’t be recovered

Causes

  • ineffective storage management
  • passive growth management
  • ineffective performance tuning
  • unscheduled maintenance
  • inadequate database connectivity
  • outdated DBMS versions
  • insufficient backup & recovery

Solutions

  • continuous capacity planning
  • proactive growth management
  • performance SLAs
  • availability/uptime SLAs
  • connection protocol standards
  • connectivity SLAs
  • routine DBMS upgrades
  • backup & recovery practices

Managing Systems


Symptoms

81. complex system interfaces
82. untraceable data
83. poor application performance
84. high level of data redundancy
85. high level of data disparity
86. poor data quality
87. inadequate metadata
88. data security compromised
89. data privacy compromised
90. business rule violations in data

Causes

  • lack of data sharing architecture
  • lack of integration architecture
  • poor application design
  • “quick fix” maintenance
  • “misfit” acquired systems
  • inconsistent data formats
  • limited reuse of data functions
  • testing with production data

Solutions

  • application architecture standards
  • application design review
  • maintenance & testing standards
  • application acquisition guidelines
  • data sharing incentives
  • database wrappers
  • SOA-based access & update
  • reusable data quality rules
  • managed test data

Governing Data


Symptoms

91. poor data quality
92. data security compromised
93. data privacy compromised
94. data-related compliance violations
95. disaster recovery uncertainties
96. territorialism inhibits data sharing
97. data retention/disposal uncertainty
98. data consolidation difficulties
99. data ownership conflicts
100. need for data standardization

Causes

  • lack of data management goals
  • unclear, uncertain, ambiguous, or misaligned RAA (responsibility, authority, & accountability)
  • poor P&P (policies & procedures)
  • understaffed data management
  • underfunded data management

Solutions

  • “data as an asset” culture
  • clear data management goals
  • quality RAA + P&P
  • security RAA + P&P
  • privacy RAA + P&P
  • compliance RAA + P&P
  • disaster recovery RAA + P&P
  • designated data ownership

Ultimately, data stewards are problem solvers. In the best of circumstances, they can intervene early in data management processes to prevent problems, but realistically they spend much of their time resolving problems that already exist. Effective problem solving for data issues depends on diagnosis, causal analysis, and solution planning. This tool is designed with those goals in mind, giving structure to the path from symptoms to causes and from causes to solutions.

Dave Wells

Dave Wells is an advisory consultant, educator, and industry analyst dedicated to building meaningful connections throughout the path from data to business value. He works at the intersection of information...