Poor Data Quality is a Full-Blown Crisis: A 2024 Customer Insight Report

ABSTRACT: Despite $180 billion spent on big data tools and technologies, poor data quality remains a significant barrier for businesses, especially in achieving Generative AI goals.
It’s no secret that organizations are spending big in an attempt to be data-driven. With nearly $180 billion invested in big data tools and technologies, companies are going all-in.
However, despite these investments, poor data quality, siloed information, and lack of proper governance continue to undermine their efforts. Instead of meeting their AI/ML goals on time, many organizations find themselves drowning in inconsistent, incomplete, and unreliable data—leading to flawed analyses, missed opportunities, and costly mistakes.
A recent report, “Inside 2024's Data Quality Challenge: Insights from the Frontlines,” published by WinPure, a data quality management platform, highlights some of the key struggles line managers and executives face when resolving data quality issues. The report summarizes conversations held with more than 100 prospects over the span of a year, providing insight into the mounting challenges customers face with their data.
Some critical insights are summarized in this post.
Dirty Data is No Longer a Passing Problem: It’s a Crisis.
Messy, duplicated, and fragmented data has always been a problem for businesses, but it has now spiralled into a full-blown crisis. With AI initiatives gaining momentum and the pace of CRM and ERP migrations accelerating, the dirty data dilemma is a direct threat to business goals and the success of AI projects.
Customers report increased frustration with challenges such as:
Data that has not been cleaned or updated for years.
Duplicated name and address data in the thousands.
Data that is no longer valid or correct, such as contacts who have left an organization.
Variations in business names, personal names, and addresses affect a majority of customers, accounting for 60% of the data quality issues organizations experience.
Similarly, duplicate records remain one of the most persistent and costly issues in data management. In CRM systems, duplication rates can reach 20%, creating confusion in customer records and impeding operational efficiency. Worse, duplication inflates data volumes unnecessarily, increases processing times, and introduces errors in reporting and analytics.
Unfortunately, the report also indicates that professionals in 65% of companies still rely on manual methods like Excel to scrub their data, painstakingly going through rows and columns to identify duplicates, errors, and inconsistencies.
Teams spend countless hours applying formulas, highlighting discrepancies, and cross-referencing datasets, often without dedicated tools to streamline the process. This dependence introduces inefficiencies and highlights the gap between modern data challenges and available tools.
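For teams stuck in spreadsheets, even a small script can automate the most painful parts of this work. Below is a minimal sketch of that kind of clean-up in Python with pandas; the column names and sample rows are invented for illustration and would need adapting to a real schema.

```python
# Minimal sketch: replacing manual Excel de-duplication with a few
# lines of pandas. Column names and sample data are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "name":  ["Acme Corp.", "ACME Corp", "Globex Inc", "Globex Inc."],
    "email": ["sales@acme.com", "sales@acme.com", "info@globex.com", "info@globex.com"],
    "city":  ["London", "london ", "Berlin", "Berlin"],
})

# Normalize the obvious variations before comparing: case, stray
# whitespace, and trailing punctuation in business names.
normalized = df.assign(
    name=df["name"].str.lower().str.strip().str.rstrip("."),
    email=df["email"].str.lower().str.strip(),
    city=df["city"].str.lower().str.strip(),
)

# Flag duplicates on the normalized columns, keeping the first record.
dupes = normalized.duplicated(subset=["name", "email", "city"], keep="first")
print(df[dupes])    # the rows a manual reviewer would have to spot by eye
clean = df[~dupes]  # de-duplicated dataset
```

Even this naive approach (exact matching after light normalization) catches the variants a reviewer would otherwise chase by hand; fuzzier cases call for the matching techniques discussed further below.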
Why Has Poor Data Become a Bottleneck?
Because poor data is rarely seen as a threat or a pressing concern until it impacts a critical business function or goal.
For example, one company only noticed they had a dirty data challenge when their direct mail campaign produced a return rate of over 30%, translating directly into wasted spend.
Similarly, another company only found out they had thousands of duplicate records when sales insights did not coincide with revenue projections, leading to confusion during quarterly reporting.
This is a textbook example of common data quality issues: the discrepancies were traced back to duplicate customer entries, which inflated sales activity metrics while misrepresenting the revenue actually generated.
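To make the inflation effect concrete, here is a toy illustration; every name and figure below is invented purely to show the arithmetic.

```python
# Toy illustration: duplicate customer entries inflate activity and
# pipeline metrics by double-counting. All figures are invented.
import pandas as pd

crm = pd.DataFrame({
    "customer": ["Acme Corp", "ACME Corp", "Globex", "Initech"],  # first two are the same company
    "open_opportunities": [3, 3, 5, 2],
    "pipeline_value": [50_000, 50_000, 120_000, 30_000],
})

print(crm["open_opportunities"].sum())  # 13 opportunities reported
print(crm["pipeline_value"].sum())      # 250,000 in pipeline reported

# Collapse the duplicate on a normalized key and recompute.
deduped = crm.assign(key=crm["customer"].str.lower()).drop_duplicates("key")
print(deduped["open_opportunities"].sum())  # 10, the real figure
print(deduped["pipeline_value"].sum())      # 200,000, the real figure
```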
It is extremely rare for a business to proactively address data quality challenges without a clear and immediate trigger because most organizations operate with the assumption that their data is "good enough" until proven otherwise.
This perception persists until a significant project or goal exposes the underlying problems, forcing businesses to address what they previously ignored. By then, the stakes are higher and the time to fix the issues is limited.
What ensues is a reactive effort to clean up the data, often relying on manual methods or patchwork solutions that address only the immediate need.
Once the project is completed, the focus shifts back to day-to-day operations, and the data quality challenge is pushed to the background once again until the next alarm is triggered. This cycle of neglect and urgency is a common pattern, highlighting why businesses struggle to maintain consistently clean, reliable data.
Biggest Data Quality Struggles
Behind every data quality challenge lies frustration, wasted time, and the inability to fix these issues with efficiency. In analysing conversations with customers, the report identifies the top five common data quality challenges:
1. Duplicate and Inconsistent Data: 70% of customers struggle with matching records due to the lack of data matching technologies.
2. Integration Across Multiple Data Sources: 65% of customers need tools that integrate smoothly with their existing systems, like Salesforce and various databases.
3. Lack of Proactive Data Cleaning & Governance: 65% express the need for customizable cleaning processes to standardize their data effectively.
4. Manual, Time-Intensive Processes: 50% of customers need entity-centric matching to link records based on multiple attributes, going beyond simple one-to-one column matching (see the sketch after this list).
5. Compliance and Regulatory Challenges: 40% of customers face difficulties with inconsistent data formats that affect matching accuracy.
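The entity-centric matching mentioned in point 4 can be illustrated in a few lines: rather than demanding an exact match on a single column, score several attributes and combine the results. The sketch below uses Python's standard difflib; the field names, weights, and threshold are assumptions chosen for illustration, not the report's methodology.

```python
# Minimal sketch of entity-centric matching: score a pair of records
# across several attributes and combine the scores, instead of
# requiring an exact match on one column. Field names, weights, and
# the 0.85 threshold are illustrative assumptions.
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Normalized string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

WEIGHTS = {"name": 0.5, "address": 0.3, "phone": 0.2}

def match_score(rec_a: dict, rec_b: dict) -> float:
    """Weighted similarity across all compared attributes."""
    return sum(w * similarity(rec_a[f], rec_b[f]) for f, w in WEIGHTS.items())

a = {"name": "WinPure Ltd", "address": "12 High St, London", "phone": "020 7946 0000"}
b = {"name": "Winpure Limited", "address": "12 High Street, London", "phone": "02079460000"}

score = match_score(a, b)
print(f"{score:.2f}")  # high combined score despite no single field matching exactly
if score >= 0.85:
    print("likely the same entity")
```

In practice, dedicated matching engines typically layer phonetic encoding, blocking for scale, and survivorship rules on top of this basic idea, but the weighted multi-attribute score is the core of it.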
These are just some of the surface-level challenges companies face. Deeper issues, such as the inability to resolve duplicates efficiently, the lack of budget for setting up MDM infrastructure, and the difficulty of mapping data between internal and external sources for insights, remain critical, largely because of limited access to data matching technology that is safe, secure, and easy to use.
What Customers Expect from DQM Solutions
As the data quality challenge grows in scale and complexity, customers are becoming more conscious about what they expect from Data Quality Management (DQM) solutions. They are no longer looking for generic surface-level SaaS tools but instead expect reliable, efficient systems that address their unique challenges.
Key among these is the expectation for solutions that are both powerful and user-friendly, with an easy-to-use interface capable of handling the intricacies of modern data without requiring extensive technical expertise.
Customers value simplicity in design but demand advanced data cleaning and matching capabilities, particularly for resolving duplicate records, standardizing inconsistent formats, and integrating data from multiple sources.
In addition to functionality, customers emphasize the importance of solutions that provide long-term value. This means tools that not only solve immediate challenges but also support sustainable practices such as ongoing data governance. With increased awareness of the risks posed by poor data quality, many businesses expect DQM tools to assist in implementing clear policies for data maintenance, compliance, and usage.
Ultimately, customers want solutions that don’t just react to data issues but proactively help prevent them, ensuring that their data becomes a reliable asset rather than a persistent problem.
According to the report's analysis, key customer requirements when choosing data quality tools include:
Customizable Tools: Tools that allow tailoring matching rules, workflows, and filters to unique data structures are in demand.
Ease of Use: Customers want solutions with plug-and-play options that don’t require extensive user training or environment setup.
Automation: 85% of customers express a strong need for automation to reduce the reliance on manual processes.
Scalability: Tools must handle large datasets and multi-source data integration across departments and regions without performance degradation.
Cost-Effectiveness: Customers prioritize solutions with flexible pricing models that align with their budgets and can manage large datasets efficiently without incurring excessive costs.
As simple as this sounds, most data management solutions are difficult to use and require extensive skills and training. More importantly, users rarely get a holistic experience where they can clean, match, merge/purge, and consolidate records in one platform. Data quality remains a persistent challenge, and the complexity and resource demands of modern technologies make them a double-edged sword.
To Conclude: Data Quality is Critical, But It Needs an Effective Approach!
The stakes around data quality have become too high for businesses to rely on manual or outdated methods. Organizations that proactively cleanse, consolidate, and standardize their data stand to maximize their investments in CRM systems, AI analytics, and compliance programs. For companies to get the most out of their data, data quality needs to be an ongoing process, supported by the right tools, the right people, and the right priorities.