Real-Time De-Duplication for a Better Customer Experience in Insurance
A leading financial services company partnered with Agile Lab to resolve critical data inconsistencies caused by fragmented data silos. By implementing a real-time deduplication engine and establishing a "Golden Record" for each customer, the company created a single source of truth that transformed its operational efficiency and customer engagement capabilities.
Customer Context
Our client, a Financial Services company, grappled with significant challenges stemming from fragmented customer data. Their data was spread across numerous disconnected systems and silos, creating a disjointed and often contradictory view of their customer base.
This lack of a unified customer profile directly impacted their ability to deliver a consistent and effective customer experience across various touchpoints. Furthermore, it hindered critical business functions such as real-time personalization, accurate analytics, and proactive engagement, while also exposing the organization to operational inefficiencies and compliance risks.
The Challenge
The client faced several challenges with customer data accuracy and consistency that undermined an effective customer experience. Data was collected in separate silos, yielding a disjointed overall view of each customer.
Customer data was fragmented across multiple systems, leading to operational inefficiencies and preventing a full 360° view. Duplicate, outdated, and inconsistent records across databases produced inconsistent customer experiences across touchpoints, inaccurate analytics, and compliance risks, since data governance policies such as GDPR and data retention standards could not be enforced.
Furthermore, the lack of real-time processing and decision support delayed insight generation, making it impossible to support fast personalization, fraud detection, and dynamic customer segmentation. The absence of event-driven architectures and stream processing, combined with no Single Source of Truth (SSOT) guaranteeing data consistency, integrity, and lineage, prevented proactive customer engagement.
The Solution
Our key goal was to establish the Golden Record: one customer, one identity. Getting there, however, required a couple of steps involving de-duplication and duplicate checks.
1. Data De-Duplication
The foundation of a strong customer data platform is an initial, comprehensive de-duplication pass. Going over the entire database, from website data to third-party tools and everything in between, was crucial.
This clean state also had to be preserved going forward, not achieved once and then allowed to degrade.
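The initial de-duplication pass over an existing database can be sketched as a blocking step followed by pairwise matching within each block. The sketch below is illustrative only: the field names, blocking key, and match rule (shared email or phone) are assumptions, not the client's actual matching logic, which would typically use richer fuzzy-matching rules.

```python
from itertools import combinations

def normalize(record):
    """Canonicalize the fields used for matching (illustrative rules only)."""
    return {
        "name": record.get("name", "").strip().lower(),
        "email": record.get("email", "").strip().lower(),
        "phone": "".join(ch for ch in record.get("phone", "") if ch.isdigit()),
    }

def block_key(record):
    """Cheap blocking key so we only compare records that could plausibly match."""
    n = normalize(record)
    return (n["email"].split("@")[-1], n["name"][:1])

def is_duplicate(a, b):
    """Two records match if they share a normalized email or phone number."""
    na, nb = normalize(a), normalize(b)
    return (na["email"] != "" and na["email"] == nb["email"]) or \
           (na["phone"] != "" and na["phone"] == nb["phone"])

def find_duplicate_pairs(records):
    """Group records by blocking key, then compare only within each block."""
    blocks = {}
    for i, r in enumerate(records):
        blocks.setdefault(block_key(r), []).append(i)
    pairs = []
    for ids in blocks.values():
        for i, j in combinations(ids, 2):
            if is_duplicate(records[i], records[j]):
                pairs.append((i, j))
    return pairs
```

Blocking keeps the comparison cost manageable: instead of comparing every record against every other, only records sharing a cheap key are compared.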
2. NRT Duplicate Checks
Every time a new customer record enters the system, whether from a website sign-up, a purchase, or a support request, the data processing engine automatically checks for duplicates in near real-time (NRT). This eliminates duplicate communications and conflicting promotions, reducing the operational inefficiencies caused by redundant records and enabling faster, more personalized customer interactions without unnecessary confusion.
3. Establishing the Golden Record
The Golden Record consolidates all available data points across multiple systems into a unified, trusted representation of the customer, keeping the customer view up to date as a living, breathing profile.
4. Offload Paradigm
Our proposed solution was an offload paradigm based on CDC (Change Data Capture) mutations (inserts, updates, deletes) streamed into dedicated Kafka topics.
The reconstruction layer extracted the customer information of interest and was conceived in multiple phases: cleansing, normalizing, and harmonizing the data; combining the entities required to build a denormalized view of the customer; pushing that information to the deduplication service, which matches and merges the different components; and finally producing the Golden Record at the end of the processing pipeline.
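The core of the CDC offload is folding a stream of mutations into the current customer view. The sketch below assumes a hypothetical event shape (`op`, `key`, `data`) loosely modeled on common CDC formats and consumes events from a plain Python list for illustration; in production these events would arrive on the dedicated Kafka topics described above.

```python
def apply_cdc_events(state, events):
    """Fold a stream of CDC mutations into the current customer view.

    Each event is assumed to look like:
        {"op": "insert" | "update" | "delete", "key": <customer id>, "data": {...}}
    The event shape is an illustrative assumption, not a specific CDC format.
    """
    for ev in events:
        if ev["op"] == "delete":
            state.pop(ev["key"], None)       # drop the customer entirely
        elif ev["op"] == "insert":
            state[ev["key"]] = dict(ev["data"])
        else:  # update: merge changed fields into the existing row
            state.setdefault(ev["key"], {}).update(ev["data"])
    return state
```

Because the fold is keyed by customer id, downstream phases (cleansing, denormalization, deduplication) always see the latest state of each entity regardless of how many mutations produced it.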
Powering Digital Transformation through Data Platform Enablement



Data-driven organizations are three times more likely to report significant improvements in decision-making speed, helping them respond faster to market changes.
(Source: Harvard Business School)
Data platforms can allow companies to realize cost savings of up to 15% through minimized redundancies, optimized resource utilization, and streamlined processes.
(Source: McKinsey & Company)
Companies focusing on structured data management can improve data accuracy and consistency by 10-20% through centralized data platforms.
(Source: McKinsey & Company)
Real-World Impact and Benefits
The project delivered several key benefits, cascading from a strong data platform foundation:
| Operational Area | Before Implementation | After Implementation |
|---|---|---|
| Customer Data View | Customer data was fragmented across multiple silos, creating a disjointed and inconsistent view. | A unified "Golden Record" provides a single, authoritative, and trusted 360° view of each customer. |
| Data Quality & Consistency | Databases were filled with duplicate, outdated, and inconsistent records, leading to inaccuracies. | A real-time deduplication engine cleanses, normalizes, and harmonizes data, ensuring a single, accurate dataset. |
| Customer Experience | An inconsistent experience across touchpoints, with an inability to support fast personalization. | Enables faster, more personalized interactions and proactive customer engagement without confusion. |
| Analytics & Decision-Making | Inaccurate analytics and delays in generating insights hindered effective and timely decision-making. | Optimized decision-making, precise customer segmentation, and predictive analytics are powered by a reliable dataset. |
| Data Architecture | Lacked a Single Source of Truth, real-time processing, and an event-driven architecture. | An event-driven architecture using CDC provides a living, breathing customer profile updated in near real-time. |
| Governance & Compliance | Inability to effectively enforce data governance policies like GDPR, creating significant compliance risks. | Enhanced data integrity and a unified customer record establish a foundation for strong data governance and compliance. |
Conclusions
By integrating real-time deduplication, a golden record strategy, and continuous data synchronization, the solution establishes a unified and authoritative view of customer data across all business functions. This ensures that marketing, sales, support, and analytics operate on a single, consistent, and accurate dataset, eliminating discrepancies caused by fragmented or redundant records.
Beyond data cleansing, this approach enhances data integrity, optimizes decision-making, and enables more precise customer segmentation, predictive analytics, and personalized engagement. Ultimately, it transforms the data architecture into a strategic asset, driving operational efficiency and improving the overall customer experience.