Introducing the Data Deduplication 101 Blog Series
Duplicate records are a data quality issue that every marketing ops, sales ops, and marketing database administrator is familiar with and has experienced firsthand. Data deduplication is simple in concept but can be quite complex in execution, especially when dealing with records distributed across multiple systems. We couldn’t find any good, detailed how-to material on deduplication, so we decided to write down what we know and share the knowledge.
We’ll be publishing a series of blog posts on this topic over the course of the next few months, covering all aspects of what you need to think about before, during, and after a data deduplication project. We’ll cover people, process, and system-level issues, and the different approaches to deduplicating your marketing and sales data.
List of Upcoming Data Deduplication Topics
Here’s a list of upcoming blog topics in this series. If there are specific topics you want us to cover that are not on this list, please let us know and we can add them.
- Why do you need to dedupe?
- How people and process impact your data deduplication approach
- Pre-dedupe checklist
- Identifying duplicates
- Determining which duplicate record to keep
- Merging duplicate records
- Manual reviewing and overwriting best practices
- Legitimate dupes
- System level considerations
- How to deal with bad data that can prevent merging
- Considering manual/batch vs. automated/continuous approaches
- Data deduplication prevention tactics and tools
To start off the series we’ll cover two topics in this first post.
Where Do Dupes Come From?
Data deduplication is not a one-time exercise. An effective dedupe program is implemented as a continuously running program with real-time technologies, and usually in a few different forms. This is because duplicate records can trickle in from multiple sources. Here are the usual suspects:
- Sales team adding new leads and contacts to CRM without checking existing data
- Sales team bulk loading lists they have acquired outside of marketing
- Marketing team loading lists
- Marketing technologies inserting data
- Broken synchronization between systems that leave stranded records
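The first source above, adding records without checking existing data, is the easiest one to illustrate. Here is a minimal Python sketch, assuming records are matched on a normalized email address. The `normalize_email` and `add_lead` helpers and the in-memory `crm` dictionary are hypothetical stand-ins for illustration, not any CRM's actual API.

```python
def normalize_email(email: str) -> str:
    """Lowercase and trim an email so trivial variations compare equal."""
    return email.strip().lower()


def add_lead(crm: dict, lead: dict) -> bool:
    """Add a lead only if no record with the same normalized email exists.

    Returns True if the lead was inserted, False if it was a duplicate.
    """
    key = normalize_email(lead["email"])
    if key in crm:
        return False  # an existing record already uses this email
    crm[key] = lead
    return True


# Hypothetical in-memory "CRM" keyed by normalized email
crm = {}
first = add_lead(crm, {"name": "Ann Lee", "email": "ann.lee@example.com"})
second = add_lead(crm, {"name": "Ann Lee", "email": " Ann.Lee@Example.com "})
```

Matching on email alone is a simplification. It will miss cases such as trial users who sign up more than once with different addresses.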
Why Do You Need to Dedupe?
Few people dedupe for the fun of it, although we do know a few who shall remain nameless. By the way, if you do, we are starting a support group called “Dedupe Anonymous” that you are welcome to join, and we would love to talk to you about working for Openprise. 🙂 Most people spend effort and money on deduping for good business reasons, and here are the top ones we see.
Multiple Reps Calling on the Same Account and Lead
This is probably the number one driver cited by our customers for data deduplication. When you have duplicate accounts, contacts, and leads, you can easily end up in a situation with multiple account reps and sales development reps (SDRs) calling on the same account. This problem is more acute the larger your sales team is and if you use a round-robin type of system for distributing new leads. You can end up with multiple reps working on the same account for an extended period of time, which can result in poor customer experience, sub-optimal account engagement, and commission disputes. Having an SDR calling on an existing customer can also make your company appear clueless and sloppy.
Link Free Trial Users to Other Program Leads
Many software and consumer service products offer a free trial, so anybody can sign up on their own and kick the tires. This is a proven lead generation tactic that can be extremely productive for the right product and buyer persona. Whether the trial signup process is handled by your product or by your marketing automation platform, you can easily generate a massive number of duplicate leads from the trial program. If your trial has a time constraint, you likely have leads that sign up multiple times using different email addresses. Trial users are valuable “mid-funnel” leads whose conversion rate you need to maximize. In order to maximize conversion, you need to correlate trial users’ activities across all the records and programs, which means you must dedupe.
Automate Sales, Marketing, and Fulfillment Processes
There are many good reasons why you need to automate your sales, marketing, and fulfillment processes. There are plenty of great software solutions that can help you automate the workflow and transactions across different systems and departments. However, automating business processes when your database has a large number of duplicates can create more trouble and remediation work than the benefits and savings you gain from automating in the first place. Duplicate records can cause duplicate transactions and processes, creating confusing and repetitive touchpoints with the customer and propagating the duplicate data into your finance, order management, and help desk systems.
Save Money and Improve Performance of Marketing Automation Platforms
Most marketing automation platforms are priced according to the number of records in the database. A large number of duplicates directly costs you money in the license fee you pay. If your duplicate count is especially large, such as more than 60% of your database, this can also significantly degrade the performance of your marketing automation platform. Processes that used to be “real-time” may lag significantly, creating issues with the Service Level Agreement you have between the Marketing and Sales organizations.
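To see how much of your own database is made up of duplicates, you can estimate the share of records that are redundant copies. Here is a rough sketch in Python, assuming email is the matching key. That is an assumption for illustration only, and the `duplicate_share` function is hypothetical.

```python
from collections import Counter


def duplicate_share(emails: list[str]) -> float:
    """Return the fraction of records that are redundant copies.

    A group of n records sharing one normalized email contributes
    n - 1 redundant records.
    """
    if not emails:
        return 0.0
    counts = Counter(e.strip().lower() for e in emails)
    redundant = sum(n - 1 for n in counts.values())
    return redundant / len(emails)


# Made-up sample: three variants of one address plus one unique address
sample = ["a@example.com", "A@example.com", "b@example.com", "a@example.com "]
share = duplicate_share(sample)  # 2 of the 4 records are redundant copies
```

Since pricing is tied to record count, this share is a first approximation of how much of your license fee is going to duplicates.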
Two Types of Dupes
There are two types of data duplication problems: duplicate records and duplicate data fields. Duplicate records occur when you have more than one record for the same contact or account. Duplicate data fields occur when you have multiple fields holding the same kind of data from different sources within the same record, such as multiple job titles, phone numbers, and company sizes. This blog series is about the duplicate record problem. The duplicate data field problem is every bit as complex and is becoming more common. We will cover the duplicate data field challenge in the future. In the meantime, if you wish to read more about the duplicate field challenge, see this article we wrote for MarketingProfs:
We hope you enjoy this new series on data deduplication and we would love to hear your feedback and war stories.