Batch Duplication Detection Discussion and Possible Solution

I was visiting a local customer this week and learned they are about to begin a de-duplication process for their contacts within CRM.  I thought we’d discuss one possible solution for cleaning up your contact list.

Finding Duplicates

I will not really be discussing how to find duplicate records since it will depend entirely on the tools and personnel you have available.  Regardless of method, the end result will be a list of contacts that must be manually reviewed to verify that they really are duplicates.

Did you say manual review?

Unfortunately, yes I did.  I’ve been in the information technology business for a long time ( 27 years in January ) and I have never seen any type of automated decision process that correctly identified what to keep and what to throw away.  The only reliable and proven method consists of a keyboard and a chair with someone in between.

Helping CRM help you

At this point, we are still working as a developer creating this list of duplicates.  But, we’ll have to involve the user shortly, so let’s begin that process.

Marketing Lists to the rescue

You have got to be kidding me, you are probably saying at this point.  Nope, I think this will work.  Here’s what the developer needs to do:

  1. Create a marketing list ( type of Contact ) to hold the duplicate contacts
  2. As you are running your duplicate verification process, add any duplicates found as Marketing List Members.

Additional information: listmember class, list class.
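
As a rough illustration of step 2, here is a minimal Python sketch of the "find, then queue for review" flow.  It is not the CRM SDK: the matching rule ( a normalized email address ), the dictionary shape of a contact, and the add_member callback are all assumptions made for the example.  In a real implementation, add_member would be whatever call your environment uses to create a Marketing List Member.

```python
from collections import defaultdict


def normalize_email(email):
    """Lower-case and trim an email address so trivial variants compare equal."""
    return (email or "").strip().lower()


def find_duplicate_ids(contacts):
    """Return the ids of every contact whose normalized email is shared with another contact."""
    groups = defaultdict(list)
    for contact in contacts:
        key = normalize_email(contact.get("email"))
        if key:  # contacts with no email cannot be matched by this rule
            groups[key].append(contact["id"])
    return [cid for ids in groups.values() if len(ids) > 1 for cid in ids]


def queue_for_review(contacts, add_member):
    """Call add_member(contact_id) once for each suspected duplicate.

    add_member is a hypothetical callback standing in for the call that
    adds a contact to the marketing list.
    """
    duplicate_ids = find_duplicate_ids(contacts)
    for contact_id in duplicate_ids:
        add_member(contact_id)
    return duplicate_ids
```

The point of the callback is that the matching logic stays independent of CRM, so you can swap in a better matching rule without touching the part that populates the list.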

These two steps will create an interface within CRM that will allow the user to manage the de-duplication process.  Here is what is shown in the CRM user interface:



Performing the cleanup process: Phase I – Advanced Find

The cleanup process is going to consist of two individual phases because the Marketing List Member view does not give us access to the CRM Merge feature, which we will need later on.

So, the first step is actually to create a custom view using Advanced Find.  Follow these steps:

  1. Select Contacts from anywhere in the SiteMap ( left-hand navigation area ).
  2. Change your view to be Active Contacts
  3. Click the Advanced Find button
  4. Click the Show Details button on the Advanced Find query definition toolbar.
  5. Click the Select link and scroll down to the Related section of the drop-down list.
  6. Select Marketing Lists
  7. Select the Name field
  8. Select Equals as the operator
  9. Enter the name of the Marketing List created by your developer.  Your Advanced Find should look like this:


Click the Find button to execute your query.
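
For developers who would rather run the same query in code, the Advanced Find above corresponds roughly to a FetchXML query that joins Contact to Marketing List through the list member records.  The Python sketch below only builds the query text.  The attribute names used ( entityid, listid, listname, statecode ) are my assumption of the standard schema names, so verify them against your own system before relying on it.

```python
import xml.etree.ElementTree as ET


def build_fetch_xml(list_name):
    """Build a FetchXML string for active contacts that are members of list_name."""
    fetch = ET.Element("fetch")
    entity = ET.SubElement(fetch, "entity", name="contact")
    ET.SubElement(entity, "attribute", name="fullname")

    # Active Contacts only (statecode 0 is assumed to mean Active).
    active = ET.SubElement(entity, "filter")
    ET.SubElement(active, "condition", attribute="statecode", operator="eq", value="0")

    # contact -> listmember -> list; "from" is a Python keyword, so use a dict.
    member = ET.SubElement(
        entity, "link-entity",
        {"name": "listmember", "from": "entityid", "to": "contactid"},
    )
    mlist = ET.SubElement(
        member, "link-entity",
        {"name": "list", "from": "listid", "to": "listid"},
    )
    by_name = ET.SubElement(mlist, "filter")
    ET.SubElement(by_name, "condition", attribute="listname", operator="eq", value=list_name)

    return ET.tostring(fetch, encoding="unicode")
```

Building the query with an XML library rather than string concatenation means a list name containing quotes or ampersands is escaped for you.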


Performing the cleanup process: Phase II – Merge Records

Once the results are displayed, you can begin the de-duplication process using the CRM Merge feature.  In the figure below, you’ll notice that I have two duplicate records selected.  The Merge button is outlined by a red box:


Click the Merge button and the Merge Records dialog will be displayed:


If you’ve never used the Merge Records feature, you’ll find it to be most helpful. Here’s how it works:

  1. Select the Master Record using the radio button beside the Contact name.
    This is the record that will remain active when the Merge Records process has completed.  The other record will be deactivated.
  2. Using the attribute list, select the data elements that will be saved to the Master Record.
    Again, using the radio buttons per attribute, select which value will be copied to ( or remain in ) the Master Record.
  3. Click OK and the records will be merged.

This process works fine if you have only two records, but if you have more than two, you’ll need to repeat this process multiple times since Merge Records only works with two records at a time.
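
To make the repetition concrete: a group of N duplicates needs N - 1 merges.  The small Python sketch below plans those passes, assuming you keep the same Master Record for every pass ( my suggestion, not a CRM requirement ).  The record ids are made up for the example.

```python
def plan_merges(master_id, duplicate_ids):
    """Return (master, subordinate) pairs, one pair per pass through Merge Records."""
    return [(master_id, dup) for dup in duplicate_ids if dup != master_id]


# Three duplicates of the same person, keeping "contact-1" as the master.
for master, subordinate in plan_merges("contact-1", ["contact-1", "contact-2", "contact-3"]):
    print(f"Merge {subordinate} into {master}")
```

Keeping one master throughout means the record that stays active never changes between passes, which makes it easier to tell where you left off if the cleanup is interrupted.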


I hope I showed you something that gets you thinking about your data and its consistency.  Operations like the above are quite common after you have imported or migrated data from multiple sources.


Shameless Plug

It just so happens that I offer a commercial solution for locating duplicates, should the above not be an option for you.  It is specifically designed to locate and display duplicate email addresses found across the Lead, Contact, and Account entities.  While it will not entirely replace the above process, it could go a long way toward helping with your data cleanup and consistency.

Check it out here.
