The Deduper application requires a data source to be able to dedupe data. Typically this is a query that outputs the data you want to dedupe. In this short guide, we will be walking you through the process of creating a simple query to find duplicates.
Remember this query can be any query you want to process for duplicates. For this example, we will be deduping our entire database, but you can choose which data to dedupe depending on how you set up your data source.
Step 1: Login into your Social Good Software account and click on the deduper application.
Once you have clicked on the deduper application click on the ODATA Links navigation item on the top right hand side menu.
Then you will see the option to add an ODATA link. Click on the plus option to create a new link with Altru.
You will get a dialog asking you for an ODATA link since we don’t yet have the link. We will click on the external icon to open the Altru information library directly.
Step 2: Create a De-duping query
Clicking on the link will open a new tab with the information library in Altru. Click on the “Add an ad-hoc query” option on the top left-hand side navigation.
This will bring up the option to create a query. We are going to be creating a Constituent query to dedupe constituents.
Note: If you want to dedupe other data types select the right query type, for example, sales query.
Now it’s time to create the query based on our query restrictions.
Let’s review each restriction one at a time.
The type is equal to individual – We only want to dedupe individual records at this time. If you want to dedupe other types such as organizations create a separate query and process those independently from this query.
Date added is before DATE or is blank – Altru has a restriction to only be able to output 15,000 rows at a time with ODATA links. Because of this restriction, we have to create multiple queries. You want to create a query that outputs less than 15,000 rows. In our example, it’s any record created before 1/1/2002. Keep in mind this could be different for your organization.
Let’s now setup the output display fields
Let’s review each output field one at a time.
Lookup ID – Quick handy way to view if the record is a duplicate by just seeing the ID.
Email Address – Easily see if the email can provide a contextual action
First name – View the first name
Last name – View the last name
Address – View the address
Constituent record – We need this to be able to merge records together
WARNING: The Deduper only works with one type of column. Meaning you can’t have two columns that are the same such as Date changed, Lookup ID, etc.
Now click on the Preview Tab on the top navigation.
You will want to make sure the total result is less than 15,000 rows.
Don’t worry about not having all the records in this query, we will come back and create more queries for the rest of the records. Once you are happy with the output click on the “Set save options”
Make sure you check the “Suppress duplicate rows” checkbox. Don’t worry if you see multiple rows the Deduper application knows how to deal with duplicates.
Double check the output of the query is less than 15,000 rows.
If you see more than 15,000 rows go back and adjust the date to output less than 15,000 rows.
Once you have verified you have less than 15,000 rows then name and save your query.
Name: Duplicate Constituent Records Query (BOT – 2002)
Description: Query to output records created from the beginning of time to 01/01/2002
Step 3: Duplicate Query with revised dates
Once you have your query created, duplicate the query and adjust the dates for the next 15,000 rows in your database by clicking on the copy button.
This will bring up a screen identical to the creation query screen. You will have to change the date range selection and rename the query to reflect the new dates. Select the date restriction and use the Between operator to select two date ranges.
You might be able to select more years depending on the output. For example, if your output is only a few hundred rows you can select a wider range of dates. If you are getting too many rows, you might have to restrict the date range between months and years.
Click on the “Set save options” and save your query by changing the name
Name: Duplicate Constituent Records Query (2002 – 2003)
Description: Query to output records created from 01/01/2002 to 01/01/2003
At this point, you will want to repeat the process in step three until you have done your entire database. This will equal a group of queries. Here is how the process works.
You will end up with a group of queries. You now have the ODATA queries required for the Deduper application. You can now at this point follow the guide on “How to set up ODATA links with Deduper”.