How to fine non-matching rows between two tables
Duplicate values are a pretty common problem I run across in databases. There are several precautions you can take to prevent dups in your DB, like using constraints, primary keys, etc. Once duplicates make their way into a table, they can cause all kinds of unexpected/undesirable results. This post is less about prevention though, and more about finding duplicates once they've made their way in. For example purposes I've built a table called "DupExample" that has two columns "Col1" and "Col2" with the following data in it:
If we were wanting to find duplicate values in "Col1" we could do so with the following query.
SELECT d.Col1, COUNT(*) 'NumOfDups' FROM DupExample d GROUP BY d.Col1 HAVING COUNT(d.Col1) > 1
The results of this query will show you the values that are duplicated in "Col1", and how many times they're duplicated, as seen below:
Sometimes the uniqueness of a row in a table is defined by more than one column. In this circumstance we'll need to include the other columns that make up the unique key in our GROUP BY and HAVING clause. If we wanted to find rows that have duplicate values across both "Col1" and "Col2" we could modify our query to the following:
SELECT d.Col1, d.Col2, COUNT(*) 'NumOfDups' FROM DupExample d GROUP BY d.Col1, d.Col2 HAVING COUNT(d.Col1) > 1 AND COUNT(d.Col2) > 1
The results of this query would show us that there is actually a triplicate value in our test data, as seen below.
Col1 Col2 NumOfDups
----------- ------------- -----------
4 d 3
Now that you've figured out which values have been duplicated, your next step should be trying to figure out how the dups are being created, and how you can prevent them in the future (aka fix the problem, not the symptom). Also, as a final word of caution, research your duplicates before you start deleting rows. Do other tables hold a foreign key that points to the duplicate row you're about to delete? Are there constraints in place that will prevent you from deleting a row if it is a FK in another table? (I hope so)
Well, that's a start on finding duplicates within a single table. Hope you found it useful.