So in my day job I was asked to investigate why some data was not matching up. After a bit of investigation I found a table that had 90-95% of it’s records duplicated, so I needed a way of removing them quickly. That’s where the following code snippet came in handy:
1 2 3 4 5 6 7 8 9 10 |
;WITH Duplicates AS ( SELECT FirstCol, SecondCol, ROW_NUMBER() OVER(PARTITION BY FirstCol, SecondCol ORDER BY DateValue) AS RN FROM MyTable ) DELETE Duplicates WHERE RN > 1 |
It’s a simple CTE that assigns a row number to each record. We can then issue the DELETE command to any row number over 1. If you have a DATETIME column, you could keep the latest value if you wanted.
Leave a Comment