So in my day job I was asked to investigate why some data was not matching up. After a bit of investigation I found a table that had 90-95% of it’s records duplicated, so I needed a way of removing them quickly. That’s where the following code snippet came in handy:
;WITH Duplicates AS (
ROW_NUMBER() OVER(PARTITION BY FirstCol, SecondCol ORDER BY DateValue) AS RN
WHERE RN > 1
It’s a simple CTE that assigns a row number to each record. We can then issue the DELETE command to any row number over 1. If you have a DATETIME column, you could keep the latest value if you wanted.