Monday, September 20, 2010

SQL Server - Deleting Duplicate Records

An easy method for deleting the duplicate records from a table is by using the Common table expressions.

Let us create a table - State with columns StateID and StateName of data types Int and Varchar(20) respectively.

Create table State (StateID Int,StateName varchar(10))
Go
Insert into State
Select 1, 'XXXX'
Union All
Select 2, 'YYYY'
Union All
Select 1, 'XXXX'
Union All
Select 3, 'ZZZZ'
Union All
Select 2, 'YYYY'
Union All
Select 1, 'XXXX'
Union All
Select 3, 'ZZZZ'
Union All
Select 1, 'XXXX'
Union All
Select 2, 'YYYY'
Union All
Select 2, 'YYYY'
Union All
Select 1, 'XXXX'
Union All
Select 3, 'ZZZZ'

Now if we check the data in the State table, it will return 12 records with duplicate values as shown below

Select * from State 

image 

Now our intention is to delete the duplicate records so that only 3 distinct records must exist.

For this we can use the Common table expression as below.

With DistinctStates(StateID,StateName,RecCount) as
(Select StateId, StateName, row_number() over(Partition by StateId,StateName Order by StateId) RecCount from State)
Delete from DistinctStates where RecCount > 1

Once we execute the above script all the duplicate Records will be deleted from the table resulting 3 distinct records

Select * from State 

image

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.