In SQL Server 2008 I have some million rows of data which need to be deleted. They are scattered across a handful of tables. Deletion takes up to 20 seconds, which I think is way too slow! The data to be deleted is identified by a timestamp column. Here is what I have done so far in order to optimize:
- Using isolation level read uncommitted. I don't care about transactions. If we fail, the user will issue the delete operation again. And new data is guaranteed not to have the timestamp we are deleting.
- Deleting leaf tables before parent tables.
- The timestamp column is part of the PK clustered index; in fact it is in the first position of the PK/index.
- Each table is emptied using a loop which deletes the top 200,000 entries, in order to reduce the transaction log overhead.
- Neither I/O nor CPU is maxed out on the server.
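For reference, the batched delete loop I am running looks roughly like this (a sketch only; dbo.MyTable, SnapshotTs, and @ts are placeholder names, not the real schema):

```sql
-- Sketch of the batched delete loop described above.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

DECLARE @ts DATETIME = '2013-01-01';  -- placeholder: the timestamp being purged
DECLARE @rows INT = 1;

WHILE @rows > 0
BEGIN
    -- Delete in chunks of 200,000 rows to keep each transaction small.
    DELETE TOP (200000) FROM dbo.MyTable
    WHERE SnapshotTs = @ts;

    SET @rows = @@ROWCOUNT;  -- stop once nothing is left to delete
END
```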
What have I overlooked?
Also, I am in doubt about the effect of moving the timestamp column to the first position in the PK. After doing so, must I reorganize the tables, or is SQL Server smart enough to do this itself? My understanding of a clustered index is that since it defines the physical layout of the rows, changing it forces the data to be reorganized. But we have had no complaints from the customer that the operation of changing the clustered index took a long time to perform.
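The way I changed the key order was by dropping and recreating the PK constraint, something along these lines (again with placeholder names; as far as I understand, recreating a clustered index physically rewrites the table as part of the CREATE, so no separate reorganize step should be needed, but please correct me if that is wrong):

```sql
-- Sketch: changing the key order of a clustered PK requires dropping
-- and recreating it; the recreate rewrites the rows in the new order.
ALTER TABLE dbo.MyTable DROP CONSTRAINT PK_MyTable;

ALTER TABLE dbo.MyTable
ADD CONSTRAINT PK_MyTable PRIMARY KEY CLUSTERED (SnapshotTs, Id);
```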