Applying Data Retention to a SQL Table


There are many reasons why you’d need to delete old data from a table within a database, some of which include;

# GDPR (adhering to data retention policies)
Ever increasing tables becoming tougher to manage (e.g. a replication refresh is taking 2 days)
# SQL Server Express limitations (10GB max database size)
# The table is storing application errors & logs and you’ve been told it’s there to stay

Maybe it’s a one-off job for you, and the deletions can be done within a days work. Or, you might find yourself dealing with very large tables, deleting daily during a small off-peak window over the course of however long… this is a task that can come along with many variations, and if asked the question, how would you do it yourself? Well, “it depends!”.

I’ve run through a simple version of the task below, which also summarises a bunch of my other recent tech posts too.  All links below are internal.


Deleting Data in SQL Server

1. Get a hold of a test database and restore it on your SQL Server.

2. Look for one of the larger tables and pick your fancy.

3. Get row counts by date. For this example, I’m storing the row count information into a temp table. This deletion won’t take long so my session will still be active.

4. Perform delete in batches.

5. Get new row counts and compare against previous results.

Plus a few additional checks to verify.

Looking good – All data on orders beyond 7 years (1 January 2012 00:00) has been deleted.

When the initial task is done and all tables are meeting retention periods, we’d then need to consider things like scheduled SQL Agent jobs to maintain the tables every month.



Comments

Leave a Reply

Your email address will not be published. Required fields are marked *