Restore tests - waste of time or useful procedure?

Scroll

An overlooked procedure that I often experience is the restore test. Do you know how long it takes to restore the critical systems, who does what, is it worth doing the test, and does it really take only 30 minutes as you expect? Read more and become smarter? 😊

As a SQL Consultant at Unit IT, I am in daily contact with our customers, regarding their SQL Servers and their backups. I find that customers are well in control of the backup and in general full, diff, and log backups are regularly taken as needed. However, my experience with database restore tests is unfortunately a bit different - the focus on restoring the large or important databases is simply not there. With us in it-Craft, it seems very natural to do recovery tests, and I would like to provide the following advice and considerations when making a restore test:

  1. Verify that the backup can be restored. Unfortunately, we often see that backup files cannot be used even if a backup has been made. See below how we want to verify that the backup can be used.
  2. Verify the restore procedure. If you need to make a restore due to a crash, it is nice and comfortable to know that you have a tested and proven procedure to make the restore off the database in order to get back to normal operation as soon as possible.
  3. Time the restore time off the most important databases. Does the employee know what to do and is the organization okay whit the restore taking xx time to get these databases back up and running? What is the procedure for this time while restore is running?

Relating to section 3, I often experience that customers expect it to take only 30 minutes to restore a database that contains critical data, but in reality takes three times as long.

As you may feel, I am advocating to have restored tests, so everyone knows who does what when you need to make a restore. Below is my checklist for how a restore test can be made:

  1. Restore database to its full size
  2. Run a DBCC CheckDB on the restored database
  3. Test that you can read data. Try to make SELECT TOP 1000 * against the most important tables and the largest tables in the database to see that data can be read.
  4. Start a test system against the restored database and verify that the test system starts up properly and works as intended.

As an example, here's a script to make the first 3 steps in this recovery test:

USE master
GO
RESTORE DATABASE [AdventureWorks-restore] FROM DISK = 'B:\SQLBackup\AdventureWorks2012.bak' 
WITH MOVE 'AdventureWorks2012' TO 'D:\SQLData\AdventureWorks-restore.mdf',
MOVE 'AdventureWorks2012_log' TO 'L:\SQLLog\AdventureWorks-restore_log.ldf'
GO
DBCC CHECKDB('AdventureWorks-restore') WITH NO_INFOMSGS
GO
USE [AdventureWorks-restore]
SELECT TOP 1000 * FROM Sales.Customer
SELECT TOP 1000 * FROM Sales.SalesOrderDetail
GO

Always remember to use a test environment, NOT the production environment for recovery tests.

Once the systems have fallen apart and the company's critical systems have to be restored, it's extremely important to start a proven routine, where everyone knows what to do and not at least how long it takes, so it's just an easy report where in the process you are when managers ask.

I hope you could use the above advice and tips. If you would like help to have regular recovery tests, please feel free to contact me.

Ring til os