Backup Verification and Restore Testing for MSPs
How to verify that backups actually work and prove it with documented restore tests. Covers automated verification, manual testing cadence, and what to document for auditors.
Workflow guide · Updated Feb 2026
Contents
- 1. Verification Is Not Testing
- 2. What Automated Verification Catches
- 3. Define your restore testing cadence
- 4. Perform the restore in an isolated environment
- 5. Verify application functionality
- 6. Document actual RPO and RTO
- 7. Store results and report to the client
- 8. Screenshot verification is not enough
- 9. How long does a restore test take?
- 10. Should MSPs charge clients for restore tests?
- 11. What if a restore test fails?
Verification Is Not Testing
What Automated Verification Catches
Define your restore testing cadence
Test frequency should match system criticality. Critical servers (domain controllers, database servers, file servers with active data) should be tested quarterly. Non-critical servers and workstations can be tested semi-annually. SaaS data restore tests (recovering a mailbox or SharePoint site) should run quarterly. Put the schedule in your PSA as recurring tickets with assigned owners. If restore tests don't have a scheduled date and an owner, they won't happen.
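As a rough illustration of the cadence above, the sketch below computes when each system's next restore test is due and flags overdue ones. The category names, the `systems` record layout, and the helper functions are hypothetical; in practice the schedule would live in your PSA as recurring tickets rather than in a script.

```python
from datetime import date

# Months between tests per category, taken from the cadence described above.
# The category keys themselves are illustrative names, not a standard.
TEST_INTERVAL_MONTHS = {
    "critical": 3,      # domain controllers, database servers, active file servers
    "non_critical": 6,  # other servers and workstations
    "saas": 3,          # mailbox or SharePoint site recovery
}


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months, clamping the day to 28
    so the result is valid in every month."""
    total = d.month - 1 + months
    year = d.year + total // 12
    month = total % 12 + 1
    return date(year, month, min(d.day, 28))


def next_test_due(last_tested: date, category: str) -> date:
    """Return the date the next restore test is due for a system."""
    return add_months(last_tested, TEST_INTERVAL_MONTHS[category])


def overdue(systems: list, today: date) -> list:
    """Return the systems whose next test date has already passed."""
    return [
        s for s in systems
        if next_test_due(s["last_tested"], s["category"]) < today
    ]


# Hypothetical inventory: every entry has an owner, as the text recommends.
systems = [
    {"name": "DC01", "category": "critical", "owner": "alice",
     "last_tested": date(2025, 10, 1)},
    {"name": "WS-17", "category": "non_critical", "owner": "bob",
     "last_tested": date(2025, 12, 1)},
]

for s in overdue(systems, today=date(2026, 2, 15)):
    print(f"{s['name']} is overdue; owner: {s['owner']}")
```

The day is clamped to 28 only to keep the date arithmetic simple; a real scheduler would likely align tests to business days instead.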
Perform the restore in an isolated environment
Never restore test data into the production environment. Use a sandbox VM, a spare physical machine, or a cloud-based test environment. The goal is to validate recovery without any risk of overwriting production data. For BDR appliances, use the local virtualization feature to spin up the backup as a VM on the appliance itself. For cloud-based backups, restore to a temporary cloud VM.
Verify application functionality
Booting to a login screen is not a successful restore test. Log in. Open the critical applications. Verify that the database responds to queries. Confirm that file shares are accessible. Check that services are running. If the client has a specific LOB application, open it and confirm it loads data. Document what you tested and what worked. If something didn't work, document that too and include it in the remediation plan.
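One way to make "document what you tested and what worked" repeatable is a small checklist runner that records a pass or fail for each check. The sketch below is a minimal example under that assumption: the check names and the placeholder lambdas are hypothetical, and real checks would query the restored database, list the file share, or inspect service status in the sandbox.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class CheckResult:
    name: str
    passed: bool
    note: str = ""


def run_checks(checks: List[Tuple[str, Callable[[], bool]]]) -> List[CheckResult]:
    """Run each named check and record the outcome.

    A check that raises is recorded as a failure with the error text,
    so one broken check does not hide the results of the others.
    """
    results = []
    for name, fn in checks:
        try:
            ok = bool(fn())
            note = "" if ok else "check returned false"
            results.append(CheckResult(name, ok, note))
        except Exception as exc:
            results.append(CheckResult(name, False, f"{type(exc).__name__}: {exc}"))
    return results


def failing_check() -> bool:
    # Stand-in for a check that cannot reach the restored service.
    raise ConnectionError("LOB application did not respond")


# Placeholder checks; replace the lambdas with real probes of the restored system.
checks = [
    ("interactive login works", lambda: True),
    ("database answers a test query", lambda: True),
    ("file share is accessible", lambda: False),
    ("LOB application loads data", failing_check),
]

for r in run_checks(checks):
    status = "PASS" if r.passed else "FAIL"
    print(f"{status}  {r.name}  {r.note}")
```

Failed entries, with their notes, are what would feed the remediation plan.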
Document actual RPO and RTO
Record two numbers: the age of the data at the time of restore (actual RPO) and the elapsed time from "restore initiated" to "system usable" (actual RTO). Compare these to the targets defined in the client's service agreement. If actual RTO exceeds the target, investigate: is the backup medium too slow? Is the restore process adding unexpected steps? Is the target unrealistic for the current infrastructure? Adjust either the process or the target.
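The two measurements reduce to simple timestamp arithmetic. The following sketch assumes you note three timestamps during the test (the time of the restored data, restore initiated, system usable); the function and parameter names are illustrative, and the targets would come from the client's service agreement.

```python
from datetime import datetime, timedelta


def measure_restore(data_timestamp: datetime,
                    restore_initiated: datetime,
                    system_usable: datetime,
                    rpo_target: timedelta,
                    rto_target: timedelta) -> dict:
    """Compute actual RPO and RTO for one restore test and compare to targets."""
    # Actual RPO: how old the restored data was when the restore began.
    actual_rpo = restore_initiated - data_timestamp
    # Actual RTO: elapsed time from "restore initiated" to "system usable".
    actual_rto = system_usable - restore_initiated
    return {
        "actual_rpo": actual_rpo,
        "actual_rto": actual_rto,
        "rpo_met": actual_rpo <= rpo_target,
        "rto_met": actual_rto <= rto_target,
    }


# Example values are made up for illustration.
result = measure_restore(
    data_timestamp=datetime(2026, 2, 10, 2, 0),
    restore_initiated=datetime(2026, 2, 10, 9, 30),
    system_usable=datetime(2026, 2, 10, 10, 15),
    rpo_target=timedelta(hours=24),
    rto_target=timedelta(minutes=30),
)
print(result["actual_rpo"], result["rpo_met"])
print(result["actual_rto"], result["rto_met"])
```

In this made-up example the data is 7.5 hours old, within a 24-hour RPO target, but the 45-minute restore misses a 30-minute RTO target, which is the case that calls for investigation.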
Store results and report to the client
Save restore test results in the client's documentation alongside their backup configuration. Record the date, the system tested, what was verified, the actual RPO and RTO, and any issues found, and present the results at the next quarterly business review. This is also the evidence that cyber insurance providers and auditors request, so keeping documented, dated restore tests on file strengthens the client's compliance posture.
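A consistent record format makes results easier to file and to hand to an auditor. The sketch below shows one possible shape for such a record, serialized to JSON; the field names and the `RestoreTestRecord` class are assumptions for illustration, not a format required by any insurer, auditor, or documentation tool.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class RestoreTestRecord:
    test_date: str            # ISO date of the test, e.g. "2026-02-10"
    system: str               # the system that was restored
    verified: List[str]       # what was checked after the restore
    actual_rpo_minutes: int   # age of the data at restore time
    actual_rto_minutes: int   # restore initiated to system usable
    issues: List[str] = field(default_factory=list)


def to_json(record: RestoreTestRecord) -> str:
    """Serialize a record so it can be stored with the client's documentation."""
    return json.dumps(asdict(record), indent=2)


# Example values are made up for illustration.
record = RestoreTestRecord(
    test_date="2026-02-10",
    system="SQL01",
    verified=["login", "database query", "file share access"],
    actual_rpo_minutes=450,
    actual_rto_minutes=45,
    issues=["RTO exceeded 30-minute target"],
)
print(to_json(record))
```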
Screenshot verification is not enough
A screenshot showing a Windows login screen proves the OS can boot. It does not prove the application data is intact, the database is consistent, or the system can be recovered within the client's RTO. Screenshot verification is a useful automated check, not a substitute for a real restore test.
How long does a restore test take?
Plan 30 minutes to 2 hours per system, depending on the backup size and recovery method. A local BDR appliance restore is faster (15 to 30 minutes to boot). A cloud restore depends on download speed. Documentation and reporting take another 15 to 30 minutes. Budget accordingly.
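For cloud restores, a back-of-the-envelope estimate of download time helps with budgeting. The sketch below assumes decimal units (1 GB = 8,000 megabits) and a hypothetical `efficiency` factor for protocol overhead and link contention; the 0.8 default is a guess, not a measured value, and the estimate covers data transfer only, not boot or verification time.

```python
def transfer_minutes(backup_size_gb: float,
                     download_mbps: float,
                     efficiency: float = 0.8) -> float:
    """Estimate minutes needed to download a backup of the given size.

    efficiency is the assumed fraction of nominal bandwidth actually achieved.
    """
    if download_mbps <= 0:
        raise ValueError("download_mbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    megabits = backup_size_gb * 8000  # decimal GB to megabits
    seconds = megabits / (download_mbps * efficiency)
    return seconds / 60


# Illustrative numbers: a 100 GB restore over a 500 Mbps link.
minutes = transfer_minutes(100, 500)
print(f"Estimated download time: {minutes:.0f} minutes")
```

Add the 15 to 30 minutes of documentation and reporting on top of this figure when scheduling the test.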
Should MSPs charge clients for restore tests?
Include a defined number of restore tests per year in the service agreement (typically 4 for critical systems, 2 for non-critical). Tests beyond the included count are billable. This ensures testing happens while preventing scope creep.
What if a restore test fails?
A failed restore test is a finding, not a crisis (assuming you still have the backup data). Document the failure, diagnose the root cause, remediate, and retest. Common causes: corrupted backup chain, missing drivers for the restore target, expired credentials in the backup configuration, or application dependencies that weren't backed up.