A disaster recovery test should not begin with someone clicking Restore. It should begin with evidence.
Here is a common way these tests go sideways. The team books a window, restores a VM, watches it boot, and calls the test a success. A week later, management asks which restore point was used, whether it met the recovery objective, and whether the application actually worked. The team did not capture the answers, because everyone was focused on the restore itself.
A disaster recovery test is not only about whether a server boots. It is about proving what existed before the test, what you selected to recover, what changed during the test, and whether the result met the recovery objective.
This article is a practical checklist of the evidence to capture before you click Restore.
Key Takeaways
- Capture evidence before the restore begins, not after the test is over.
- A VM that boots is not proof that the application recovered. Validate the application and its dependencies.
- Confirm network isolation before you power on any recovered system.
- Document which restore point you selected, where it came from, and why.
- Verify credentials, encryption keys, and admin access before the test starts.
- Sanitize screenshots, or recreate them in a lab, before you share or publish them.
Why Capture Evidence Before a Disaster Recovery Test?
A baseline tells you what the environment looked like before you touched it. Without it, you cannot tell whether a problem you hit during the test was caused by the recovery or was already present.
Evidence proves the backup existed and was healthy before the test. If a job failed the night before and nobody noticed, you want to know that before you start, not after a failed restore.
Captured state supports audit and management review. When someone asks whether the recovery met the objective, you answer with records instead of memory.
A baseline makes rollback easier. If you need to undo a change, you already know what the original configuration was.
Evidence reduces confusion during troubleshooting. When three people are staring at a recovered system that will not start a service, the pre-test notes tell you what normal looked like.
Capturing the environment first also helps you avoid accidental production impact. Writing down the production IP addresses, hostnames, and DNS records before the test forces you to plan around them.
Finally, a clean set of evidence becomes a reusable test record. The next test starts from a documented baseline instead of a blank page.
Capture the Environment Inventory
Before the test, capture the current state of the environment you are protecting. Screenshot or export the following:
- Hypervisor or cloud platform dashboard, for a top-level view of the environment as it stood before the test.
- Cluster, host, or node summary.
- VM inventory for the protected workloads.
- VM configuration: vCPU count, memory, disk layout and sizes, and network adapters.
- Storage or datastore layout, including where the protected VMs live.
- Restore target capacity, if the test will write recovered systems to a datastore, volume, cluster, or cloud storage target.
- Network names, VLANs, subnets, port groups, bridges, or virtual switches.
- Critical application dependencies, so the recovery order is clear later.
- Current power state of each protected workload.
The labels differ by platform. On VMware vSphere you are looking at clusters, hosts, datastores, and port groups. On Hyper-V you have hosts, virtual switches, and VHDX files. On Proxmox VE you have nodes, storage, and Linux bridges. On Nutanix AHV you have a cluster, hosts, and AHV networks. For cloud workloads, capture the equivalent: instance configuration, attached storage, and the virtual network or subnet.
Capture the view that matches your platform. The goal is a clear record of what existed, not a product tutorial.



Capture the Backup Platform State
Now capture the state of the backup system itself. This is the evidence that your recovery source was healthy before you started.
- Backup server or console dashboard.
- The list of protected workloads.
- Backup job configuration for the workloads in scope.
- The last successful run for each job.
- Recent job warnings and errors, so you know the history.
- The restore point list for each workload you plan to recover.
- Repository capacity and health, so you know the backup source is usable and not already under capacity or retention pressure.
- Immutability status, if your repository uses it.
- Encryption status, if backups are encrypted.
- Backup copy or offsite copy status, so you know which copies exist.
- A description of which credentials or service accounts the backup system uses, without exposing the passwords.
Veeam is a common example here, but the same evidence applies to any backup platform. Whatever product you use, you want proof that the job ran, the restore points exist, the repository is healthy, and any immutability or encryption is in the expected state.
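If your backup platform can export job history, a short script can flag stale or failed jobs before the test window opens. The following is a minimal sketch, not tied to any product: the `JobRecord` fields, the result labels, and the 26-hour threshold are all assumptions for illustration, and the sample records are invented.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class JobRecord:
    """One exported job row. Field names and result labels are hypothetical."""
    workload: str
    last_result: str                  # e.g. "Success", "Warning", "Failed"
    last_success: Optional[datetime]  # None if no successful run is on record


def jobs_needing_attention(jobs, now, max_age):
    """Return (workload, reason) pairs for jobs to review before the test."""
    flagged = []
    for job in jobs:
        if job.last_success is None:
            flagged.append((job.workload, "no successful run recorded"))
        elif now - job.last_success > max_age:
            flagged.append((job.workload, "last successful run is older than the allowed age"))
        elif job.last_result != "Success":
            flagged.append((job.workload, f"most recent run ended with {job.last_result}"))
    return flagged


# Invented sample data for illustration only.
utc = timezone.utc
now = datetime(2024, 5, 15, 8, 0, tzinfo=utc)
jobs = [
    JobRecord("fileserver01", "Success", datetime(2024, 5, 15, 1, 0, tzinfo=utc)),
    JobRecord("sql01", "Failed", datetime(2024, 5, 13, 1, 0, tzinfo=utc)),
    JobRecord("app01", "Warning", datetime(2024, 5, 15, 2, 0, tzinfo=utc)),
    JobRecord("web01", "Failed", None),
]
flagged = jobs_needing_attention(jobs, now, max_age=timedelta(hours=26))
for workload, reason in flagged:
    print(f"{workload}: {reason}")
```

The output of a check like this belongs in the evidence folder next to the console screenshots. It does not replace them.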





Document the Restore Point Selection
The restore point you choose decides what you actually recover. Record it precisely:
- Restore point date and time.
- The backup job that produced it.
- The workload name.
- Application consistency status, if known, such as application-consistent or crash-consistent.
- Where the restore point lives: local, offsite, immutable, a backup copy, replicated, or archived.
- Your RPO target compared with the age of the selected restore point.
After the test, you should be able to answer one question without hesitation: did we recover from the restore point we intended to use? If you cannot answer that, the test result is hard to trust.
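The comparison between the RPO target and the age of the restore point is simple arithmetic, and it is worth recording the numbers rather than a verbal "it was recent enough". A minimal sketch, assuming you measure age from the simulated outage time to the restore point timestamp. The timestamps and the 24-hour RPO below are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def restore_point_age(restore_point_time: datetime, reference_time: datetime) -> timedelta:
    """Return how old the restore point is at the reference time."""
    return reference_time - restore_point_time


def meets_rpo(restore_point_time: datetime, reference_time: datetime, rpo: timedelta) -> bool:
    """True when the restore point is not in the future and is no older than the RPO."""
    age = restore_point_age(restore_point_time, reference_time)
    return timedelta(0) <= age <= rpo


# Hypothetical values for illustration only.
restore_point = datetime(2024, 5, 14, 22, 0, tzinfo=timezone.utc)
simulated_outage = datetime(2024, 5, 15, 9, 30, tzinfo=timezone.utc)
rpo_target = timedelta(hours=24)

age = restore_point_age(restore_point, simulated_outage)
within_rpo = meets_rpo(restore_point, simulated_outage, rpo_target)
print(f"Restore point age: {age}, RPO target: {rpo_target}, within RPO: {within_rpo}")
```

Using timezone-aware timestamps avoids a common mistake where the backup console shows local time and the test log records UTC.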


Document Recovery Objectives and Success Criteria
Write down what success looks like before the test, not after.
- RTO: how long the recovery is allowed to take.
- RPO: how much data loss is acceptable.
- Systems in scope.
- Systems out of scope.
- Expected recovery order.
- Expected validation steps.
- Who signs off on success.
- What counts as a failed test.
Example success criteria
- The VM boots.
- Services start.
- Application login works.
- The data timestamp is acceptable against the RPO.
- DNS resolves correctly inside the test environment.
- Users, or designated test users, can reach the restored service.
- No production network conflict occurs.
A disaster recovery test is not successful just because a VM powers on. A booted VM with a dead application, a stale data set, or a broken dependency is a failed test that looks like a passing one. Decide in advance which of the criteria above must be met.
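One way to keep that decision honest is to write the criteria down as data before the test and evaluate the results against them afterward. This is a minimal sketch under stated assumptions: the `Criterion` type, the choice of which criteria are required, and the sample results are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str
    required: bool  # decided before the test, not after


def evaluate(criteria: list[Criterion], results: dict[str, bool]) -> tuple[bool, list[str]]:
    """Return (passed, failed_required). A missing result counts as a failure."""
    failed = [c.name for c in criteria if c.required and not results.get(c.name, False)]
    return (not failed, failed)


# Hypothetical criteria and results for illustration only.
criteria = [
    Criterion("VM boots", required=True),
    Criterion("Services start", required=True),
    Criterion("Application login works", required=True),
    Criterion("Data timestamp within RPO", required=True),
    Criterion("Test users can reach the service", required=False),
]
results = {
    "VM boots": True,
    "Services start": True,
    "Application login works": False,
    "Data timestamp within RPO": True,
}

passed, failed_required = evaluate(criteria, results)
print("PASS" if passed else f"FAIL: {', '.join(failed_required)}")
```

In this example the VM booted and the services started, but the test still fails because a required criterion was not met.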
Capture Application and Infrastructure Dependencies
Many disaster recovery tests fail because the restored workload depends on something that is not in the test. The server itself is fine. What it needs to talk to is missing.
Map the dependencies before the test:
- Active Directory
- DNS
- DHCP
- NTP
- Certificate services
- File shares
- SQL or other database servers
- Application servers
- Web servers
- Load balancers
- License servers
- SMTP relay
- External APIs or integrations
- Firewall and NAT rules
A restored application server may boot successfully and still fail if it cannot reach DNS, authenticate against Active Directory, connect to its database server, or check out a license.
Time matters too. In Active Directory environments, Kerberos authentication depends on synchronized clocks, and the default tolerance is five minutes. If the test network cannot reach a valid time source and the recovered systems drift beyond that tolerance, authentication can fail.
Decide how each dependency will be satisfied in the test: recovered alongside the workload, stubbed with a test service, or provided by an isolated copy.
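Once the dependencies are mapped, the recovery order follows from the map. The sketch below derives an order with Python's standard `graphlib` module (Python 3.9 or later). The host names and the dependency map are hypothetical, and a real map would also cover the shared services listed above.

```python
from graphlib import TopologicalSorter

# Hypothetical map: each workload lists what must be available before it starts.
dependencies = {
    "dc01": set(),              # directory and DNS in this example
    "sql01": {"dc01"},
    "app01": {"dc01", "sql01"},
    "web01": {"app01"},
}


def recovery_order(deps: dict[str, set[str]]) -> list[str]:
    """Return an order in which every workload comes after its dependencies.

    Raises graphlib.CycleError if the map contains a circular dependency,
    which is worth finding before the test rather than during it.
    """
    return list(TopologicalSorter(deps).static_order())


order = recovery_order(dependencies)
print(" -> ".join(order))
```

When several workloads have no dependency on each other, more than one valid order exists. Treat the result as a starting point for the documented recovery order, not a replacement for it.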



Capture the Test Network Design
Network isolation is one of the most important pre-test items. Get it wrong and the test can affect production. Review this section carefully before any recovered system is powered on.
Capture:
- Test network name.
- VLAN or subnet for the test.
- The IP addressing plan for recovered systems.
- Gateway behavior: whether the test network has a gateway, and where it routes.
- The DNS override strategy for the test.
- Firewall isolation rules.
- NAT rules, if the test needs limited outbound access.
- Whether restored systems can reach production.
- Whether production systems can reach restored systems.
- How you will avoid duplicate hostnames and duplicate IP addresses.
Keep the test isolated
Do not accidentally connect restored systems to the production network unless the recovery plan explicitly requires it and the risks have been reviewed and approved. An isolated test network, sometimes called a bubble or fenced network, lets recovered systems run with their original IP addresses and hostnames without colliding with production.
Why isolation matters
Duplicate IP addresses cause immediate problems. If a recovered server uses the same IP address as a live production server, the result is an address conflict and unreliable connectivity.
Duplicate hostnames cause confusion and can break authentication and certificates.
A recovered domain controller deserves extra caution. If it is restored incorrectly or allowed to communicate with production domain controllers during a test, you can create Active Directory replication or recovery problems. Keep domain controller recovery tests isolated unless you are following a validated forest recovery or domain controller recovery procedure.
Production DNS registration is another trap. A recovered system that can reach production DNS may create conflicting or misleading records and send clients to the wrong host. Plan a DNS override inside the test network so name resolution stays contained.
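If the test network is not fully fenced and recovered systems will be given new addresses or names, the addressing plan can be checked against the production inventory ahead of time. A minimal sketch, assuming both are available as hostname-to-IP mappings. The hosts and addresses below are invented.

```python
import ipaddress


def find_conflicts(production: dict[str, str], test_plan: dict[str, str]) -> dict[str, list[str]]:
    """Report IP addresses and hostnames that appear in both production and the test plan."""
    prod_ips = {ipaddress.ip_address(ip) for ip in production.values()}
    duplicate_ips = sorted(
        str(ipaddress.ip_address(ip))
        for ip in test_plan.values()
        if ipaddress.ip_address(ip) in prod_ips
    )
    prod_names = {name.lower() for name in production}
    duplicate_hostnames = sorted(name.lower() for name in test_plan if name.lower() in prod_names)
    return {"duplicate_ips": duplicate_ips, "duplicate_hostnames": duplicate_hostnames}


# Invented inventory and plan for illustration only.
production = {"fileserver01": "10.0.10.21", "sql01": "10.0.10.30"}
test_plan = {"fileserver01": "10.99.10.21", "sql01-test": "10.0.10.30"}

conflicts = find_conflicts(production, test_plan)
print(conflicts)
```

In a fully isolated bubble network, duplicates are expected and harmless because the two networks never meet. The check matters when any path exists between them.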




Validate Access Before the Test
A recovery test can fail before it starts if the one person with the encryption key, the backup console, or the firewall access is unavailable. Confirm access ahead of time.
Document who has access to:
- The backup console.
- The hypervisor or cloud console.
- Domain admin or break-glass credentials.
- Encryption keys, or the key custodian.
- Repository access.
- Firewall or network changes.
- DNS changes.
- Application admin consoles.
- Monitoring tools.
Then test that the access actually works before the window opens. A documented account that nobody can log into is not access.
Security note: do not screenshot or publish passwords, tokens, secrets, license keys, customer names, public IP addresses, private infrastructure details, or sensitive diagrams unless they are fully redacted. If a screenshot cannot be safely redacted, recreate it in a lab or omit it. The evidence you keep should prove the test happened correctly, not expose how to break into the environment.
What Not to Capture
Some evidence should not be included in a public article, shared report, or customer-facing document unless it is fully redacted. Do not publish the following without sanitization:
- Password fields
- API tokens or secrets
- License keys
- Public IP addresses
- Customer names
- Internal domain names
- Full firewall rule bases
- Backup repository paths
- Service account names
- Cloud subscription IDs or tenant IDs
If a screenshot cannot be safely sanitized, recreate the view in a lab or describe the evidence in text.
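For text evidence such as exported logs or job reports, a first redaction pass can be scripted. The sketch below is deliberately narrow: it masks IPv4 addresses and values that follow a few common field names. The patterns and field names are assumptions, it does nothing for screenshots, and the output still needs a human review before sharing.

```python
import ipaddress
import re

# Candidate dotted quads; each match is validated before it is masked.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Hypothetical field names; extend to match what your exports contain.
SECRET_FIELD = re.compile(r"(?i)\b(password|token|secret|api[_-]?key)\b(\s*[=:]\s*)(\S+)")


def _redact_ip(match: re.Match) -> str:
    """Mask the match only if it parses as an IPv4 address."""
    try:
        ipaddress.ip_address(match.group(0))
    except ValueError:
        return match.group(0)  # not a valid address, e.g. 999.1.2.3
    return "[REDACTED-IP]"


def sanitize(text: str) -> str:
    """Mask secret values first, then IPv4 addresses."""
    text = SECRET_FIELD.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)
    return IPV4_CANDIDATE.sub(_redact_ip, text)


sample = "gateway 203.0.113.10 password=Hunter2 build 999.1.2.3"
print(sanitize(sample))
```

A four-part version string that happens to be a valid address, such as 1.2.3.4, will also be masked. For shared evidence, over-redaction is the safer failure.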
Create an Evidence Folder
Decide where the evidence goes before the test, not while you are scrambling during it. A simple, consistent folder structure works:
- 00-Plan
- 01-Before-Test
- 02-Backup-Evidence
- 03-Network-Evidence
- 04-During-Test
- 05-Validation
- 06-After-Test
- 07-Lessons-Learned
Name files consistently so they sort and make sense later. A date prefix plus a short description works well:
- YYYY-MM-DD-before-backup-dashboard.png
- YYYY-MM-DD-restore-point-selected-fileserver01.png
- YYYY-MM-DD-test-network-firewall-rules.png
- YYYY-MM-DD-application-login-validation.png
Consistent names turn a pile of screenshots into a usable record. Six months from now, the file name should tell you what it shows without opening it.
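The folder structure and the naming pattern are easy to script so that every test starts from the same layout. A minimal sketch using only the standard library. The root path is whatever you choose, and the slug rule (lowercase letters, digits, and hyphens) is an assumption that matches the examples above.

```python
import re
from datetime import date
from pathlib import Path

EVIDENCE_FOLDERS = [
    "00-Plan",
    "01-Before-Test",
    "02-Backup-Evidence",
    "03-Network-Evidence",
    "04-During-Test",
    "05-Validation",
    "06-After-Test",
    "07-Lessons-Learned",
]


def create_evidence_tree(root: Path) -> list[Path]:
    """Create the evidence folders under root and return their paths."""
    created = []
    for name in EVIDENCE_FOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to run more than once
        created.append(folder)
    return created


def evidence_filename(captured_on: date, description: str, extension: str = "png") -> str:
    """Build a name like YYYY-MM-DD-short-description.png."""
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{captured_on.isoformat()}-{slug}.{extension}"


print(evidence_filename(date(2024, 5, 15), "Restore point selected: FILESERVER01"))
```

Calling `create_evidence_tree(Path("dr-test-2024-05"))` would create the eight folders under a directory of that hypothetical name.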
Final Pre-Test Checklist
Use this checklist before you click Restore. It summarizes the sections above and can be used on its own.
| Item | Evidence to Capture | Complete |
| --- | --- | --- |
| Recovery scope approved | Written scope listing systems in and out, approved by the owner | [ ] |
| Success criteria documented | RTO, RPO, and pass or fail criteria recorded | [ ] |
| Backup job completed successfully | Screenshot of the last successful run for each workload | [ ] |
| Restore point selected | Restore point date, time, and source job recorded | [ ] |
| Backup repository health checked | Repository status, capacity, and warning state captured | [ ] |
| Restore target capacity checked | Datastore, volume, cluster, or cloud target capacity verified | [ ] |
| Immutability verified if applicable | Immutability or lock status captured | [ ] |
| Encryption key access confirmed | Key custodian available and key access tested | [ ] |
| Test network prepared | Isolated network and VLAN or subnet defined and verified | [ ] |
| DNS plan documented | DNS override or test DNS strategy written down | [ ] |
| Firewall rules reviewed | Isolation rules confirmed with no path to production | [ ] |
| Credentials verified | Console, admin, and break-glass access tested | [ ] |
| Application dependencies mapped | Dependency list and recovery order documented | [ ] |
| Stakeholders notified | Notification sent and window confirmed | [ ] |
| Rollback plan documented | Steps to tear down the test environment recorded | [ ] |
| Evidence folder created | Folder structure created and ready to fill | [ ] |
Conclusion
A good disaster recovery test produces confidence, not just screenshots.
The screenshots and notes are proof that the team understood the starting state, followed a plan, protected production, and validated the result. They turn a one-time exercise into something you can repeat and improve.
Capture evidence before the test, validate the application after the restore, and document what you learned before the next outage makes the test real.