Site Recovery Manager
- sathyahraj

- 5 days ago
What Is SRM?
VMware Site Recovery Manager (SRM) is an enterprise-grade disaster recovery orchestration tool. It automates the planning, testing, failover, and failback of virtual machine workloads between two vCenter-managed sites: a primary (protected) site and a secondary (recovery) site.
Core Components and Architecture
vCenter & SRM Servers at Each Site: SRM runs as an appliance alongside vCenter on both the protected and recovery sites, coordinating actions during recovery events.
Replication Mechanisms: SRM supports VMware vSphere Replication (hypervisor-level) and array-based replication via Storage Replication Adapters (SRAs) or Virtual Volumes.
✅ Key Features
1. Automated Orchestration
2. Non‑Disruptive Testing
3. Failback / Planned Migration
4. Flexible Recovery Planning
Allows for granular recovery plans including sequencing, IP/network settings, resource mappings, and custom scripting for post-failover automation.
5. Scalable & Application-Agnostic
Designed to protect thousands of VMs across multiple sites.
Works with any application running in the VMware environment without needing app-specific plugins.
6. Support for Advanced Topologies
7. Integration Ecosystem
🧰 Deployment & Licensing
Installation: Deploy SRM as a Photon OS-based appliance on both sites. Use the HTML5 Clarity UI for configuration and management.
Licensing: SRM is licensed per protected VM, not per CPU. Term and perpetual licenses are supported, and vSphere Replication is included at no extra cost with vSphere Essentials Plus and above.
⚖️ Benefits and Use Cases
Typical Use Cases:
Critical site-level DR and failover
Planned datacenter migrations
Maintenance-based application testing
Hybrid cloud and multi-cloud DR strategies
🧠 Deployment Workflow Summary
1. Designing the SRM Recovery Plan ✅
🔄 Recovery Plan Structure & Workflow
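Protection Groups: Organize VMs into groups based on application tiers or data dependencies (e.g. web servers, DB servers). Each group maps to a replication set (a datastore group for array-based replication, or individually replicated VMs for vSphere Replication), so keep VMs with similar RTO/RPO requirements together.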
Recovery Plans: A plan is essentially an automated runbook controlling VM shutdown, replication sync, startup sequence, IP/network customization, and scripts. Multiple plans can reference the same protection groups.
Dependencies & Sequencing: Define inter-VM dependencies (e.g. DC first, then application servers), enabling parallel startup within priority groups for faster recovery.
Pre/Post actions & IP Customization: Customize VM IPs, gateways, run in-guest scripts (DNS updates, services start), and display prompts during recovery execution.
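The sequencing described above can be sketched in a few lines of Python. This is only an illustrative model, not SRM's API: the `PlanVM` class and `startup_waves` function are hypothetical names, and the sketch assumes that dependencies are only honored between VMs in the same priority group. VMs in the same wave could be powered on in parallel.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # Python 3.9+


@dataclass
class PlanVM:
    # Hypothetical model of one VM in a recovery plan (not an SRM object).
    name: str
    priority: int  # lower number starts earlier
    depends_on: set[str] = field(default_factory=set)


def startup_waves(vms):
    """Return a list of waves; VMs in one wave can start in parallel."""
    by_priority = {}
    for vm in vms:
        by_priority.setdefault(vm.priority, []).append(vm)

    waves = []
    for priority in sorted(by_priority):
        group = by_priority[priority]
        names = {vm.name for vm in group}
        # Assumption: dependencies on VMs outside this priority group are ignored.
        sorter = TopologicalSorter({vm.name: vm.depends_on & names for vm in group})
        sorter.prepare()  # raises graphlib.CycleError on circular dependencies
        while sorter.is_active():
            ready = sorted(sorter.get_ready())
            waves.append(ready)
            sorter.done(*ready)
    return waves


example_vms = [
    PlanVM("dc01", 1),
    PlanVM("db01", 2),
    PlanVM("app01", 2, {"db01"}),
    PlanVM("web01", 3),
    PlanVM("web02", 3),
]
print(startup_waves(example_vms))
# [['dc01'], ['db01'], ['app01'], ['web01', 'web02']]
```

The domain controller starts alone, the database starts before the application server that depends on it, and the two web servers start together in the last wave.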
🧪 Testing
🛠️ Execution & Failback
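Unplanned Failover: Run the recovery plan in disaster recovery mode after a disaster. SRM attempts to shut down the protected VMs and run a final sync if the protected site is still reachable, then starts the recovered VMs and applies IP reconfiguration.
Planned Migration / Failback: If the protected site is operational, a planned migration moves workloads in an orderly way: SRM shuts down the protected VMs and runs a final sync, so no data loss is expected. To fail back, reprotect the VMs to reverse the replication direction, then run the plan in the opposite direction.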
2. Topology Mapping & Resource Alignment
Site Pairing: Pair protected and recovery vCenter servers and their SRM instances via site-pair configuration.
Inventory Mapping: Map folders, resource pools, networks, and datastores between sites to ensure smooth migration. NSX universal logical switches can span L2 networks for seamless failover.
Resource Considerations: Ensure sufficient compute, storage, and network resources at the recovery site. Use fewer, larger datastores and group VMs to minimize recovery latency.
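Missing inventory mappings are a common reason a VM cannot be protected or recovered cleanly, so it can help to check them before building plans. The following is a minimal sketch of such a check on exported inventory data. The function name, the dictionary layout, and the object names are all hypothetical and are not taken from SRM.

```python
def unmapped_objects(protected_vms, mappings):
    """Return {vm_name: ["kind:object", ...]} for objects lacking a mapping.

    protected_vms: list of dicts with keys name, network, folder, resource_pool
    mappings: {kind: {protected_object: recovery_object}}
    (Both layouts are assumptions made for this sketch.)
    """
    missing = {}
    for vm in protected_vms:
        for kind in ("network", "folder", "resource_pool"):
            if vm[kind] not in mappings.get(kind, {}):
                missing.setdefault(vm["name"], []).append(f"{kind}:{vm[kind]}")
    return missing


vms = [
    {"name": "web01", "network": "prod-web", "folder": "Web", "resource_pool": "Prod"},
    {"name": "db01", "network": "prod-db", "folder": "DB", "resource_pool": "Prod"},
]
site_mappings = {
    "network": {"prod-web": "dr-web"},
    "folder": {"Web": "DR-Web", "DB": "DR-DB"},
    "resource_pool": {"Prod": "DR"},
}
print(unmapped_objects(vms, site_mappings))
# {'db01': ['network:prod-db']}
```

Here `db01` would be flagged because its network has no counterpart at the recovery site.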
3. Choosing Replication Methods: vSphere vs. Array-Based
🟢 When to Choose:
Array-Based: Ideal for enterprise use cases requiring sub-minute RPOs, write-order consistency globally, large VM scale, and tight SLAs.
vSphere Replication: Best suited for smaller environments, mixed storage, budget-conscious setups, or non-critical workloads.
You can even mix both: use VR for lower-tier VMs and ABR for critical workloads within the same SRM deployment. Just don't protect the same VM with both mechanisms.
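The selection guidance above can be written down as a simple rule. This is a rough sketch rather than a sizing tool: the function name is made up, and the `vr_min_rpo` default of five minutes is an assumption that should be checked against the vSphere Replication documentation for your version.

```python
from datetime import timedelta


def choose_replication(required_rpo, critical, array_replication_available,
                       vr_min_rpo=timedelta(minutes=5)):
    """Pick a replication method for one workload.

    vr_min_rpo is an assumed lower bound for vSphere Replication; adjust it
    to match what your version actually supports.
    """
    needs_array = critical or required_rpo < vr_min_rpo
    if not needs_array:
        return "vsphere-replication"
    if array_replication_available:
        return "array-based"
    return "unmet: needs array-based replication"


print(choose_replication(timedelta(minutes=15), critical=False,
                         array_replication_available=True))
# vsphere-replication
print(choose_replication(timedelta(seconds=30), critical=False,
                         array_replication_available=True))
# array-based
```

A result starting with `unmet` signals that the workload's requirement cannot be satisfied with the replication options available, which is worth surfacing during design rather than during a failover.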
4. Putting It All Together: Architecture Planning
Map required RTO/RPO per application/tier → choose replication accordingly.
Design protection groups aligned to workloads, dependencies, and replication capabilities.
Configure inventory mappings of compute, network, folders, and storage to match planned failover topologies.
Build and test recovery plans:
set sequence, customize IPs
include pre/post scripts
test non-disruptively
Plan execution strategy:
scheduled migrations
failover vs planned migration scenarios
reprotect and failback workflows
Baseline performance with recommended settings (e.g. fewer, larger datastores and grouped VM startups) to reduce latency.
5. Best Practices & Practical Tips
Place guest page/swap files on separate, non-replicated disks or datastores, and isolate very large VMs, to avoid unnecessary replication load.
Tune bandwidth and replication settings: RPO and compression for VR, and account for network latency between sites.
Use parallel startup within priority groups and minimize the number of protection groups to improve RTO.
Integrate SRM with NSX, vSAN, and VMware Cloud if using hybrid or multi‑site deployments for better automation.
Document recovery plan history, run test reports, and align with compliance or audit requirements.
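To see why parallel startup within priority groups helps RTO, compare the two cases with simple arithmetic. If priority groups run one after another, serial startup costs the sum of all VM startup times, while parallel startup costs roughly the slowest VM in each group. The sketch below is idealized: the per-VM times are invented, and it ignores concurrency limits, storage steps, and script run times.

```python
def estimated_startup_seconds(groups, parallel=True):
    """groups: priority groups in order, each a list of per-VM startup seconds."""
    if parallel:
        # Each group takes as long as its slowest VM.
        return sum(max(group) for group in groups if group)
    # Every VM starts one after another.
    return sum(sum(group) for group in groups)


# Hypothetical startup times, in seconds, for three priority groups.
groups = [[120], [300, 240], [90, 90, 90]]
print(estimated_startup_seconds(groups, parallel=False))  # 930
print(estimated_startup_seconds(groups, parallel=True))   # 510
```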
SRM Topology Explanation

1. Protected (Primary) Site
vCenter Server and SRM appliance manage production workloads.
vSphere Replication appliances or storage arrays (with SRAs) handle replication.
Protected VMs reside in clusters and storage datastores ready for replication.
2. Recovery (Secondary) Site
Mirrored setup with vCenter + SRM appliance.
Placeholder VMs are created in advance to reserve inventory slots.
Replication targets: VR receives VM-level blocks; array-based replication mirrors LUNs/volumes.
3. Replication & Network Links
Network links carry traffic between the SRM servers, the vSphere Replication services (e.g. replication traffic on ports 31031 and 44046), and any SRA-managed arrays across sites.
Storage replication occurs either via the hypervisor (vSphere replication) or directly between arrays (ABR).
Replication traffic uses dedicated replication networks for isolation and performance.
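Before pairing sites, it can be useful to confirm that the relevant TCP ports are reachable through the firewalls. The following is a minimal sketch using only the Python standard library. The function name is hypothetical, and a successful TCP connect only shows that something is listening, not that the service behind it is healthy.

```python
import socket


def check_ports(host, ports, timeout=3.0):
    """Return {port: True/False} depending on whether a TCP connect succeeds."""
    results = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:  # refused, timed out, unreachable, or name not resolved
            results[port] = False
    return results
```

For example, `check_ports("vr.recovery.example.com", [31031, 44046])` would test the two ports mentioned above against a placeholder hostname; substitute the address of your own replication appliance.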
4. Inventory & Resource Mapping
Folders, resource pools, datastores, and networks are mapped from the protected site to the recovery site.
Network mappings, whether defined in the SRM inventory or provided by NSX, keep virtual networking consistent across sites. Where NSX universal logical switches are used, they can allow seamless L2 failover.
5. Recovery Plan Execution Flow
Initiate recovery or planned migration.
If the source site is still online, shut down the protected VMs and perform a final sync.
The recovery site replaces the placeholder VMs with the recovered VMs and powers them on in the defined priority groups, honoring dependencies.
Launch post‑power-on scripts, apply IP changes, reconfigure services.
After recovery or test, cleanup and optionally reprotect and fail back.
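The flow above differs slightly depending on whether the plan is run as a test, a planned migration, or a disaster recovery. The sketch below lays out those differences as an ordered step list. It is a simplified illustration with invented names and step labels, not SRM's actual step sequence, and it assumes that a test leaves the protected VMs running.

```python
def plan_steps(mode, source_online=True):
    """Return simplified, ordered step labels for one recovery plan run."""
    if mode not in ("test", "planned_migration", "disaster_recovery"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "planned_migration" and not source_online:
        raise ValueError("planned migration needs the protected site online")

    steps = []
    # Assumption: a test run does not touch the protected VMs.
    if mode != "test" and source_online:
        steps += ["shut down protected VMs", "final replication sync"]
    steps += [
        "power on recovered VMs by priority group",
        "apply IP customization and run post-power-on scripts",
    ]
    if mode == "test":
        steps.append("clean up test environment")
    else:
        steps += ["reprotect (optional)", "fail back (optional)"]
    return steps


for step in plan_steps("disaster_recovery", source_online=False):
    print(step)
```

With the protected site offline, the disaster recovery run skips the shutdown and final sync and goes straight to powering on the recovered VMs.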
🔍 Key Takeaways from the Diagram
Provides a holistic view of components: vCenter servers, SRM appliances, replication layer, VM inventory, networks.
Shows dual replication modes: vSphere Replication (VM-level) vs. Array-Based Replication using SRAs.
Illustrates bi‑directional topology, supporting both planned migrations and failbacks.
Includes network port/service mapping, which is especially useful for firewall and compliance planning.
🧩 Enhancing for Your Environment
You can tailor this layout to various SRM topologies:
Shared Recovery Site: Multiple protected sites map to one recovery site (multi-pair SRM).
Stretched Cluster Integration: Combine SRM with vSAN stretched clusters, protecting workloads across metro sites and replicating to a third site for additional resiliency.
NSX-Aware Deployment: Use Cross-VC NSX logical networks and automated mapping to keep IP addressing and security policy identical across sites, which suits both test and DR networks.
📝 How to Create Your Own Topology Diagram
Consider the following when building your custom diagram:
Clearly mark vCenter + SRM pairs at each site.
Show replication components: VR appliances and/or array replication adapters.
Annotate network connectivity: control, replication, and VM traffic.
Indicate inventory mappings: network, resource pools, datastores, folder names.
Define placeholder VM logic, recovery priority groups, and sequencing.
Include pre/post script stages, IP customization steps.
Layer in optional components like NSX, stretched clusters, or F5 BIG-IP for routing and DNS failover.




