Resilience in Amazon EVS

The AWS global infrastructure is built around AWS Regions and Availability Zones. AWS Regions provide multiple physically separated and isolated Availability Zones, which are connected through low-latency, high-throughput, and highly redundant networking. With Availability Zones, you can design and operate applications and databases that automatically fail over between zones without interruption. Availability Zones are more highly available, fault tolerant, and scalable than traditional single or multiple data center infrastructures.

Amazon EVS environments are available in a single AWS Availability Zone. To ensure high availability of Amazon EVS Single-AZ infrastructure, Amazon EVS offers the following features:

Note

Amazon EVS only supports Single-AZ deployments at this time.

Amazon EVS supports the use of AWS Elastic Disaster Recovery to automate the backup and recovery of your data.
Amazon EVS deploys an Active/Standby NSX Edge cluster with two NSX Edge nodes per VCF requirements. The NSX Edge nodes run on different hosts to ensure high availability and allow for quick failover in the rare event that an NSX Edge node fails.
Amazon EVS deploys a minimum environment of four ESXi hosts, which VCF requires. The four-host minimum is a VMware design requirement that ensures proper vSAN quorum and maintains availability during maintenance operations and host failures. You can add more hosts after deployment. For more information, see vSphere Cluster Design for VMware Cloud Foundation in the VMware Cloud Foundation documentation.
Amazon EVS supports the use of an EC2 partition placement group or cluster placement group for EC2 hosts. A partition placement group spreads your EC2 instances across logical partitions so that groups of instances in one partition do not share underlying hardware with groups of instances in other partitions. This strategy helps reduce the likelihood of correlated hardware failures for large distributed workloads. A cluster placement group places your EC2 instances within the same physical rack to ensure low latency. For more information, see Partition placement groups in the Amazon EC2 User Guide.
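As an illustration of the partition placement group option above, the following minimal sketch builds the request for the EC2 CreatePlacementGroup API and sends it with boto3. The group name, partition count, and Region are hypothetical values chosen for the example, and the helper function names are not part of any AWS SDK. The sketch only creates the placement group; how you then associate it with your Amazon EVS hosts is not shown here.

```python
from typing import Any, Dict


def build_partition_group_request(name: str, partition_count: int) -> Dict[str, Any]:
    """Build keyword arguments for the EC2 CreatePlacementGroup call.

    A partition placement group can have at most seven partitions
    per Availability Zone, so values outside 1..7 are rejected.
    """
    if not 1 <= partition_count <= 7:
        raise ValueError("partition_count must be between 1 and 7")
    return {
        "GroupName": name,
        "Strategy": "partition",
        "PartitionCount": partition_count,
    }


def create_partition_group(name: str, partition_count: int, region: str) -> Dict[str, Any]:
    """Create the placement group. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the request builder works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    return ec2.create_placement_group(
        **build_partition_group_request(name, partition_count)
    )


if __name__ == "__main__":
    # Hypothetical values: print the request without calling AWS.
    print(build_partition_group_request("evs-hosts-pg", 4))
```

Keeping the request builder separate from the API call lets you review or unit test the parameters before anything is created in your account.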

For more information about AWS Regions and Availability Zones, see AWS Global Infrastructure.

VMware component resilience

You are responsible for configuring the VMware components that run on Amazon EVS to ensure high availability of your virtual machines (VMs) and the resiliency of your workloads.

Amazon EVS supports the following VMware Cloud Foundation (VCF) resiliency features:

vSphere replication - Provides host-based, asynchronous replication of your VMs for disaster recovery and workload migration purposes. For more information, see How vSphere Replication Works in the VMware vSphere Replication documentation.
vSAN data protection - Enables you to quickly recover VMs from operational failures or ransomware attacks, using native snapshots stored locally on the vSAN cluster. For more information, see Using vSAN Data Protection in the vSAN documentation.
vSphere HA - Provides automatic failover for VMs in the event of a host failure. For more information, see High Availability Design for vCenter Server for VMware Cloud Foundation in the VCF documentation.
vSphere Fault Tolerance (FT) - Provides continuous availability for mission-critical VMs by creating and maintaining another VM that is identical and continuously available to replace it in the event of a failover situation. For more information, see How Fault Tolerance Works in the vSphere documentation.
vSAN Failures to Tolerate (FTT) - A vSAN storage policy setting that determines how many host failures a VM can withstand before becoming inaccessible. This defines the level of redundancy and fault tolerance for your virtual machines within the vSAN cluster. For more information, see Tolerate Additional Failures with Fault Domain in vSAN Cluster in the vSAN documentation.
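To make the FTT setting more concrete, the following minimal sketch works through the host-count arithmetic for one common case. It assumes a RAID-1 (mirroring) storage policy, where vSAN needs 2 × FTT + 1 hosts to hold the replicas and witness components. Other policies, such as RAID-5/6 erasure coding, have different minimums and are not covered. The function names are hypothetical and are not part of any VMware or AWS API.

```python
def min_hosts_raid1(ftt: int) -> int:
    """Minimum hosts for a RAID-1 mirroring policy with the given FTT.

    Assumption: ftt + 1 data replicas plus ftt witness components,
    each on a separate host, gives 2 * ftt + 1 hosts.
    """
    if ftt < 0:
        raise ValueError("ftt must be non-negative")
    return 2 * ftt + 1


def survives_host_in_maintenance(total_hosts: int, ftt: int) -> bool:
    """Check whether the policy minimum is still met with one host offline."""
    return total_hosts - 1 >= min_hosts_raid1(ftt)


if __name__ == "__main__":
    for ftt in (1, 2):
        print(f"FTT={ftt}: at least {min_hosts_raid1(ftt)} hosts")
    # A four-host cluster with FTT=1 keeps three hosts available
    # while one host is in maintenance.
    print(survives_host_in_maintenance(4, 1))
```

Under these assumptions, a four-host cluster with FTT=1 still meets the three-host minimum while one host is in maintenance, whereas a three-host cluster does not.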
