Exchange Server on AWS
Quick Start Reference Deployment Guide

Designing for High Availability

Regions and Availability Zones

You can provision instances in multiple geographic locations called regions. You can launch Amazon EC2 instances in these regions so your instances are closer to your customers. For example, you might want to launch instances in Europe to be closer to your European customers or to help meet your legal requirements.

Each region includes Availability Zones. Availability Zones are distinct locations that are engineered to be insulated from failures in other zones. They provide inexpensive, low-latency network connectivity to other zones in the same region. By launching instances in separate zones, you can protect your applications from failures that affect an entire Availability Zone. To help achieve high availability, design your Exchange Server deployment to span two or more Availability Zones.

Based on the needs of your business, you can choose to design your Exchange Server deployment to span multiple regions as well. However, this is more complex and requires additional networking and security, as well as more thorough testing and continuous monitoring.

Active Directory Domain Services

Active Directory Domain Services (AD DS) is a core component of a Microsoft Exchange Server deployment. Exchange Server is tightly coupled with Active Directory. The Active Directory schema must be extended to support additional attributes for objects in which Exchange Server stores configuration settings. Additionally, Active Directory site topology is used for routing internal messages between separate physical locations. Designing your deployment based on the Exchange Preferred Architecture (PA) assumes that each data center in a pair (in other words, each Availability Zone) is represented as an individual Active Directory site.

There are three ways to use AD DS in the AWS cloud:

  • Cloud only – This is the architecture shown earlier in this guide, in Figure 1 and Figure 2. This type of architecture means that your entire Active Directory forest exists only within the AWS cloud. With a cloud-only AD DS architecture, there are no on-premises domain controllers.

  • Traditional hybrid – The hybrid architecture takes advantage of your existing AD DS environment. You can extend your private, on-premises network to AWS so the resources in the cloud can utilize your existing AD infrastructure. We recommend that hybrid architectures utilize domain controllers from your existing AD forest in the AWS cloud. This is primarily recommended to keep your Exchange servers that are deployed in AWS functional and available in the event of an on-premises outage.

  • AD Connector via AWS Directory Service – The AD Connector allows you to provision a Directory Service proxy in the AWS cloud. When you have network connectivity from your AWS VPC to the on-premises environment via VPN or AWS Direct Connect, the AD Connector makes it easy to provision Amazon WorkDocs sites and Amazon WorkSpaces in your existing AD DS environment. However, the AD Connector should not be used in conjunction with AWS-based Exchange servers. Exchange servers must have a low-latency connection to a writable domain controller and global catalog server in the same Active Directory site in which they reside. We recommend that you use the cloud-only or traditional hybrid model for running AD DS, to ensure that writable domain controllers and global catalogs are available in the same Availability Zones as your Exchange servers.

The Quick Start Reference Deployment for Active Directory Domain Services covers all of our best practices and recommendations for deploying a highly available AD DS environment on AWS. The master AWS CloudFormation template provided in this guide first launches the AD DS Quick Start to provide the foundation for the remaining infrastructure. It's responsible for building the Amazon VPC, public and private subnets, NAT instances and Remote Desktop gateways, and domain controllers in each Availability Zone. We also configure Active Directory Sites and Services as part of this automated deployment. We provision AD sites for each Availability Zone, along with defining objects for each of your Amazon VPC subnets, and mapping them to the appropriate AD site.

Microsoft recommends deploying a ratio of one Active Directory global catalog processor core for every 8 Mailbox role processor cores, assuming domain controllers are running on the x64 (64-bit) Windows platform.
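The ratio above can be turned into a quick sizing check. The following is a minimal sketch, not an official sizing tool: the function name and the example server counts are hypothetical, and it assumes you round up to whole cores.

```python
import math

# Ratio from the guidance above: one global catalog core
# for every eight Mailbox role cores (x64 domain controllers).
MAILBOX_CORES_PER_GC_CORE = 8


def required_gc_cores(mailbox_cores: int) -> int:
    """Return the minimum number of global catalog cores, rounded up."""
    if mailbox_cores < 0:
        raise ValueError("mailbox_cores must be non-negative")
    return math.ceil(mailbox_cores / MAILBOX_CORES_PER_GC_CORE)


# Hypothetical example: four Mailbox servers with 8 cores each.
print(required_gc_cores(4 * 8))  # 4
```

Rounding up matters: 33 Mailbox cores would call for five global catalog cores under this rule, not four.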

Namespace Design and Planning

Microsoft Exchange Server 2013 includes new functionality that simplifies namespace design. Unlike Exchange Server 2010, Exchange Server 2013 does not require client namespaces to move with the DAG after a failover event. The Client Access role in Exchange Server 2013 proxies requests to the Mailbox server that hosts the active database copy of the user's mailbox, regardless of the Active Directory site in which that server resides. This means that unique namespaces are no longer required for each data center, or in this case, for each Availability Zone.

Based on this new client access architecture, you have two models for namespace design:

  • Unbound namespace – This model uses a unified namespace that provides access to the Exchange Server infrastructure in each Availability Zone. It allows clients to maintain connectivity without the need to use a different namespace in case one Availability Zone becomes unavailable.

  • Bound namespace – This model uses a unique namespace for each physical location. Because Availability Zones are connected via high-speed network links, the bound namespace model typically doesn't provide any benefits in a single-region deployment. You might consider this option for a multi-region deployment, to provide a namespace to clients who may be geographically closer to the infrastructure in their region.


Figure 14: Unbound Namespace Hosted in Amazon Route 53

This Quick Start launches a highly available Microsoft Exchange Server infrastructure in a single region across two Availability Zones. With this architecture, we recommend a single, unbound namespace (e.g., mail.example.com). After launching the AWS CloudFormation template and creating the stack, you can proceed to implement your unified, unbound namespace configuration.

Figure 14 includes an Amazon Route 53 hosted zone, along with an active-active failover record set and a reverse proxy solution running on the Edge Transport servers. With this architecture, all client protocols are made highly available through a single unbound namespace. Amazon Route 53, reverse proxy, and Edge Transport configuration options will be explained in greater detail later in this guide.

For more details on namespace design and planning, we recommend reading Namespace Planning in Exchange 2013 on the Microsoft Exchange team blog.

Database Availability Groups

A database availability group (DAG) is the component for mailbox database high availability and site resilience built into Microsoft Exchange Server 2013. A DAG is a group of up to 16 servers that host a set of databases. DAGs provide automatic database-level recovery from failures that affect entire servers or individual databases. Any server in a DAG can host a copy of a mailbox database from any other server in the DAG. When a server is added to a DAG, it works with the other servers in the DAG to provide automatic recovery from failures that affect mailbox databases, such as disk, server, or network failures.

Simple Two-Node DAGs

If you choose not to utilize the Exchange PA for your Exchange Server design, you can implement an architecture similar to the one provided by this Quick Start, which deploys a single, multi-role Exchange Server in each Availability Zone. In this model, you'll have a single DAG, which is stretched across each Availability Zone.

Proper IP addressing needs to be considered when deploying a DAG on AWS. Of course, each DAG member will need a primary IP address for the operating system. Additionally, because Exchange Server DAGs use Windows Server Failover Clustering (WSFC), you must allocate a secondary private IP address to act as the DAG IP address in each Amazon VPC subnet in which the DAG members will reside. DAG IP addresses are used as the Windows Failover Clustering IP Address resource.


Figure 15: IP Configuration That Can Be Customized via Template Parameters

You can assign multiple private IP addresses to an Amazon EC2 instance using a single elastic network interface (ENI). The AWS CloudFormation template provided by this Quick Start supports this configuration.
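To make the addressing requirement concrete, the sketch below picks a primary IP address and a secondary (DAG) IP address for one DAG member per subnet. It is illustrative only: the subnet CIDR blocks and the function name are hypothetical, not the template's defaults. It assumes the standard Amazon VPC rule that the first four addresses and the last address of every subnet are reserved.

```python
import ipaddress

# Amazon VPC reserves the first four addresses and the last address of each subnet.
AWS_RESERVED_LEADING = 4


def plan_dag_member_ips(subnet_cidr: str, offset: int = 0) -> dict:
    """Pick a primary IP and a secondary (DAG) IP from a single subnet.

    offset skips additional addresses after the reserved range, for example
    to avoid addresses already used by other instances.
    """
    subnet = ipaddress.ip_network(subnet_cidr)
    first = AWS_RESERVED_LEADING + offset
    last_usable_index = subnet.num_addresses - 2  # the final address is reserved
    if first + 1 > last_usable_index:
        raise ValueError(f"{subnet_cidr} has no room for two addresses at offset {offset}")
    return {
        "subnet": str(subnet),
        "primary_ip": str(subnet[first]),   # operating system address
        "dag_ip": str(subnet[first + 1]),   # secondary private IP used by the DAG
    }


# Hypothetical subnets, one per Availability Zone.
for cidr in ("10.0.0.0/24", "10.0.64.0/24"):
    print(plan_dag_member_ips(cidr))
```

The two `dag_ip` values are the addresses you would then supply when defining the DAG, one per subnet.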


Figure 16: Viewing the DAG IP Addresses from the Exchange Management Shell

Figure 16 shows the properties of a DAG that was created after successfully launching this Quick Start. The secondary private IP address for each instance has been statically assigned to the DAG.

Note

Exchange Server 2013 SP1 running on Windows Server 2012 R2 supports DAGs without a cluster Administrative Access Point (AAP). This means that the cluster does not require an IP address resource. You can create a cluster without an AAP by using the New-Cluster cmdlet and setting the AdministrativeAccessPoint parameter to None. You can use this cluster setting if you do not want to use a traditional AAP configuration.

DAG Configuration in the Preferred Architecture

Designing your Microsoft Exchange Server 2013 deployment to run on AWS, while also adhering to the Exchange PA, provides a model where you have a minimum of two Exchange servers in each Availability Zone participating in a single DAG. Here are some of the main benefits of this architecture:

  • You get mailbox database high availability within each Availability Zone, as well as across your AWS region.

  • Server load is distributed across a larger number of servers in the event of a failure, which reduces resource utilization on remaining servers.

  • Because each Availability Zone contains at least two Exchange servers, you end up with a minimum of four copies of each database. In this model, traditional backups are not required, because you can implement three highly available (HA) copies, and one lagged copy. This provides you with enough database durability to implement Exchange Native Data Protection (a solution that doesn't use backups), which reduces the total cost of ownership (TCO) of your deployment.
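The four-copy layout in the last point can be sketched as a simple rotation. This is an illustration under assumptions, not the Quick Start's configuration: the server and database names are hypothetical, and it treats the three HA copies as one active copy plus two passive copies.

```python
def layout_copies(databases, servers):
    """Give each database one active, two passive HA copies, and one lagged copy.

    The starting server rotates per database so that active copies are
    spread across the DAG members.
    """
    if len(servers) < 4:
        raise ValueError("four copies per database require at least four servers")
    layout = {}
    for i, db in enumerate(databases):
        # Take four consecutive servers, starting at a rotating position.
        chosen = [servers[(i + k) % len(servers)] for k in range(4)]
        layout[db] = {
            "active": chosen[0],
            "passive": chosen[1:3],
            "lagged": chosen[3],
        }
    return layout


# Hypothetical DAG: two servers in each of two Availability Zones.
servers = ["EX1-AZ1", "EX2-AZ2", "EX3-AZ1", "EX4-AZ2"]
for db, copies in layout_copies(["DB1", "DB2", "DB3", "DB4"], servers).items():
    print(db, copies)
```

With exactly four servers, every server holds one copy of every database, so each Availability Zone ends up with two copies of each database.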

DAG architecture on AWS requires additional planning and implementation steps when the environment utilizes multiple DAG members in each Availability Zone. As mentioned previously, each DAG member should be assigned a secondary private IP address in Amazon EC2, which is dedicated to the DAG and Windows Failover Clustering. Because Amazon EC2 instances cannot share these secondary IP addresses, place each DAG member in its own Amazon VPC subnet, even when several members reside in the same Availability Zone. This allows you to statically define a DAG IP address for each member in the DAG.

With each DAG member in its own Amazon VPC subnet, you can use the secondary private IP address assigned to each instance when defining the DAG. This helps ensure that each DAG member can successfully bring the defined IP address online when required.

Witness Server Placement

In order to provide seamless automatic failover between Availability Zones, we recommend that you place your witness server(s) in a third Availability Zone. This helps ensure that cluster quorum, and therefore automatic failover, can be maintained and achieved in the event of a complete Availability Zone outage, regardless of which Availability Zone becomes unavailable.


Figure 17: Placing the Witness Server in a Third Availability Zone

Witness server placement in a third Availability Zone is a design aspect that you must implement manually after launching this Quick Start. By default, this Quick Start will launch in Oregon, which includes three Availability Zones at the time of this writing. You can launch the Quick Start using the default settings, and proceed to implement a witness server in a third Availability Zone if desired.
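The reasoning behind the third Availability Zone can be checked with a simplified vote count. The sketch below is a static majority model only: it assumes one vote per DAG member and one for the witness, and it ignores dynamic quorum adjustments in Windows Server Failover Clustering. The zone names are hypothetical.

```python
def has_quorum(votes_by_az: dict, failed_az: str) -> bool:
    """Return True if the surviving votes form a strict majority of all votes."""
    total = sum(votes_by_az.values())
    surviving = total - votes_by_az.get(failed_az, 0)
    return surviving > total // 2


# Two DAG members (one per zone) with the witness placed alongside a member in az1.
witness_in_az1 = {"az1": 2, "az2": 1}
# The same DAG with the witness in a third zone.
witness_in_az3 = {"az1": 1, "az2": 1, "az3": 1}

for name, votes in (("witness in az1", witness_in_az1), ("witness in az3", witness_in_az3)):
    for az in votes:
        print(f"{name}: {az} fails -> quorum kept: {has_quorum(votes, az)}")
```

In this model, losing az1 when it also hosts the witness leaves one of three votes, so quorum is lost. With the witness in a third zone, any single-zone outage leaves two of three votes.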

DAG Network Design

Exchange Server 2010 architectures commonly included multiple network interfaces per mailbox server configured in a DAG. This provided individual network interfaces for client (MAPI) facing networks, along with an isolated and dedicated replication network for database replication.

Because fast network throughput is becoming commonplace in modern Microsoft Exchange Server designs, and because network interfaces are often only logically (not physically) separated, Microsoft is moving away from the previous guidance of separating client traffic from replication traffic. Using a single network interface simplifies the management and initial configuration of your Exchange Server deployment on AWS. We recommend that you utilize instances with high network performance or 10 gigabit connectivity in this model.

DAGs are a large topic, and there are many design considerations and operational procedures that you should be familiar with for larger, more complex deployments. We recommend that you consult the Database availability groups page in the Microsoft TechNet Library for a deeper dive into the subject.

Load Balancing Client Access

One of the biggest changes in Exchange Server 2013 was the re-architecture of the Client Access role. Exchange Server 2013 no longer requires session affinity (i.e., sticky sessions) at the load balancing layer. In short, the Client Access role can now proxy client connections to the mailbox server that hosts the active copy of the user's mailbox database. This means you can reliably use layer 4 load balancing, or DNS load balancing, for client connections to your Exchange Server infrastructure.

Amazon Elastic Load Balancing

Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances in the cloud. It enables you to achieve greater levels of fault tolerance in your applications, seamlessly providing the required amount of load balancing capacity needed to distribute application traffic. The acceptable port listeners for both HTTPS/SSL and HTTP/TCP connections are 25, 80, 443, 465, 587, and 1024-65535. If you need to load balance other mail protocols such as POP or IMAP, see the remaining solutions in this section.
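The listener port list above is easy to check against the protocols you plan to publish. The following is a small sketch: the function name is hypothetical, and the protocol-to-port mapping uses the standard well-known ports rather than anything specific to this Quick Start.

```python
# Listener ports stated above: 25, 80, 443, 465, 587, and 1024-65535.
ALLOWED_FIXED_PORTS = {25, 80, 443, 465, 587}


def is_supported_listener_port(port: int) -> bool:
    """Return True if the port falls within the listener ports listed above."""
    return port in ALLOWED_FIXED_PORTS or 1024 <= port <= 65535


# Standard well-known ports for common mail protocols.
protocols = {
    "SMTP": 25,
    "HTTPS": 443,
    "SMTP submission": 587,
    "POP3": 110,
    "IMAP4": 143,
    "IMAP4 over SSL": 993,
    "POP3 over SSL": 995,
}

for name, port in protocols.items():
    status = "supported" if is_supported_listener_port(port) else "not supported"
    print(f"{name} ({port}): {status}")
```

The POP and IMAP ports fall outside the listed ranges, which is why those protocols need one of the other options in this section.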

DNS Failover with Amazon Route 53

Amazon Route 53 lets you configure DNS failover in active-active, active-passive, and mixed configurations to improve the availability of your application. When you have more than one resource performing the same function, you can configure Amazon Route 53 to check the health of your resources and respond to DNS queries using only the healthy resources.

The following configurations are commonly used to provide DNS load balancing and/or failover:

  • Active-active failover – Use this configuration when you want all your resources to be available most of the time. When a resource becomes unavailable, Amazon Route 53 can detect that it's unhealthy and stop including it when responding to queries.

  • Active-passive failover – Use this configuration when you want a primary group of resources to be available most of the time, and you want a secondary group of resources to be on standby in case all the primary resources become unavailable. When responding to queries, Amazon Route 53 includes only the healthy primary resources. If all the primary resources are unhealthy, Amazon Route 53 begins to include only the healthy secondary resources in response to DNS queries.
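The two configurations differ only in which healthy resources are returned. The sketch below is a simplified model of the behavior described above, not Amazon Route 53's actual implementation. The IP addresses come from documentation ranges and are hypothetical.

```python
def dns_answer(primary, secondary, healthy):
    """Return the addresses a DNS response would include.

    Healthy primary resources are preferred. Healthy secondary resources
    are returned only when no primary resource is healthy.
    """
    healthy_primary = [ip for ip in primary if ip in healthy]
    if healthy_primary:
        return healthy_primary
    return [ip for ip in secondary if ip in healthy]


az1, az2 = "203.0.113.10", "198.51.100.10"

# Active-active: every resource is primary, and unhealthy ones are dropped.
print(dns_answer([az1, az2], [], healthy={az1, az2}))
print(dns_answer([az1, az2], [], healthy={az2}))

# Active-passive: the secondary is returned only when the primary is unhealthy.
print(dns_answer([az1], [az2], healthy={az1, az2}))
print(dns_answer([az1], [az2], healthy={az2}))
```

Active-active is modeled here as an empty secondary group, so the same function covers both configurations.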

Amazon Route 53 is an affordable and easy way to provide DNS failover and load balancing over the Internet for all external Exchange Server protocols. Keep in mind that you'll want to use a low Time to Live (TTL) value for your DNS record set. In the unlikely event of an Availability Zone outage, clients will be temporarily disconnected until they start resolving the healthy IP addresses. This also requires that you implement high availability for your Exchange databases using a DAG.
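As an illustration of the low-TTL recommendation, the sketch below builds a change batch for two weighted A records with health checks, which is one common way to express an active-active record set. It only constructs the data structure, in the shape accepted by the Route 53 ChangeResourceRecordSets API, and makes no API call. The record name, IP addresses, health check IDs, and the 60-second TTL are all placeholder assumptions.

```python
def weighted_record_change(name, set_id, ip, health_check_id, ttl=60, weight=1):
    """Build one UPSERT change for a weighted A record tied to a health check."""
    return {
        "Action": "UPSERT",
        "ResourceRecordSet": {
            "Name": name,
            "Type": "A",
            "SetIdentifier": set_id,         # distinguishes records that share a name
            "Weight": weight,                # equal weights split traffic evenly
            "TTL": ttl,                      # low TTL so clients re-resolve quickly
            "ResourceRecords": [{"Value": ip}],
            "HealthCheckId": health_check_id,
        },
    }


# Placeholder values; substitute your own namespace, addresses, and health checks.
change_batch = {
    "Comment": "Active-active record set for the unbound namespace",
    "Changes": [
        weighted_record_change("mail.example.com.", "az1", "203.0.113.10", "hc-id-az1"),
        weighted_record_change("mail.example.com.", "az2", "198.51.100.10", "hc-id-az2"),
    ],
}

print(change_batch)
```

A structure like this could be passed as the `ChangeBatch` argument when calling the API through an SDK, together with the ID of your hosted zone.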

Other Load Balancing Options

There are a number of third-party load balancing solutions in the AWS Marketplace.

For details and general guidance, we recommend reading Load Balancing in Exchange 2013 on the Microsoft Exchange team blog.