Availability Zones

Some Azure regions are further divided into Availability Zones. They are not available in every region, and not every resource type supports them.

The purpose of an Availability Zone is to provide redundancy within a region, that is, to protect against a data center failure. A failure in a single data center, related to something such as power, cooling, or connectivity, then does not impact resources running in the other zones.

Note that Availability Zones protect resources only from a data center failure, not from a region failure. Protecting a virtual machine from a failure inside a data center, such as a rack failure, is the scope of availability sets, which we looked at in the previous section.

The following diagram outlines the Azure Availability Zones topology:

Figure 3.11 – Azure availability zones

Each zone has its own power, cooling, and networking, independent of the other zones. As a result, a service outage in one zone does not affect the availability of resources running in the other zones.
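As a rough illustration of this isolation, the following Python sketch models a zone outage. The resource names, the zone labels "1" to "3", and the surviving_resources helper are hypothetical and exist only for this example; they are not part of any Azure SDK.

```python
def surviving_resources(placement, failed_zone):
    """Return the names of resources that are not in the failed zone."""
    return sorted(name for name, zone in placement.items() if zone != failed_zone)


# Hypothetical placement: three web servers, one per zone.
placement = {"web-1": "1", "web-2": "2", "web-3": "3"}

# An outage in zone 2 takes down only web-2.
print(surviving_resources(placement, "2"))  # ['web-1', 'web-3']
```

If all three servers had been placed in zone 2 instead, the same outage would have left nothing running, which is why spreading resources across zones matters.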

Microsoft can provide a higher-level SLA when you deploy resources into Availability Zones, for resource types that support them and in regions where they are available. For virtual machines, for example, Microsoft guarantees 99.99% uptime if a minimum of two virtual machines is deployed across two or more zones. Keep in mind that virtual machines are a zonal resource, so you place each instance in a zone yourself; the automatic replication across zones that Microsoft takes care of applies to zone-redundant services.
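To put the 99.99% figure into perspective, the following sketch converts an SLA percentage into the downtime it permits and checks the placement condition described above. The function names and the 30-day month are assumptions made for this example; the actual SLA terms are defined by Microsoft.

```python
def allowed_downtime_minutes(sla_percent, period_days=30):
    """Maximum downtime, in minutes, that still meets the SLA over the period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


def meets_zone_condition(vm_zones):
    """True when at least two VMs are spread over at least two distinct zones.

    vm_zones holds one zone label per deployed virtual machine.
    """
    return len(vm_zones) >= 2 and len(set(vm_zones)) >= 2


# 30 days = 43,200 minutes; 0.01% of that is 4.32 minutes.
print(round(allowed_downtime_minutes(99.99), 2))  # 4.32

print(meets_zone_condition(["1", "2"]))  # True: two VMs in two zones
print(meets_zone_condition(["1", "1"]))  # False: both VMs share one zone
```

Under these assumptions, 99.99% uptime leaves a little over four minutes of permitted downtime in a 30-day month.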

Without Availability Zones, if the data center where your resource is running fails, your resource will be offline until that data center is brought back online and services are restored.

With Availability Zones, resources, connectivity, and traffic remain within the primary region but are hosted in a different set of data center buildings that is physically isolated and some distance away.

To protect against an entire region becoming unavailable, resources can be replicated to a secondary region. However, you may not want, or may not be able, to have your services fail over to or run from another region, for reasons such as compliance, latency, or connectivity. For example, you may rely on VPN or ExpressRoute connections that are not configured to be available in the secondary region. Many other dependencies may also mean that using another region as a redundancy option is not viable.

Azure services fall into one of three categories, as outlined here:

  • Zonal services: You select the Availability Zone for the resource; a resource can be pinned to a specific zone to meet your needs, for example, for performance or latency.
  • Zone-redundant services: Resources are replicated across zones automatically, and you cannot define how they are distributed across the zones.
  • Non-regional services: Services are available in all geographies and are not affected by zone-wide or region-wide outages.
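
For a zonal service, you decide which zone each resource goes into. One simple way to spread instances evenly is round-robin assignment, sketched below. The spread_across_zones helper, the virtual machine names, and the zone labels are hypothetical; a real deployment would pass the chosen zone to whichever deployment tool you use.

```python
from itertools import cycle


def spread_across_zones(names, zones=("1", "2", "3")):
    """Assign each resource name to a zone in round-robin order."""
    if not zones:
        raise ValueError("at least one zone is required")
    # cycle() repeats the zone labels indefinitely; zip() stops once
    # every name has been paired with a zone.
    return dict(zip(names, cycle(zones)))


vms = ["vm-1", "vm-2", "vm-3", "vm-4"]
print(spread_across_zones(vms))
# {'vm-1': '1', 'vm-2': '2', 'vm-3': '3', 'vm-4': '1'}
```

With zone-redundant services, no such step is needed, because the distribution across zones is handled for you.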

This section looked at Availability Zones. The following section looks at proximity placement groups.