Availability components

When creating resources in Azure, we should think back to the cloud computing service models: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), serverless, and Software as a Service (SaaS). The same principles and approach to dividing responsibility apply to the availability of resources. This means we should consider how we will keep data, apps, compute, and other services available in the event of a failure at the app or virtual machine level, the hardware level, the data center level, or the region level.

Depending on the service, Microsoft is responsible for providing a range of availability options, but you are responsible for choosing and implementing them so that your availability strategy meets your Service Level Agreement (SLA) requirements.

Microsoft backs the SLAs for its services with financial compensation when the defined SLA is not met. As a reference, the Microsoft SLAs for all services can be found at https://azure.microsoft.com/en-gb/support/legal/sla/.
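To make an SLA percentage more tangible, it can help to convert it into the downtime it permits. The following is a minimal sketch, not an official Microsoft calculation: the function name allowed_downtime_minutes, the 30-day period, and the sample percentages are all assumptions chosen for illustration.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Return the maximum downtime, in minutes, that still meets the SLA.

    sla_percent is an uptime percentage such as 99.9; days is the length
    of the measurement period (30 is an assumed, illustrative value).
    """
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


# Illustrative percentages only; check the SLA page for real figures.
for sla in (99.9, 99.95, 99.99):
    minutes = allowed_downtime_minutes(sla)
    print(f"{sla}% uptime allows about {minutes:.1f} minutes of downtime per 30 days")
```

Each additional "nine" cuts the permitted downtime by roughly a factor of ten, which is why the required SLA should drive the choice of availability components.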

The following diagram summarizes these components for virtual machines and storage resources. It shows how you can increase the SLA and durability by using these components:

Figure 3.6 – Azure availability components

A solution should be based on the resources to be protected, the nature of the failures to protect them against, and the SLA requirements.

The following are the availability components that Azure can provide to build a solution. These components can be considered as building blocks to use as required:

  • Within a data center component:

Example failure that could occur: Hardware rack failure (power, network).

Availability component: Availability set in the case of a virtual machine or Locally Redundant Storage (LRS) in the case of storage.

  • Within a region component:

Example failure that could occur: Data center failure (power, network, cooling).

Availability component: Availability Zone in the case of a virtual machine or Zone Redundant Storage (ZRS) in the case of storage; synchronous replication is used.

  • Across regions component:

Example failure that could occur: Entire region failure, where multiple data centers in that region suffer failures such as power, network, or cooling.

Availability component: Azure Site Recovery (ASR) in the case of a virtual machine or Geo-Redundant Storage (GRS) in the case of storage; asynchronous replication is used.
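The building blocks above can be summarized as a simple lookup from failure scope to availability component. This is only a sketch for reasoning about a design, not an Azure API: the scope keys, the AvailabilityOption class, and the options_for function are hypothetical names, and the values restate the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AvailabilityOption:
    example_failure: str
    virtual_machine: str
    storage: str
    replication: Optional[str]  # None where the list above does not state it


# Hypothetical scope keys; the values restate the list above.
AVAILABILITY_OPTIONS = {
    "within_data_center": AvailabilityOption(
        example_failure="Hardware rack failure (power, network)",
        virtual_machine="Availability set",
        storage="Locally Redundant Storage (LRS)",
        replication=None,
    ),
    "within_region": AvailabilityOption(
        example_failure="Data center failure (power, network, cooling)",
        virtual_machine="Availability Zone",
        storage="Zone Redundant Storage (ZRS)",
        replication="synchronous",
    ),
    "across_regions": AvailabilityOption(
        example_failure="Entire region failure",
        virtual_machine="Azure Site Recovery (ASR)",
        storage="Geo-Redundant Storage (GRS)",
        replication="asynchronous",
    ),
}


def options_for(scope: str) -> AvailabilityOption:
    """Return the availability components that address the given failure scope."""
    try:
        return AVAILABILITY_OPTIONS[scope]
    except KeyError:
        valid = ", ".join(AVAILABILITY_OPTIONS)
        raise ValueError(f"Unknown scope {scope!r}; expected one of: {valid}") from None


choice = options_for("within_region")
print(choice.virtual_machine, "/", choice.storage, "/", choice.replication)
```

Keeping the mapping in one place makes it easy to check that each failure scope you care about has a matching component for both compute and storage.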

In this section, we introduced the availability components of the native Azure platform. In the following section, we look at adopting a risk model.

Adopting a risk model

Much like in the security area, where we take a defense-in-depth approach, we should consider taking a similar layered approach to building availability into a solution. We should understand the possible failures at each level we are concerned with, determine their impact and probability, and then implement the most appropriate solution. This should be driven by your required SLA, that is, the uptime required for a resource. The SLA of a resource can be increased by combining different availability components. We look at SLAs in more detail in Chapter 12, Azure Service-Level Agreements.
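The effect of combining components can be illustrated with basic availability arithmetic. The sketch below assumes that failures are independent and uses made-up availability figures; it is not how Microsoft defines its published SLAs, which are stated per service and configuration. The function names serial_availability and redundant_availability are hypothetical.

```python
from math import prod


def serial_availability(*availabilities: float) -> float:
    """Availability of a chain in which every component must be up.

    Assumes independent failures, so the individual values are multiplied.
    """
    return prod(availabilities)


def redundant_availability(*availabilities: float) -> float:
    """Availability when at least one redundant component must be up.

    Assumes independent failures: the system is down only if all
    components are down at the same time.
    """
    return 1 - prod(1 - a for a in availabilities)


# Illustrative figures only, expressed as fractions (0.999 means 99.9%).
single_vm = 0.999
two_vms = redundant_availability(single_vm, single_vm)
web_plus_database = serial_availability(0.9995, 0.9999)

print(f"Single instance:         {single_vm:.6f}")
print(f"Two redundant instances: {two_vms:.6f}")
print(f"Web tier + database:     {web_plus_database:.6f}")
```

Redundancy raises availability above that of any single instance, whereas chaining dependent components lowers it below the weakest one. Both effects need to be weighed against the required SLA.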

The following diagram outlines the risk model approach:

Figure 3.7 – Adopting a risk model
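One common way to put the impact and probability assessment into practice is to score each failure scenario and address the highest scores first. The following is a minimal sketch under stated assumptions: the 1 to 5 scales, the probability multiplied by impact score, the FailureRisk class, and the sample ratings are all hypothetical and are not taken from the figure.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FailureRisk:
    scope: str
    probability: int  # assumed scale: 1 (rare) to 5 (frequent)
    impact: int       # assumed scale: 1 (minor) to 5 (severe)

    @property
    def score(self) -> int:
        # Assumed scoring convention: probability multiplied by impact.
        return self.probability * self.impact


def rank_risks(risks: Iterable[FailureRisk]) -> List[FailureRisk]:
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


# Hypothetical ratings for one workload; yours will differ.
risks = [
    FailureRisk("Hardware rack failure", probability=3, impact=2),
    FailureRisk("Data center failure", probability=2, impact=4),
    FailureRisk("Entire region failure", probability=1, impact=5),
]

for risk in rank_risks(risks):
    print(f"{risk.scope}: score {risk.score}")
```

The ranking is only a starting point for discussion; the required SLA still determines which availability components are justified for each scenario.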

In this section, we looked at adopting a risk model. The following section looks at availability sets.