- Introducing Storage Policy-Based Management in a VSAN Environment
- VASA Vendor Provider
- VSAN Storage Providers: Highly Available
- VM Storage Policies
VSAN Storage Providers: Highly Available
You might ask why every ESXi host registers this storage provider. The reason for this is high availability. Should one ESXi host fail, another ESXi host in the cluster can take over the presentation of these VSAN capabilities. If you examine the storage providers shown in Figure 4.6, you will see that only one of the VSAN providers is online. The other storage providers from the other two ESXi hosts in this three-node cluster are in a standby state. Should the storage provider that is currently active go offline or fail for whatever reason (most likely because of a host failure), one of the standby providers will be promoted to active.
There is very little work that a vSphere administrator needs to do with storage providers to create a VSAN cluster. This is simply for your own reference. However, if you do run into a situation where the VSAN capabilities are not surfacing up in the VM storage policies section, it is worth visiting this part of the configuration and verifying that at least one of the storage providers is active. If you have no active storage providers, you will not discover any VSAN capabilities when trying to build a VM storage policy. At this point, as a troubleshooting step, you could consider doing a refresh of the storage providers by clicking on the refresh icon (orange circular arrows) in the storage provider screen.
What should be noted is that the VASA storage providers do not play any role in the data path for VSAN. If storage providers fail, this has no impact on VMs running on the VSAN datastore. The impact of not having a storage provider is lack of visibility into the underlying capabilities, so you will not be able to create new storage policies. However, already running VMs and policies are unaffected.
Changing VM Storage Policy On-the-Fly
Being able to change a VM storage policy on-the-fly is quite a unique aspect of VSAN. We will use an example to explain the concept of how you can change a VM storage policy on-the-fly and how it changes the layout of a VM without impacting the application or the guest operating system running in the VM.
Consider the following scenario, briefly mentioned earlier in the context of stripe width. A vSphere administrator has deployed a VM on a hybrid VSAN configuration with the default VM storage policy, which is that the VM storage objects should have no disk striping and should tolerate one failure. The layout of the VM disk file would look something like Figure 4.7.
Figure 4.7 VSAN policy with the capability number of failures to tolerate = 1
The VM and its associated applications initially appeared to perform satisfactorily with a 100% cache hit rate; however, over time, an increasing number of VMs were added to the VSAN cluster. The vSphere administrator starts to notice that the VM deployed on VSAN is getting a 90% read cache hit rate. This implies that 10% of reads need to be serviced from magnetic disk/capacity tier. At peak time, this VM is doing 2,000 read operations per second. Therefore, there are 200 reads that need to be serviced from magnetic disk (the 10% of reads that are cache misses). The specifications on the magnetic disks imply that each disk can do 150 IOPS, meaning that a single disk cannot service these additional 200 IOPS. To meet the I/O requirements of the VM, the vSphere administrator correctly decides to create a RAID-0 stripe across two disks.
On VSAN, the vSphere administrator has two options to address this.
The first option is to simply modify the VM storage policy currently associated with the VM and add a stripe width requirement to the policy; however, this would change the storage layout of all the other VMs using this same policy.
Another approach is to create a brand-new policy that is identical to the previous policy but has an additional capability for stripe width. This new policy can then be attached to the VM (and VMDKs) suffering from cache misses. Once the new policy is associated with the VM, the administrator can synchronize the new/updated policy with the VM. This can be done immediately, or can be deferred to a maintenance window if necessary. If it is deferred, the VM is shown as noncompliant with its new policy. When the policy change is implemented, VSAN takes care of changing the underlying VM storage layout required to meet the new policy, while the VM is still running without the loss of any failure protection. It does this by mirroring the new storage objects with the additional components (in this case additional RAID-0 stripe width) to the original storage objects.
As seen, the workflow to change the VM storage policy can be done in two ways; either the original current VM storage policy can be edited to include the new capability of a stripe width = 2 or a new VM storage policy can be created that contains the failures to tolerate = 1 and stripe width = 2. The latter is probably more desirable because you may have other VMs using the original policy, and editing that policy will affect all VMs using it. When the new policy is created, this can be associated with the VM and the storage objects in a number of places in the vSphere Web Client. In fact, policies can be changed at the granularity of individual VM storage objects (e.g., VMDK) if necessary.
After making the change, the new components reflecting the new configuration (e.g., a RAID-0 stripe) will enter a state of reconfiguring. This will temporarily build out additional replicas or components, in addition to keeping the original replicas/components, so additional space will be needed on the VSAN datastore to accommodate this on-the-fly change. When the new replicas or components are ready and the configuration is completed, the original replicas/components are discarded.
Note that not all policy changes require the creation of new replicas or components. For example, adding an IOPS limit, or reducing the number of failures to tolerate, or reducing space reservation does not require this. However, in many cases, policy changes will trigger the creation of new replicas or components.
Your VM storage objects may now reflect the changes in the Web Client, for example, a RAID-0 stripe as well as a RAID-1 replica configuration, as shown in Figure 4.8.
Figure 4.8 VSAN RAID-0 and RAID-1 configuration
Compare this to the tasks you may have to perform on many traditional storage arrays to achieve this. It would involve, at the very least, the following:
The migration of VMs from the original datastore.
The decommissioning of the said LUN/volume.
The creation of a new LUN with the new storage requirements (different RAID level).
Possibly the reformatting of the LUN with VMFS in the case of block storage.
Finally, you have to migrate your VMs back to the new datastore.
In the case of VSAN, after the new storage replicas or components have been created and synchronized, the older storage replicas and/or components will be automatically removed. Note that VSAN is capable of striping across disks, disk groups, and hosts when required, as depicted in Figure 4.8, where stripes S1a and S1b are located on the same host but stripes S2a and S2b are located on different hosts. It should also be noted that VSAN can create the new replicas or components without the need to move any data between hosts; in many cases the new components can be instantiated on the same storage on the same host.
We have not shown that there are, of course, additional witness components that could be created with such a change to the configuration. For a VM to continue to access all its components, a full replica copy of the data must be available and more than 50% of the components (votes) of that object must also be available in the cluster. Therefore, changes to the VM storage policy could result in additional witness components being created, or indeed, in the case of introducing a policy with less requirements, there could be fewer witnesses.
You can actually see the configuration changes taking place in the vSphere UI during this process. Select the VM that is being changed, click its manage tab, and then choose the VM storage policies view, as shown in Figure 4.9. Although this view does not show all the VM storage objects, it does display the VM home namespace, and the VMDKs are visible.
Figure 4.9 VM Storage Policy view in the vSphere client showing component reconfiguration
In VSAN 6.0, there is also a way to examine all resyncing components. Select the VSAN cluster object in the vCenter inventory, then select monitor, Virtual SAN, and finally “resyncing components” in the menu. This will display all components that are currently resyncing/rebuilding. Figure 4.10 shows the resyncing dashboard view, albeit without any resyncing activity taking place.
Figure 4.10 Resyncing activity as seen from the vSphere Web Client
Objects, Components, and Witnesses
A number of new concepts have been introduced in this chapter so far, including some new terminology. Chapter 5, “Architectural Details,” covers in greater detail objects, components, and indeed witness disks, as well as which VM storage objects are impacted by a particular capability in the VM storage policy. For the moment, it is enough to understand that on VSAN, a VM is no longer represented by a set of files but rather a set of storage objects. There are five types of storage objects:
VM home namespace
Snapshot delta disks
Although the vSphere Web Client displays only the VM home namespace and the VMDKs (hard disks) in the VM > monitor > policies > physical disk placement, snapshot deltas and VM swap can be viewed in the cluster > monitor > Virtual SAN > virtual disks view. We will also show ways of looking at detailed views of all the storage objects, namely delta and VM swap, in Chapter 10, “Troubleshooting, Monitoring, and Performance,” when we look at various monitoring tools available to VSAN.