Now that the core SRM product is installed, it's possible to progress through the post-configuration stages. Each stage depends on the previous one being completed correctly, so you must be careful about making changes once the components have been interlinked. Essentially, the post-configuration stages constitute a "workflow." The first step is to pair the two sites together, which creates a relationship between the Protected Site (NYC) and the Recovery Site (NJ). Then we can create inventory mappings that enable the administrator to build relationships between the folders, resource pools or clusters, and networks at the Protected Site and those at the Recovery Site. These inventory mappings ensure that VMs are recovered to the correct location in the vCenter environment. At that point, it is possible to configure the array managers. At this stage you make the sites aware of the identities of your storage systems at both locations; SRM will interrogate the arrays and discover which datastores have been marked for replication. The last two main stages are to create Protection Groups and to create Recovery Plans. You cannot create Recovery Plans without first creating Protection Groups because, as their name implies, they point to the datastores that you have configured for replication. The Protection Groups use the inventory mappings to determine the location of what VMware calls "placeholder VMs." These placeholder VMs are used in Recovery Plans to indicate when and where the VMs should be recovered, and they allow for advanced features such as VM Dependencies and scripting callouts. I will go through each step in detail, walking you through the configuration so that by the end of the chapter you should understand what each stage entails and why it must be completed.
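The ordering of these stages can be pictured as a simple checklist. The following is only an illustrative sketch in Python, not anything SRM itself exposes: the stage names are labels taken from the paragraph above, and the strictly linear ordering is an assumption that mirrors the sequence this chapter follows.

```python
# Illustrative only: stage labels are taken from the text, and the strictly
# linear ordering is an assumption that mirrors the order of this chapter.
STAGES = [
    "pair sites",
    "inventory mappings",
    "array managers",
    "protection groups",
    "recovery plans",
]


def prerequisites(stage):
    """Return the stages that should be complete before `stage` is started."""
    return STAGES[:STAGES.index(stage)]


def next_stage(completed):
    """Return the first stage not yet completed, or None when all are done."""
    for stage in STAGES:
        if stage not in completed:
            return stage
    return None


print(prerequisites("protection groups"))
print(next_stage({"pair sites"}))
```

Treating the workflow as an ordered list makes the dependency explicit: a change to an early stage, such as the site pairing, potentially affects everything that follows it.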
Connecting the Protected and Recovery Site SRMs
One of the main tasks carried out in the first configuration of SRM is to connect the Protected Site SRM to the Recovery Site SRM. It's at this point that you configure a relationship between the two, and really this is the first time you indicate which is the Protected Site and which is the Recovery Site. It's a convention that you start this pairing process at the Protected Site. The reality is that the pairing creates a two-way relationship between the locations anyway, and it really doesn't matter from which site you do this. But for my own sanity, I've always started the process from the protected location.
When doing this first configuration, I prefer to have two vSphere client windows open: one on the protected vCenter and the other on the recovery vCenter. This way, I get to monitor both parts of the pairing process. I did this often in my early use of SRM so that I could see in real time the effect of changes in the Protected Site on the Recovery Site. Of course, you can simplify things greatly by using the linked mode feature in vSphere. Because SRM's new views show both the Recovery and Protected Sites at the same time, the benefits of linked mode are somewhat limited here; however, I think linked mode can be useful for your general administration. For the moment, I'm keeping the two vCenters separate so that it's 100% clear that one is the Protected Site and the other is the Recovery Site (see Figure 9.1).
Figure 9.1 The Protected Site (New York) is on the left; the Recovery Site (New Jersey) is on the right.
As you might suspect, this pairing process means the Protected Site SRM and Recovery Site SRM will need to communicate with each other to share information. It is possible to have the same IP range used at two different geographical locations. This networking concept is called "stretched VLANs." Stretched VLANs can greatly simplify the pairing process, as well as the networking of virtual machines when you run tests or invoke your Recovery Plans. If you have never heard of stretched VLANs, it's well worth brushing up on them and considering their usage to facilitate DR/BC. The stretched VLAN configuration, as we will see later, can ease the administrative burden when running test plans or invoking DR for real. Other methods of simplifying communications, especially when testing and running Recovery Plans, include the use of network address translation (NAT) systems or modifying the routing configuration between the two locations. This can remove the need to re-IP the virtual machines as they boot in the DR location. We will look at this in more detail in subsequent chapters.
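To make the re-IP question concrete, here is a minimal Python sketch that checks whether a VM's current address would still be valid on the network available at the Recovery Site. The addresses and network ranges are hypothetical, and the function is my own illustration rather than anything SRM provides.

```python
import ipaddress


def needs_reip(vm_ip, recovery_network):
    """Return True if vm_ip does not fall inside the Recovery Site network."""
    return ipaddress.ip_address(vm_ip) not in ipaddress.ip_network(recovery_network)


# Hypothetical addressing, for illustration only.
# Stretched VLAN: the same range exists at both sites.
print(needs_reip("192.168.3.50", "192.168.3.0/24"))
# No stretched VLAN: the Recovery Site uses a different range.
print(needs_reip("192.168.3.50", "192.168.4.0/24"))
```

With a stretched VLAN the VM keeps its address at the DR location; without one, each recovered VM needs a new address, or NAT or routing changes must compensate.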
This pairing process is sometimes referred to as "establishing reciprocity." In the first release of SRM the pairing process was one-to-one, and it was not possible to create hub-and-spoke configurations where one site is paired to many sites. The structure of SRM 1.0 prevented many-to-many SRM pairing relationships. Back in SRM 4.0, VMware introduced support for a shared-site configuration where one DR location can provide resources for many Protected Sites. However, in these early stages I want to keep with the two-site configuration.
Installing the SRM and vCenter software on the same instance of Windows can save you a Windows license. However, some people might consider this approach as increasing their dependence on the vCenter management system. In other words, there is a worry about creating an "all-eggs-in-one-basket" scenario. If you follow this rationale to its logical extreme, your management server will have many jobs to do, such as being the
- vCenter server
- Web access server
- Converter server
- Update Manager server
My main point, really, is that if the pairing process fails, it probably has more to do with IP communication, DNS name resolution, and firewalls than anything else. IP visibility from the Protected to the Recovery Site is required to set up SRM. Personally, I always recommend dedicated Windows instances for the SRM role, and in these days of Microsoft licensing allowing multiple instances of Enterprise and Datacenter Editions on the same hypervisor, the cost savings are not as great as they once were.
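Since name resolution and IP visibility are the usual culprits, it can be worth checking them before running the pairing wizard. The following Python sketch is one way to do that from the Protected Site SRM server. It uses only the standard library; the function names and the example hostname in the comment are hypothetical, and ports 80 and 443 are chosen because they are the ports mentioned in this chapter.

```python
import socket


def resolves(hostname):
    """Return the sorted addresses a name resolves to, or [] if lookup fails."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def precheck(hostname, ports=(80, 443)):
    """Summarize DNS and TCP reachability for one remote server."""
    addresses = resolves(hostname)
    return {
        "hostname": hostname,
        "addresses": addresses,
        "ports": {p: bool(addresses) and port_open(hostname, p) for p in ports},
    }


# Example call (hypothetical hostname for the Recovery Site vCenter):
# print(precheck("vcnj.corp.com"))
```

An empty address list points to DNS, while a resolved name with closed ports points to routing or a firewall between the two locations. A successful check only shows basic reachability; it does not prove that the pairing itself will succeed.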
When connecting the sites together you always log in to the Protected Site and connect it to the Recovery Site. This starting order dictates the relationship between the two SRM servers.
- Log in with the vSphere client to the vCenter server for the Protected Site SRM (New York).
In the Sites pane, click the Configure Connection button shown in Figure 9.2. Alternatively, if you still have the Getting Started tab available, click the Configure Connection link.
Figure 9.2 The status of the New York Site is "not paired" until the Configure Connection Wizard is run.
Notice how the site is marked as being "local," since we logged in to it directly as though we are physically located at the New York location. If I had logged in to the New Jersey site directly it would be earmarked as local instead.
In the Configure Connection dialog box enter the name of the vCenter for the Recovery Site, as shown in Figure 9.3.
Figure 9.3 Despite the use of port 80 in the dialog box, all communication is redirected to port 443.
When you enter the vCenter hostname, use lowercase letters, and enter it exactly as it was entered during installation (for example, either fully qualified in all cases or not fully qualified in all cases). Additionally, although you can use either a name or an IP address during the pairing process, be consistent. Don't use a mix of IP addresses and FQDNs together, as this only confuses SRM. As we saw earlier during the installation, despite entering port 80 to connect to the vCenter system, communication appears to take place on port 443.
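These naming rules are easy to get wrong by accident, so a quick sanity check before typing the name into the wizard can help. The sketch below is a hypothetical Python helper of my own, not an SRM tool; it compares the name used at installation with the name you intend to use for pairing and reports any of the inconsistencies described above.

```python
import ipaddress


def name_style(value):
    """Classify a server identifier as 'ip', 'fqdn', or 'short'."""
    try:
        ipaddress.ip_address(value)
        return "ip"
    except ValueError:
        return "fqdn" if "." in value else "short"


def pairing_name_problems(installed_as, pairing_as):
    """Return a list of problems with the name you plan to use for pairing."""
    problems = []
    if name_style(installed_as) != name_style(pairing_as):
        problems.append(
            f"style mismatch: installed as {name_style(installed_as)}, "
            f"pairing as {name_style(pairing_as)}"
        )
    if pairing_as != pairing_as.lower():
        problems.append("pairing name contains uppercase letters")
    if installed_as != pairing_as:
        problems.append("pairing name is not identical to the installation name")
    return problems


# Hypothetical names, for illustration only.
print(pairing_name_problems("vcnj.corp.com", "vcnj.corp.com"))
print(pairing_name_problems("vcnj.corp.com", "VCNJ"))
```

An empty list means the two entries agree in style, case, and spelling; anything else is worth correcting before you continue.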
Again, if you are using the untrusted auto-generated certificates that come with a default installation of vCenter you will receive a certificate security warning dialog box, as shown in Figure 9.4. The statement "Remote server certificate has error(s)" is largely an indication that the certificate is auto-generated and untrusted. It doesn't indicate fault in the certificate itself, but rather is more a reflection of its status.
Figure 9.4 Dialog box indicating there is an error with the remote server certificate
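If you would rather verify an untrusted certificate than simply accept the warning, one approach is to compare its thumbprint against a value obtained out of band, for example from the server's console. The following Python sketch computes a SHA-1 thumbprint using the standard library. It assumes the value you are comparing against is a SHA-1 thumbprint, which may not be the case in your environment, and the function names and hostname are hypothetical.

```python
import hashlib
import ssl


def thumbprint(der_bytes):
    """Format the SHA-1 digest of a DER-encoded certificate as AA:BB:CC..."""
    digest = hashlib.sha1(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def server_thumbprint(host, port=443):
    """Fetch a server's certificate without validating it; return its thumbprint."""
    pem = ssl.get_server_certificate((host, port))
    return thumbprint(ssl.PEM_cert_to_DER_cert(pem))


# Example call (hypothetical hostname for the Recovery Site vCenter):
# print(server_thumbprint("vcnj.corp.com"))
```

Because the certificate is fetched without validation, the result is only meaningful when compared with a thumbprint you already trust.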
Specify the username and password for the vCenter server at the Recovery Site.
Again, if you are using the untrusted auto-generated certificates that come with a default installation of SRM you will receive a certificate security warning dialog box. This second certificate warning is to validate the SRM certificate, and is very similar to the previous dialog box for validating the vCenter certificate of the Recovery Site. So, although these two dialog boxes look similar, they are issuing warnings regarding completely different servers: the vCenter server and the SRM server of the Recovery Site. Authentication between sites can be difficult if the Protected and Recovery Sites are different domains and there is no trust relationship between them. In my case, I opted for a single domain that spanned both the Protected and Recovery Sites.
At this point the SRM wizard will attempt to pair the sites, and the Complete Connections dialog box will show you the progress of this task, as shown in Figure 9.5. The progress also appears in the Recent Tasks pane of the Protected Site vCenter.
Figure 9.5 Pairing the sites (a.k.a. establishing reciprocity)
- At the end of the process you will be prompted to authenticate the vSphere client against the remote (Recovery) site. If you have two vSphere clients open at the same time on both the Protected and Recovery Sites, you will receive two login dialog box prompts, one for each SRM server. Notice how in the dialog box shown in Figure 9.6 I'm using the full NT domain-style login of DOMAIN\Username. This dialog box appears each time you load the vSphere client and select the SRM icon.
Figure 9.6 Entering login credentials for the Recovery Site vCenter
At the end of this first stage you should check that the two sites are flagged as being connected for both the local site and the paired site, as shown in Figure 9.7.
Figure 9.7 The sites are connected and paired together; notice how communication to the vCenter in the Recovery Site used port 443.
Additionally, under the Commands pane on the right-hand side you will see that the Break Connection link is the reverse of the pairing process. It's hard to think of a use case for this option. But I guess you may at a later stage unpair two sites and create a different relationship. In an extreme case, if you had a real disaster the original Protected Site might be irretrievably lost. In this case, you would have no option but to seek a different site to maintain your DR planning. Also in the Commands pane you will find the option to export your system logs. These can be invaluable when it comes to troubleshooting, and you'll need them should you raise an SR with VMware Support. As you can see, SRM has a new interface, and even with vCenter linked mode available this new UI should reduce the amount of time you spend toggling between the Protected and Recovery Sites. Indeed, for the most part I only keep my vCenters separated in this early stage when I am carrying out customer demonstrations; it helps to keep the customer clear on the two different locations.
From this point onward, whenever you load the vSphere client for the first time and click the Site Recovery Manager icon you will be prompted for a username and password for the remote vCenter. The same dialog box appears on the Recovery Site SRM. Although the vSphere client has the ability to pass through your user credentials from your domain logon, this currently is not supported for SRM, mainly because you could be using totally different credentials at the Recovery Site anyway. For most organizations this would be a standard practice—two different vCenters need two different administration stacks to prevent the breach of one vCenter leading to a breach of all others.