Home > Articles

  • Print
  • + Share This
This chapter is from the book

Foundation Topics

VXLAN Introduction

Multitier applications have long been designed to use separate Ethernet broadcast domains or virtual local area networks (VLANs) to separate tiers within the application. In a vSphere environment, the number of multitier applications can be quite large, which eats up the number of available VLANs and makes it challenging to scale the virtual environment. For example, if a client has 100 four-tier applications, the client may need 400 separate Ethernet broadcast domains or VLANs to support these applications. Now multiply that by 10 clients. You are basically hitting the limit on how many Ethernet broadcast domains you can support using VLANs. As the virtual machines (VMs) for these applications are distributed among multiple vSphere clusters or even different data centers, the Ethernet broadcast domains must be spanned across the physical network, necessitating the configuration of Spanning Tree Protocol to prevent Ethernet loops.

Virtual Extensible LAN (VXLAN) addresses the Layer 2 scaling challenges in today’s data centers by natively allowing for the transparent spanning of millions of distinct Ethernet broadcast domains over any IP physical network or IP transport, reducing VLAN sprawl and thus eliminating the need to enable Ethernet loop-preventing solutions such as Spanning Tree.



VXLAN is an open standard supported by many of the key data center technology companies, such as VMware. VXLAN is a Layer 2 encapsulation technology that substitutes the usage of VLAN numbers to label Ethernet broadcast domains with VXLAN numbers. A traditional Ethernet switch can support up to 212 (4096) Ethernet broadcast domains or VLAN numbers. VXLAN supports 224 Ethernet broadcast domains or VXLAN numbers. That is 16,777,216 Ethernet broadcast domains. A VXLAN number ID is referred to as VNI. There is a one-to-one relationship between an Ethernet broadcast domain and a VNI. A single Ethernet broadcast domain can’t have more than one VNI. Two distinct Ethernet broadcast domains can’t have the same VNI.

Figure 4-1 shows a traditional design with two ESXi hosts in different racks, each one with a powered on VM. If both VMs need to be in the same Ethernet broadcast domain, the broadcast domain must be spanned, or extended, across all the Ethernet switches shown in the diagram. This makes it necessary for either the Spanning Tree Protocol to be configured in all the Ethernet switches or a more expensive loop-preventing solution such as Transparent Interconnection of Lots of Links (TRILL) to be deployed. With VXLAN deployed, the ESXi hosts can encapsulate the VM traffic in a VXLAN frame and send it over the physical network, which can be IP-based rather than Ethernet-based, thus removing the need to configure Spanning Tree or deploy solutions such as TRILL.

Figure 4-1

Figure 4-1 Spanning broadcast domain across multiple ESXi racks

Traditionally, any network technology that encapsulates traffic the way VXLAN does is called a tunnel. A tunnel hides the original frame’s network information from the IP physical network. A good example of a tunnel is Genetic Routing Encapsulation (GRE), which hides Layer 3 and Layer 4 information from IP network devices, although GRE could be set up to also hide Layer 2 information. VXLAN tunnels hide Layer 2, Layer 3, and Layer 4 information. It is possible to deploy a new IP network topology by just using tunnels, without having to do major reconfiguration of the IP physical network. Such a network topology is called an overlay, whereas the IP physical network that switches and routes the tunnels that make up the overlay is called the underlay.

Just as GRE requires two devices to create and terminate the tunnel, VXLAN requires two devices to create and terminate VXLAN tunnels. A device that can create or terminate the VXLAN tunnel is called the VXLAN Tunnel Endpoint (VTEP). NSX enables ESXi hosts to have VTEPs. A VTEP performs these two roles:

  • Receive Layer 2 traffic from a source, such as a VM, in an Ethernet broadcast domain, encapsulating it within a VXLAN frame and sending it to the destination VTEP.

  • Receive the VXLAN frame, stripping the encapsulation to reveal the encapsulated Ethernet frame, and forwarding the frame toward the destination included in the encapsulated Ethernet frame.

Figure 4-2 shows an Ethernet frame from a VM encapsulated in a VXLAN frame. The source VTEP of the VXLAN frame is a VMkernel port in the ESXi host. You can see the encapsulated Ethernet frame, or original frame, and the new header, thus creating the VXLAN overlay.

Figure 4-2

Figure 4-2 VXLAN encapsulation


The VXLAN frame contains the following components:

  • New Layer 2 header distinct from the encapsulated Layer 2 header. This header has new source and destination MAC addresses and a new 802.1Q field.

    • This header is 14 bytes long if not using 802.1Q.

    • If using 802.1Q, this header is 18 bytes long.

    • Class of Service (CoS) markings copied from the original frame’s 802.1Q field in the Layer 2 header, if any.

  • New Layer 3 header distinct from the encapsulated Layer 3 header. This header has new source and destination IP addresses, including

    • The source and destination IPs are VTEPs. In some cases the destination IP could be a multicast group (we expand further on this during Chapter 5, “NSX Switches”).

    • This header is 20 bytes long, with no extensions.

    • DSCP markings, if any, are copied from the encapsulated DSCP files in the Layer 3 header.

    • The do not fragment (DF) bit is set to 1.

  • New Layer 4 header distinct from the encapsulated Layer 4 header. This header is always UDP.

    • This header is be 8 bytes long.

    • NSX VTEPs use a destination port of 8472. As of April 2013, the standard VXLAN UDP port is 4789. NSX supports changing the UDP port number via the NSX APIs. We cover the NSX APIs in Chapter 18, “NSX Automation.”

    • The source port is derived from the encapsulated Layer 4 header.

  • New VXLAN header.

    • This header is 8 bytes long.

    • 3 bytes are dedicated for VNI labeling of the tunnel.

    • 4 bytes are reserved for future use.

    • 1 byte is dedicated for flags.

To aggregate a few things stated in the preceding content about VXLAN: Any QoS markings, such as DSCP and CoS from the VM Ethernet frame being encapsulated, are copied to the VXLAN frame, and the destination UDP port of the VXLAN frame is derived from the header information from the encapsulated frame. For this to work, VXLAN has to support virtual guest VLAN Tagging (VGT). Without VGT support, the VM’s guest OS couldn’t do QoS markings. If the encapsulated frame does not have any QoS markings, none would be copied to the VXLAN frame; however, there is nothing stopping you from adding QoS markings directly to the VXLAN frame.

Then there is the part where the VXLAN frame traverses the physical network, called the VXLAN underlay or simply underlay. The underlay uses VLANs. It is almost certain that the VXLAN underlay will place the VXLAN frames in their own Ethernet broadcast domain, thus requiring its own VLAN. The VLAN used by the underlay for VXLAN frames is referred to as the VXLAN VLAN. If the ESXi host with the source VTEP is connected to a physical switch via a trunk port, the ESXi host could be configured to add a VLAN tag, 802.1Q, to the VXLAN frame or send the VXLAN frame without a VLAN tab, in which case the physical switch’s trunk needs to be configured with a native VLAN.


All this means that VXLAN encapsulation adds 50+ bytes to the original frame from the VM. The 50+ byes come from the following addition:

  • Original Layer 2 (minus Frame Check Sum) + VXLAN Header + Outer Layer 4 Header + Outer Layer 3 Header

  • Without Original Frame 802.1Q field: 14 + 8 + 8 + 20 = 50

  • With Original Frame 802.1Q field: 18 + 8 + 8 + 20 = 54

VMware recommends that the underlay for VXLAN support jumbo frames with an MTU of at least 1600 bytes to support VMs sending frames with the standard 1500 bytes MTU. This includes any routers that are part of the underlay; otherwise, the routers will discard the VXLAN frames when they realize they can’t fragment the VXLAN frames with more than 1500 bytes payload. ESXi hosts with VTEPs also configure the VXLAN tunnel with the Do Not Fragment bit, DF, in the IP header of the VXLAN overlay to 1.

Figure 4-3 shows two VMs on the same Ethernet broadcast domain communicating with each other. The two VMs are connected to the same VNI, and the two ESXi hosts have the VTEPs. This diagram does not show the nuances of how the VTEPs know about each other’s existence or how they determine where to forward the VXLAN frame. Chapter 5 covers these details in more depth.

Figure 4-3

Figure 4-3 Virtual machine communication via VXLAN

NSX Controllers

The NSX Controllers are responsible for most of the control plane. The NSX Controllers handle the Layer 2 control plane for the logical switches, and together with the distributed logical router control virtual machine, the NSX Controllers handle the Layer 3 control plane. We review the role of the Layer 3 control plane and the distributed logical router control virtual machine in Chapter 7, “Logical Router.”


For Layer 2, the NSX Controllers have the principal copy of three tables per logical switch, which are used to facilitate control plane decisions by the ESXi host. The three tables are

  • VTEP table: Principal table that lists all VTEPs that have at least one VM connected to the logical switch. There is one VTEP table per logical switch.

  • MAC table: Principal table containing the MAC addresses for VMs connected to logical switches as well as any physical end system in the same broadcast domain as the logical switch.

  • ARP table: Principal table containing the ARP entries for VMs connected to logical switches as well as any physical end system in the same broadcast domain as the logical switch.

For Layer 3, the NSX Controllers have the routing table for each distributed logical router as well as the list of all hosts running a copy of each distributed logical router.


NSX Controllers do not play any role in security, such as the distributed firewall, nor do they provide control plane services to the NSX Edge Service Gateway.

Deploying NSX Controllers

The NSX Controllers are virtual appliances deployed by the NSX Manager. The NSX Controllers must be deployed in the same vCenter associated with NSX Manager. In our examples from the figures, that would be vCenter-A if the NSX Controller is from NSXMGR-A. At least one NSX Controller must be deployed before logical switches and distributed logical routers can be deployed in an NSX Manager with a Standalone role.

Deploying NSX Controllers might be the most infuriating thing about setting up an NSX environment. I restate some of this in context a little later, but in short if NSX Manager can’t establish communication with the NSX Controller after it is deployed, it has the NSX Controller appliance deleted. The process of deploying the NSX Controller can take a few minutes or more, depending on the available resources in the ESXi host where you deploy it and the datastore. If the NSX Controller deployment fails for whatever reason, NSX Manager doesn’t attempt to deploy a new one. You can view the NSX Manager’s log to find the reason to why the deployment failed and then try again. But you won’t be doing much networking with NSX until you get at least one NSX Controller deployed.

Let’s now cover the steps to deploying the NSX Controllers, but I wanted to point out this little annoyance first. A single NSX Controller is all that is needed to deploy logical switches and distributed logical routers; however for redundancy and failover capability, VMware supports only production environments with three NSX Controllers per standalone NSX Manager. The NSX Controllers can be deployed in separate ESXi clusters as long as

  • Each NSX Controller has IP connectivity with NSX Manager, over TCP port 443.

  • Each NSX Controller has IP connectivity with each other, over TCP port 443.

  • Each NSX Controller has IP connectivity with the management VMkernel port of ESXi hosts that will be part of the NSX domain over TCP port 1234.

The following steps guide you in how to deploy NSX Controllers via the vSphere Web Client. You can also deploy NSX Controllers using the NSX APIs.


You must be an NSX administrator or enterprise administrator to be allowed to deploy NSX Controllers. We cover Role Based Access Control (RBAC), in Chapter 17, “Additional NSX Features.”

  • Step 1. From the Networking and Security home page, select the Installation field.

  • Step 2. Select the Management tab.

  • Step 3. In the NSX Controller Nodes section click the green + icon, as shown in Figure 4-4.

    Figure 4-4

    Figure 4-4 Add NSX Controller

  • Step 4. In the NSX Controller Wizard, select the NSX Manager that would deploy the NSX Controller.

    The vSphere Web Client supports multiple vCenters, and thus multiple NSX Managers.

  • Step 5. Select the data center on which you are adding the NSX Controller.

  • Step 6. Select the datastore where the NSX Controller will be deployed.

  • Step 7. Select the ESXi cluster or resource pool where the NSX Controller will be deployed.

  • Step 8. Optionally, select the ESXi host and folder where the NSX Controller will be deployed. If the ESXi cluster selected in step 5 is configured with DRS with automatic virtual machine placement, you can skip the host selection.

  • Step 9. Select the standard portgroup or vDS portgroup where the NSX Controller’s management interface will be connected. All communication from the NSX Controller to NSX Manager, other NSX Controllers, and the ESXi hosts will take place over this connection.

  • Step 10. Select the pool of IPs from which the NSX Controller will be assigned an IP by the NSX Manager.

    If no IP pool exists, you have the option to create one now. We review the creation of an IP pool later in this chapter.

  • Step 11. If this is your first NSX Controller, you need to provide a CLI password, as shown in Figure 4-5. You do not need to provide a password for subsequent NSX Controllers as the NSX Manager automatically assigns them all the same password from the first deployed NSX Controller. The default username of the CLI prompt is admin.

    Figure 4-5

    Figure 4-5 Adding first NSX Controller

When NSX Controllers get deployed, they automatically form a cluster among themselves. The first NSX Controller needs to be deployed and have joined the NSX Controller cluster by itself before the other NSX Controllers can be deployed. If you try to deploy a second NSX Controller before the first one is deployed, you get an error message.

When NSX Manager receives the request to deploy an NSX Controller from vCenter, who got it from the vSphere Web Client, or when NSX Manager receives the request via the NSX APIs, the following workflow takes place:

  • Step 1. NSX Manager gives the NSX Controller off to vCenter to deploy, per your configurations during the Add NSX Controller Wizard. This includes

    • The data center, datastore, and cluster/resource pool to place the NSX Controller

    • The ovf import specifications, which includes the IP from the IP pool, the private and public certificates for communication back to NSX Manager, and the cluster IP, which is the IP of the first NSX Controller

    • A request to place the NSX Controller in the Automatic Startup of the ESXi host

  • Step 2. vCenter deploys the NSX Controller, powers it on, and then tells NSX Manager the Controllers are powered on.

  • Step 3. NSX Manager makes contact with the NSX Controller.

If NSX Manager cannot establish an IP connection to the NSX Controller to complete its configuration, the NSX Manager has vCenter power off the NSX Controller and delete it.

Verifying NSX Controllers

You can verify the status of the NSX Controller installation by selecting the Installation view from the Networking and Security page, as shown in Figure 4-6.

Figure 4-6

Figure 4-6 An NSX Controller successfully deployed

In this view you can verify the following:

  • Controller IP Address: The IP address of the NSX Controller. This is one of the IP addresses from the IP pool. Clicking on the controller IP address brings up information about the ESXi host and datastore the NSX Controller is in, as shown in Figure 4-7.

    Figure 4-7

    Figure 4-7 NSX Controller details

  • ID: The ID of the NSX Controller. This ID is assigned by the NSX Manager that is communicating with the NSX Controller and has no impact on the role or function of the NSX Controller.

  • Status: This is the status of the NSX Controller. The statuses we care about are Deploying and Normal.

    • Deploying is self-explanatory.

    • Disconnected means the NSX Manager lost connectivity to the NSX Controller.

    • Normal means the NSX Controller is powered up and NSX Manager has normal operation communication with it.

  • Software Version: The version of NSX software running in the NSX Controller. The version number is independent of the NSX Manager’s version.

  • NSX Manager: The NSX Manager that is communicating with this NSX Controller. Yes, this is here because a single vSphere Web Client supports multiple vCenters and thus Multiple NSX Managers. If one of the NSX Managers is participating in cross vCenter NSX, a sixth column becomes visible:

  • Managed By: The IP of the Primary NSX Manager that deployed the NSX Controller.

If you assign a role of Primary to an NSX Manager, the NSX Manager’s three NSX Controllers become NSX universal controllers. NSX universal controllers can communicate with Secondary NSX Managers in the same cross vCenter NSX domain as well as Secondary NSX Manager’s participating entities such as ESXi hosts. Before you add Secondary NSX Managers, their existing NSX Controllers, if any, must be deleted.

You can also verify the deployment of the NSX Controllers by viewing the NSX Controller virtual machine in the Host and Clusters or VM and Templates view. The NSX Controller is deployed using the name NSX_Controller_ followed by the NSX Controller’s UUID. Figure 4-8 shows the first NSX Controller in the Host and Clusters view. Notice in Figure 4-8 the number of vCPUs, memory, memory reservation, and HDD configured in the NSX Controller.

Figure 4-8

Figure 4-8 NSX Controller’s virtual machine Summary view

Each NSX Controller gets deployed with these settings:

  • 4 vCPUs

  • 4 GB vRAM, with 2 GB reservation

  • 20 GB HDD

  • 1 vNIC

  • VM hardware version 10


VMware does not support changing the hardware settings of the NSX Controllers.


If the NSX Manager is participating in a Secondary role in cross vCenter NSX, the NSX Manager will not have any NSX Controllers of its own. Instead the Secondary NSX Managers create a logical connection to the existing NSX universal controllers from the Primary NSX Manager in the same cross vCenter NSX domain.

Creating an NSX Controller Cluster

When more than one NSX Controller is deployed, the NSX Controllers automatically form a cluster. They know how to find each other because NSX Manager makes them aware of each other’s presence. To verify that the NSX Controller has joined the cluster successfully, connect to the NSX Controllers via SSH or console using the username of admin and the password you configured during the first NSX Controller deployment. Once logged in the NSX Controller, issue the CLI command show control-cluster status to view the NSX Controller’s cluster status. You need to do this for each NSX Controller to verify its cluster status. Figure 4-9 shows the output of the command for an NSX Controller that has joined the cluster successfully.

Figure 4-9

Figure 4-9 Output of show control-cluster status

Figure 4-9 depicts the following cluster messages:

  • Join status: Join complete. This message indicates this NSX Controller has joined the cluster.

  • Majority status: Connected to cluster majority. This message indicates that this NSX Controller can see the majority of NSX Controllers (counting itself). If this NSX Controller were not connected to the cluster majority, it would remove itself from participation in the control plane until it can see the majority of NSX Controllers again.

  • Restart status: This controller can be safely restarted.

  • Cluster ID: {UUID}. This is the Universal Unique ID of the cluster.

  • Node UUID: {UUID}. This is the Universal Unique ID of this NSX Controller.


The clustering algorithm used by the NSX Controllers depends on each NSX Controller having IP communication with a majority of the NSX Controllers, counting itself. If the NSX Controller does not belong to the majority, or quorum, it removes itself from control plane participation. To avoid a split-brain situation where no NSX Controller is connected to the cluster majority and potentially each one removing itself from control plane participation, VMware requires that three of the NSX Controllers be deployed in production environments.

Figure 4-10 shows the output of the command show control-cluster startup-nodes, which shows the NSX Controllers that are known to be cluster members. All NSX Controllers should provide the same output. You can also issue the NSX Manager basic mode command show controller list all to list all the NSX Controllers the NSX Manager is communicating with plus their running status.

Figure 4-10

Figure 4-10 Output of show control-cluster startup-nodes

Additional CLI commands that could be used in the NSX Controllers to verify cluster functionally and availability are as follows:

  • show control-cluster roles: Displays which NSX Controller is the master for different roles. We cover roles in the next section.

  • show control-cluster connections: Displays the port number for the different roles and the number of established connections.

  • show control-cluster management-address: Displays the IP used by the NSX Controller for management.

We review additional CLI commands in NSX Manager and NSX Controllers related to logical switches and distributed logical routers in Chapter 5 and Chapter 7.

NSX Controller Master and Recovery

When deploying multiple NSX Controllers, the control plane responsibilities for Layer 2 and Layer 3 are shared among all controllers. To determine which portions each NSX Controller handles, the NSX Controllers cluster elects an API provider, Layer 2 and Layer 3 NSX Controller Master. The masters are selected after the cluster is formed. The API provider master receives internal NSX API calls from NSX Manager. The Layer 2 NSX Controller Master assigns Layer 2 control plane responsibility on a per logical switch basis to each NSX Controller in the cluster, including the master. The Layer 3 NSX Controller Master assigns the Layer 3 forwarding table, on a per distributed logical router basis, to each NSX Controller in the cluster, including the master.

The process of assigning logical switches to different NSX Controllers and distributed logical routers to different NSX Controllers is called slicing. By doing slicing, the NSX Controller Master for Layer 2 and Layer 3 distributes the load of managing the control plane for logical switches and distributed routers among all the NSX Controllers. No two NSX Controllers share the Layer 2 control plane for a logical switch nor share the Layer 3 control plane for a distributed logical router. Slicing also makes the NSX Layer 2 and Layer 3 control planes more robust and tolerant of NSX Controller failures.

Once the master has assigned Layer 2 and Layer 3 control plane responsibilities, it tells all NSX Controllers about it so all NSX Controllers know what each NSX Controller is responsible for. This information is also used by the NSX Controllers in case the NSX Controller Master becomes unresponsive or fails.

If your NSX environment has only a single distributed logical router and three NSX Controllers, only one of the NSX Controllers would be responsible for the distributed logical router while the other two would serve as backups. No two NSX Controllers are responsible for the Layer 2 control plane of the same logical switch. No two NSX Controllers are responsible for the Layer 3 forwarding table of the same logical router.

When an NSX Controller goes down or becomes unresponsive, the data plane continues to operate; however, the Layer 2 NSX Controller Master splits among the surviving NSX Controllers Layer 2 control plane responsibilities for all the impacted logical switches. The Layer 3 NSX Controller Master splits among all the surviving NSX Controllers Layer 3 control plane responsibilities for all the affected distributed logical routers.

What if the NSX Controller that fails was the master? In this case, the surviving NSX Controllers elect a new master, and the new master then proceeds to recover the control plane of the affected logical switches and/or distributed logical routers. How does the new master determine which logical switches and/or distributed logical routers were affected and need to have their control plane responsibilities re-assigned? The new master uses the assignment information distributed to the cluster by the old master.

For Layer 2 control plane, the newly responsible NSX Controller queries the hosts in the transport zone so it can rep0opulate the logical switch’s control plane information. We learn about transport zones later in this chapter. For Layer 3, the newly responsible NSX Controller queries the logical router control virtual machine. We learn about the logical router control virtual machine in Chapter 7.

IP Pools

IP pools are the only means to provide an IP address to the NSX Controllers. IP pools may also be used to provide an IP address to the ESXi hosts during NSX host preparations. We review NSX host preparation later in this chapter in the section “Host Preparation.” IP pools are created by an NSX administrator and are managed by NSX Manager. Each NSX Manager manages its own set of IP pools. NSX Manager selects an IP from the IP pool whenever it needs one, such as when deploying an NSX Controller. If the entity using the IP from the IP pool is removed or deleted, NSX Manager places the IP back into the pool. The IPs in the IP pool should be unique in the entire IP network (both physical and virtual).

There are two ways to start the creation of an IP pool. The first method we mentioned during the deployment of the NSX Controllers. This option to create an IP pool is also available during NSX host preparation, which we discuss later in this chapter.

The second method involves the following steps:

  • Step 1. Select the NSX Managers field in the Networking and Security page.

  • Step 2. Select the NSX Manager you want to create an IP pool in.

  • Step 3. Select the Manage tab.

  • Step 4. Select the Grouping Objects button.

  • Step 5. Select IP Pools.

  • Step 6. Click the green + icon, as shown in Figure 4-11.

Figure 4-11

Figure 4-11 Create an IP pool

Regardless of how you choose to create an IP pool, the same IP Pool Wizard comes up, as shown in Figure 4-12.

Figure 4-12

Figure 4-12 IP Pool Wizard

In the IP Pool Wizard, populate the following information:

  • Step 1. Give the IP pool a unique name.

  • Step 2. Enter the default gateway for this IP pool. This entry cannot be changed once the IP pool is created.

  • Step 3. Enter the subnet prefix for the IP pool. For example, enter 24 for a mask for

  • Step 4. Optionally, enter the IP of the primary and secondary DNS servers.

  • Step 5. Optionally, enter a DNS suffix.

  • Step 6. Enter the range of IPs that will be part of this IP pool.

Once an IP pool is created, you can modify or delete it. To make changes to an IP pool, follow these steps:

  • Step 1. Return to Object Groupings for the NSX Manager that owns the IP pool.

  • Step 2. Select IP Pools.

  • Step 3. Select the IP pool you want to modify.

  • Step 4. Click the Edit IP Pools icon.

  • Step 5. You can change almost all fields desired, including adding IPs to the pool, except the name and the default gateway fields.

The IP pool’s IP range can’t be shrunk if at least one IP has already been assigned. An IP pool can’t be deleted if at least one IP has been already assigned.

Host Preparation

Now that you deployed your NSX Controllers, it’s time to focus on the next steps that must take place before you can start deploying your virtual network and deploying security services. The NSX Controllers can also be deployed after host preparation.

The next step is to install the NSX vSphere Infrastructure Bundles (VIBs) in the ESXi hosts that will be in the NSX domain. The VIBs give the ESXi hosts the capability to participate in NSX’s data plane and in kernel security. We do this by selecting the Host Preparation tab from the Installation view in the Networking and Security page, as shown in Figure 4-13. An alternative would be to use vSphere ESXi Image Builder to create an image with the NSX VIBs installed.

Figure 4-13

Figure 4-13 Host Preparation tab

In the Host Preparation tab you see a list of all the ESXi host clusters configured in vCenter. Under the Installation Status column, hover toward the right until the mouse is over the cog, click it and select Install. That’s it. NSX Manager pushes the VIBs to each ESXi host that is in the cluster. Successfully adding the VIBs is nondisruptive, and there is no need to place the ESXi host in maintenance. Yes, I wrote “successfully” because if the VIB installation fails you might need to reboot the ESXi host(s) to complete it, as shown in Figure 4-14. The good thing is that NSX Manager tries to reboot the ESXi host for you, first putting in Maintenance mode. The moral of this: Don’t execute any type of infrastructure changes or upgrades outside of a maintenance window. You would also need to reboot the ESXi host if you wanted to remove the NSX VIBs.

Figure 4-14

Figure 4-14 Incomplete NSX VIB installation

So what superpowers exactly are these VIBs giving the ESXi hosts? The modules and the over-and-above human capabilities they give the ESXi hosts are as follows:

  • The VXLAN module: Enables the ESXi host to have logical switches. We discuss logical switching in Chapter 5.

  • The Switch Security (SwSec) module: It is the logical switch’s assistant. It is a dvFilter that sits in Slot 1 of the IOChain and helps with Layer 2 broadcast suppression.

  • The Routing module: Enables the ESXi host to run distributed logical routers. We review distributed logical routers in Chapter 7.

  • The distributed firewall: Enables the ESXi host to do Layer 2, Layer 3, and Layer 4 security in kernel. It also allows the ESXi host to leverage, out of network, additional security services. We start the conversation about the distributed firewall and security in Chapter 15, “Distributed Logical Firewall.”

Any other superpowers? Well, maybe this can be considered as a superpower: If you add an ESXi host to a cluster that has already been prepared, the ESXi host gets the NSX VIBs automatically. How about that for cool?! And before I forget, installing the VIBs takes minimal time. Even in my nested-ESXi-hosts running lab with scant available CPU, memory, and an NFS share that is slower at delivering I/O than a delivery pigeon, the VIBs install quickly.

Figure 4-15 shows the ESXi host clusters that have been prepared with version 6.2.0 of the NSX VIBs by NSMGR-A, Have a look at the two columns to the right, the Firewall and VXLAN columns. The Firewall module has its own column because it can be installed independently from the other modules. The VIB that has the Firewall module is called VSFWD. If the Firewall status reads Enabled, with the green check mark, you could go over to the Firewall view of Networking and Security, where the distributed firewall policies get created and applied, or the Service Composer view of Networking and Security, where service chaining is configured, to start creating and applying security rules for VMs. The distributed firewall VIB for NSX 6.0 can be installed with ESXi hosts running version 5.1 or higher. For NSX 6.1 and higher, the ESXi hosts must run 5.5 or higher.

Figure 4-15

Figure 4-15 Host Preparation tab after NSX modules have been installed

The VXLAN column confirms the installation of the VXLAN VIB. The VXLAN VIB has the VXLAN module, the Security module, and the Routing module. If the column reads Not Configured with a hyperlink, the VXLAN VIB is installed. The VXLAN VIB can be installed with ESXi hosts running version 5.1 or higher; however, with version 5.1 ESXi hosts logical switches can only be deployed in Multicast Replication Mode. We cover Replication Mode in Chapter 5. For NSX 6.1 and higher, the ESXi hosts must run 5.5 or higher. The Routing module only works in ESXi hosts running vSphere 5.5 or higher. Table 4-2 shows the vSphere and vCenter version supported by each module.

Table 4-2 vSphere Versions Supported by the NSX Modules

NSX Modules

vSphere Version


5.1 or later


5.1 (only for Multicast Replication Mode) and later


5.5 or later

Host Configuration

If you want to deploy logical switches, you must complete the Logical Network Preparation tab in the Installation view. In this section you set up an NSX domain with the variables needed to create VXLAN overlays. Three sections need to be configured. If you skip any of them, you are not going to be deploying logical switches.

First, you need to tell NSX Manager how to configure the ESXi hosts. Oddly enough, you don’t start the logical network configuration from the Logical Network Preparation tab. Rather, click the Configure hyperlink in the VXLAN column in the Host Preparation tab to open the Configure VXLAN Networking Wizard. Optionally, hover toward the right and click on the cog to see a menu list and choose Configure VXLAN, as shown in Figure 4-16.

Figure 4-16

Figure 4-16 VXLAN host configuration

Figure 4-17 shows the Configure VXLAN Networking window. Here we can configure the following:

  • The vDS where the new VXLAN VMkernel portgroup will be created.

  • The IP ESXi hosts will use as their VTEP. A new VMkernel port gets created for this, typically referred to as the VXLAN VMkernel port, and it is this VMkernel port that is the VTEP. Since the ESXi host owns the VXLAN VMkernel port, it is common practice to refer to the ESXi host as the VTEP itself. Moving forward, from time to time I refer to both the ESXi hosts and the VXLAN VMkernel ports as VTEPs.

  • The number of VXLAN VMkernel ports, per ESXi hosts, that will be configured. Each VXLAN VMkernel port will have a different IP.

Figure 4-17

Figure 4-17 Configure VXLAN Networking Wizard


All ESXi hosts, per host cluster, must be in the same vDS that will be used by NSX for host configuration. NSX can work with different clusters having different vDSes. This has zero impact on the performance of VMs in the NSX domain. If running a vSphere version before 6.0, not using the same vDS across multiple clusters may impact the capability of vMotion virtual machines connecting to logical switches. We touch on this topic in Chapter 5.

The VLAN in Figure 4-17 is the VXLAN VLAN. The vDS switch selected in Figure 4-17 will be used by NSX Manager to create a portgroup for the VXLAN VMkernel port and portgroups to back the logical switches, which we cover in Chapter 5. All these portgroups will be configured by NSX Manager with the VXLAN VLAN. If the MTU configured is larger than the MTU already configured in the vDS, the vDS’s MTU will be updated. The vDS that gets assigned to the cluster for VXLAN may also continue to be used for other non-NSX connectivity, such as a portgroup for vMotion.

You can assign an IP address to the VXLAN VMkernel port by using DHCP or an IP pool. In both cases, the VXLAN VMkernel port would be getting a default gateway. This would typically present a problem for the ESXi host since it already has a default gateway, most likely pointing out of the management VMkernel port. Luckily for NSX, vSphere has supported multiple TCP/IP stacks since version 5.1. In other words, the ESXi host can now have multiple default gateways. The original default gateway, oddly enough referred to as default, would still point out of the management VMkernel port, or wherever you originally had it configured for. The new default gateway, which you probably correctly guessed is referred to as VXLAN, would point out of the VXLAN VMkernel port. The VXLAN TCP/IP stack default gateway and the VXLAN VMkernel port will only be used for the creation and termination of VXLAN overlays. Figure 4-18 shows the VMkernel ports of an ESXi host, with only the VXLAN VMkernel port using the VXLAN TCP/IP stack.

Figure 4-18

Figure 4-18 VXLAN VMkernel port with VXLAN TCP/IP stack

One final thing you can configure here is the VMKNic Teaming Policy, a name I’m not too fond of. Why couldn’t they name it VXLAN Load Share Policy? After all, this is how the vDS load shares egress traffic from the VXLAN VMkernel port. Anyhow, the selection you make here has great implications for the behavior of your VXLAN overlays. For one, the policy must match the configuration of the physical switches to which the vDS uplinks connect, which means the vDS must also be configured to match the selected policy, such as enhanced LACP.

These are the VMKNic Teaming Policy options available:

  • Fail over

  • Static EtherChannel

  • Enhanced LACP

  • Load Balance – SRCID

  • Load Balance – SRCMAC

Go back and have a look at Figure 4-17. Do you see the VTEP field at the bottom? It says 1, meaning 1 VXLAN VMkernel port is created for each ESXi host in the cluster being configured. Where did the 1 come from? NSX Manager put it there. Notice the text box for the 1 is grayed out, which means you can’t edit it. And how did NSX Manager know to put a 1 in there? Go back to the VMKNic Teaming Policy selection. If you choose anything other than Load Balance – SRCID or Load Balance – SRCMAC, NSX Manager puts a 1 in the VTEP text box.


If, on the other hand, you choose VMKNic Teaming Policy of Load Balance – SRCID or Load Balance – SRCMAC, NSX Manager creates multiple VXLAN VMkernel ports, one per dvUplink in the vDS. Now that the ESXi hosts have multiple VXLAN VMkernel ports, load sharing can be achieved on a per VM basis by pinning each VM to a different VXLAN VMkernel port and mapping each VXLAN VMkernel port to a single dvUplink in the vDS. Figure 4-19 shows the configured ESXi hosts with multiple VXLAN VMkernel ports.

Figure 4-19

Figure 4-19 ESXi hosts with multiple VTEPs

Figure 4-20 shows the logical/physical view of two ESXi hosts, each with two dvUplinks, two VMs, and two VTEPs. The VMs are connected to logical switches.

Figure 4-20

Figure 4-20 Logical/physical view of ESXi hosts with two VTEPs

Table 4-3 shows the VMKNic Teaming Policy options, the multi-VTEP support, how they match to the vDS Teaming modes, and the minimum vDS version number that supports the teaming policy.

Table 4-3 VMKNic Teaming Policies

Key Topic Element

Multi-VTEP Support

vDS Teaming Mode

vDS Version

Fail Over



5.1 or later

Static EtherChannel


Ether Channel

5.1 or later

Enhanced LACP



5.5 and later

Load Balance - SRCID


Source Port

5.5 and later

Load Balance - SRCMAC


Source MAC (MAC Hash)

5.5 and later

Now why would NSX Manager allow the option of multiple VTEPs in the same ESXi host? It allows the option because there is no other good way to load share, yes load share, egress traffic sourced from an ESXi host if the load sharing hash is using the source interface (SRCID) or the source MAC (SRCMAC). I won’t spend too long explaining why NSX Manager achieves the load sharing the way it does. I would just say, think of how the physical network would react if the source MAC in egress frames from the ESXi host were seen in more than one discrete dvUplink from the same ESXi host.

After you finish the Configure VXLAN Networking Wizard, you can go over to the Logical Network Preparation tab to verify the configuration. Figure 4-21 shows the VXLAN Transport section listing the ESXi hosts that have been configured and the details of their configuration.

Figure 4-21

Figure 4-21 ESXi host clusters that have been configured for VXLAN

In the Network view of vCenter, you can verify that the portgroup was created for connecting the VXLAN VMkernel port. Figure 4-22 shows the VXLAN VLAN for the EDG-A1 host cluster, 13, is configured in the portgroup. Notice that there are other portgroups in the same vDS. If you were to look at the vDS configuration, you would see the MTU is set to at least the size you configured in Configure VXLAN networking.

Figure 4-22

Figure 4-22 VXLAN vDS

VNI Pools, Multicast Pools, and Transport Zones

You need to undertake two more preparations for the NSX networks.

The first thing you should do is provide the range or pool of VNIs and multicast groups that NSX Manager would be using for its local use as well as do the same for cross vCenter NSX use. Local VNI pools and universal VNI pools shouldn’t overlap. Local multicast groups and universal multicast groups shouldn’t overlap either. The VNI pool can start at 5000. To create the VNI pools, go to the Segment ID section of the Logical Network Preparation tab and select the Primary NSX Manager. If you require multicast support, you can enter the multicast group pools for NSX Manager to use in the same place. We discuss multicast in the “Replication Mode” section of Chapter 5. Secondary NSX Managers can only configure local VNI and multicast group pools.

The second thing you should do is create global transport zones, at least one per NSX Manager, and a universal transport zone. When a logical switch is created, NSX Manager needs to know which ESXi hosts in the NSX domain have to be informed about the logical switch. The global transport zone is a group of ESXi host clusters under the same NSX domain that would be told about the creation of logical switches. Global transport zone only includes ESXi host clusters local to a vCenter. The universal transport zone is a group of ESXi host clusters under the same cross vCenter NSX domain that would be told about the creation of universal logical switches. Universal transport zones may include ESXi host clusters in all vCenters in the same cross vCenter NSX domain. The logical switch’s global transport zone assignment and a universal logical switch’s universal transport zone assignment are done during the creation of the switches.

A transport zone can contain as many clusters as you want. An ESXi host cluster can be in as many transport zones as you want, and it can belong to both types of transport zones at the same time. And yes, you can have as many global transport zones as your heart desires, although you typically don’t deploy more than one or two per NSX Manager. However, you can only have a single universal transport zone. More importantly, both types of transport zones can have ESXi host clusters each with a different vDS selected during Configure VXLAN networking. Again, transport zones matter only for the purpose of letting the NSX Manager know which ESXi hosts should be told about a particular logical switch or universal logical switch.

To create a transport zone, head over to the Logical Network Preparation tab, select the NSX Manager that will own the transport zone, and go to the Transport Zones section. There, click the green + sign. There you can assign the transport zone a name, select its Replication Mode, and choose the ESXi host clusters that will be part of the transport zone. If the NSX Manager is the Primary NSX Manager, you have a check box to turn this transport zone into a universal transport zone, as shown in Figure 4-23.

Figure 4-23

Figure 4-23 Creating a transport zone

As mentioned, Chapter 5 discusses what Replication Mode is. For now, you should know that if you select Multicast or Hybrid you need to create a multicast group pool in the Segment ID section mentioned previously. Finally, after a transport zone is created, you can’t change the transport zone type. However, you can modify it by adding or removing ESXi host clusters from the NSX Manager that owns the association to the vCenter that owns those clusters. If an NSX switch (a logical switch or a universal logical switch) has already been created before the ESXi host cluster is added to the transport zone, NSX Manager automatically updates the newly added ESXi hosts in the ESXi host cluster with the NSX switch information.


To add an ESXi host cluster in a transport zone, return to the Transport Zone section of the Logical Network Preparation tab and select the NSX Manager that prepared the ESXi host cluster that will be added. Select the Transport Zone and click the Connect Clusters icon. Select the ESXi host clusters you want to add and click OK.

To remove an ESXi host cluster from a transport zone, select the transport zone in the Transport Zones section and select the Disconnect Clusters icon. Select the ESXi host clusters you want to remove and click OK. For the operation to succeed, all VMs (powered on or not) in the ESXi host you want to remove must be disconnected from all logical switches that belong to the transport zone. We cover how to disconnect a VM from a logical switch in Chapter 5.

A transport zone that has any logical switches can’t be deleted. The logical switches must be deleted first. We cover how to delete logical switches in Chapter 5. To delete a transport zone, select the transport zone, then select Actions, All NSX User Interface Plugin Actions, and then select Remove.


One more note on this section. It should be clear by now that NSX Manager loves ESXi host clusters. If you add an ESXi host to an already prepared and configured ESXi host cluster, NSX Manager would make sure that the ESXi host gets the NSX VIBs, the VXLAN VMkernel ports get created with the right IP and subnets, and make the new ESXi host aware of any logical switches, and so forth. On the reverse, if you remove an ESXi host from an already prepared and configured ESXi host cluster, the ESXi host would lose its VXLAN VMkernel ports and IPs, and lose knowledge of any logical switches.

That wraps up all the prep work that needs to be done to get your NSX network and security going. The next chapter begins the coverage of the process of actually building stuff that you can put virtual machines on.

  • + Share This
  • 🔖 Save To Your Account