VMware NSX Active-Active Data Center Design with Global Load Balancer
Data centers have traditionally been built as active-passive, i.e., production and DR, for years. This has been the preferred choice for many organizations for the reasons listed below. With the growing availability of technologies like VMware NSX and global load balancers, an organization can overcome these shortcomings and recover the investment made in the secondary data center: instead of sitting idle as a DR data center, it can run as a second active site hosting some of the workloads.
Shortcomings of Traditional Data Centers (Prod/DR)
- Organizations invest heavily in DR infrastructure that is used for barely 1-2 weeks a year.
- Wasted investment: human resources to operate and maintain it, plus power and cooling.
- Complex failover scenarios and complex infrastructure designs.
- IPs have to change when workloads move to the other site, and testing takes a minimum of 2-3 hours per application.
- Operations teams forget to update the firewall policies, antivirus exclusions, and so on in DR.
- Application teams hard-code IPs everywhere.
VMware NSX, along with other VMware products like SRM and vSphere, has addressed these issues that keep organizations hooked to the traditional way of doing IT. But some questions still linger. Working on VMware products and global load balancers made me see the need to write this article.
We will cover only the important pieces that always cause confusion when designing an active-active data center:
- Why do we need 1600 MTU across the sites for Cross-VC NSX to work?
- Can local egress alone solve the problem and do the magic of an active-active data center?
- How do we set up and configure the global load balancer in an active-active data center?
- What are egress and ingress?
We don't want to reinvent the whole wheel in this post by repeating the designs detailed in the VMware and partner design guides listed below. We will cover only the critical aspects that are not explained in detail in those guides and that concern most people.
Achieving Active-Active DC with NSX
As said earlier, I am not trying to reinvent the wheel. The basic design for an active-active DC from VMware is covered in the Cross-VC NSX design guides listed below, along with the F5 design guide for GTM.
Multi-Site Cross-VC NSX Design Guides:
- NSX-V: Multi-site Options and Cross-VC NSX Design Guide
- Enhanced Disaster Recovery with Cross-VC NSX and SRM
- Cross-VC NSX for Multi-site Solutions
VMware NSX and F5 Design Guide
Cross-VC (active-active) basic requirements:
- MTU 1600 or above configured on all L2 and L3 devices across the sites where the VXLAN packets will traverse.
- Latency across sites of 150 ms or less (100 ms recommended).
- A global load balancer to solve the ingress issue.
- NSX components licensed on both sites.
- Separate segment ID pools for universal objects and local objects on all sites.
Cross-VC MTU Requirement
NSX-V adds 54 bytes of encapsulation overhead for VXLAN to work, which turns a default 1500-byte packet leaving a VM into a 1554-byte VXLAN packet. To make everyone's life easy, the stated requirement is 1600 MTU: anything above the required 1554, such as 1600, will do the job.
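As a quick sanity check, here is the arithmetic behind the 1554-byte figure as a minimal sketch; the per-header byte counts are the standard VXLAN-over-IPv4 values, assuming a tagged inner frame:

```python
# Arithmetic behind the 1600 MTU rule: standard VXLAN encapsulation overhead.
VM_MTU = 1500  # default guest IP MTU

OVERHEAD = {
    "outer IPv4 header": 20,
    "outer UDP header": 8,
    "VXLAN header": 8,
    "inner Ethernet header": 14,  # the original L2 header travels inside
    "inner 802.1Q tag": 4,        # when the inner frame is VLAN-tagged
}

total_overhead = sum(OVERHEAD.values())   # 54 bytes
required_mtu = VM_MTU + total_overhead    # 1554 bytes

print(f"Encapsulation overhead: {total_overhead} bytes")
print(f"Minimum transport MTU:  {required_mtu} bytes")
print(f"1600 MTU leaves {1600 - required_mtu} bytes of headroom")
```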
It is clear that within the data center we need 1600 MTU configured on all networking equipment, which makes sense. But why 1600 MTU on the perimeter routers and the ISP switches and routers? We will discuss this in detail in this section.
Within the Data Center:
We have established that we need 1600 MTU for VXLAN to work. If a VM on an NSX-prepared host wants to communicate with a VM on another host, the VTEP encapsulates the packet into a VXLAN packet and sends it to the destination host. For this reason we need 1600 or higher MTU on the TOR/leaf/spine switches, or in some cases on L3 devices.
Across the Data Centers:
NSX's basic functionality is extending L2, i.e., layer 2 subnets, across sites. That means we can have VM1 in site 1 and VM2 in site 2, as shown above, residing in the same subnet, while site 1 and site 2 are completely separate physical data centers connected over MPLS or a leased line.
Imagine the VTEP VLAN in site 1 is 10.20.40.0/24 and in site 2 it is 10.30.40.0/24; these are separate VLANs and must be routable across the sites. The VTEP IPs on the ESXi hosts are 10.20.40.10-11 in site 1 and 10.30.40.10-11 in site 2.
How are L2 / logical switches extended?
The VTEPs carry the traffic across sites, whether within the same logical switch or between logical switches, so the dependency on the physical network is isolated here.
VM1 needs to communicate with VM2 in the other site (the same approach applies within a logical switch or between different logical switches):
Step 1: VM1 sends a standard 1500 MTU packet destined for VM2 in site 2.
Step 2: The VTEP on the ESXi host responsible for VM1 encapsulates the packet and sends out a 1554-byte VXLAN packet (hence the 1600 MTU requirement). VTEP 10.20.40.10 knows that VM2 resides in the other site on the host with VTEP 10.30.40.10, which it learns from the universal controller cluster. VTEP 10.20.40.10 sends the packet to 10.30.40.10.
Step 3: The TOR or leaf switch just passes the traffic along, as it is L2.
Step 4: The core router knows that the destination VTEP VLAN (10.30.40.0/24) is reachable via the perimeter router and sends the packet to it.
Step 5: The perimeter router sends the packet to the ISP routers (MPLS or leased line).
Step 6: The ISP switches and routers route the oversized packet to the site 2 perimeter router.
Step 7: The site 2 perimeter router sends the packet on to the core router.
Step 8: The core router sends the packet to the TOR or leaf switch.
Step 9: The TOR in site 2 forwards the packet to the VTEP IP (10.30.40.10).
Step 10: VTEP 10.30.40.10 decapsulates the VXLAN traffic and delivers the original 1500 MTU packet from VM1 in site 1 to VM2 in site 2.
Step 11: VM2 in site 2 receives the packet as if VM1 were in the same subnet, and the conversation goes on.
The process is the same if the VMs want to talk to a VM on another logical switch (another VXLAN LS).
For this reason we need 1600 MTU on all switches and routers that the VXLAN packets traverse on the way from the site 1 VTEP VLAN to the site 2 VTEP VLAN.
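To verify that the end-to-end path actually carries the larger frames, the usual ESXi-native check is vmkping with the don't-fragment flag on the VXLAN netstack (e.g. vmkping ++netstack=vxlan -d -s 1572 <remote VTEP>). Below is a rough Python equivalent as a hedged sketch, assuming a Linux host sitting on the site 1 VTEP VLAN; the remote VTEP IP comes from the example above:

```python
# Verify the inter-site path carries 1554-byte packets without fragmentation,
# using the OS ping utility with the don't-fragment bit set (Linux syntax).
import subprocess

REMOTE_VTEP = "10.30.40.10"       # site 2 VTEP from the example above
TARGET_MTU = 1554
ICMP_PAYLOAD = TARGET_MTU - 28    # minus 20-byte IP header + 8-byte ICMP header

result = subprocess.run(
    ["ping", "-M", "do",          # set DF: fail instead of fragmenting
     "-s", str(ICMP_PAYLOAD),
     "-c", "3", REMOTE_VTEP],
    capture_output=True, text=True,
)
if result.returncode == 0:
    print(f"Path to {REMOTE_VTEP} carries {TARGET_MTU}-byte packets: OK")
else:
    print(f"MTU problem toward {REMOTE_VTEP}:\n{result.stdout}{result.stderr}")
```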
Cross-VC Routing Design
There are two major traffic issues to address when we design an active-active data center:
- Egress traffic (traffic leaving the VMs and going out): addressed by local egress on the universal DLR.
- Ingress traffic (traffic coming from outside into the data center): we cannot advertise the same routes from two sites, so this is addressed by the global load balancer.
What is Local Egress? How is it solved in NSX?
Local egress means controlling VM traffic so that it leaves through its own site's edges, not the other site's. When a VM on the web VXLAN wants to send traffic outside the NSX network, the logical router in the kernel needs to know the routes to the external VLANs, and traffic originating in site 1 should leave only from site 1. If the logical router knew about both the site 1 edges and the site 2 edges, it could send traffic to the other site, and if allowed, the traffic would leave from the other site instead of its own.
When local egress is enabled while deploying the universal DLR, the universal controller cluster updates each host's logical routing instance only with the routes for that site, based on the locale ID of its NSX Manager. So the logical router kernel modules on the primary site's hosts are aware only of site 1 routes, and the site 2 ESXi hosts know only site 2 routes. Problem solved: traffic leaves only from its own site.
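A toy model of that behaviour, purely illustrative and not an NSX API: the controller programs each host only with the routes whose locale ID matches the host's NSX Manager locale ID.

```python
# Toy model of local egress: route filtering by locale ID. Illustrative only.
universal_routes = [
    {"prefix": "0.0.0.0/0", "next_hop": "172.16.10.1", "locale_id": "site1"},
    {"prefix": "0.0.0.0/0", "next_hop": "172.16.20.1", "locale_id": "site2"},
]

def routes_for_host(host_locale_id):
    """Return only the routes the controller would program on this host."""
    return [r for r in universal_routes if r["locale_id"] == host_locale_id]

print(routes_for_host("site1"))  # site 1 hosts egress via site 1 edges only
print(routes_for_host("site2"))  # site 2 hosts egress via site 2 edges only
```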
Routing Design to Achieve Active-Active:
- Create two transit logical switches, one for site 1 and another for site 2.
- Connect the site 1 transit logical switch as the uplink for the UDLR control VM in site 1 and peer with the edges as if you were in a single site.
- Navigate to the Edges section in the web client, select the secondary NSX Manager, select the UDLR, and deploy a control VM in site 2 manually.
- Connect the site 2 transit logical switch as the uplink for the secondary UDLR control VM deployed in the previous step and peer with the site 2 edges as if you were in a single site.
- Create summary static routes on the site 1 and site 2 edges pointing towards the respective UDLR control VM forwarding address (a scripted example follows this list).
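As an illustration of the last step, the summary static route can also be pushed through the NSX-V REST API rather than the web client. This is a hedged sketch: the endpoint follows the NSX-V API guide, but the edge ID, IPs, hostname, and credentials are made up, and the XML body may need extra elements depending on your NSX version.

```python
# Hedged sketch: add the summary static route on a site 1 ESG, pointing at
# the UDLR forwarding address on the site 1 transit logical switch.
import requests

NSX_MGR = "https://nsxmgr-site1.example.local"  # hypothetical hostname
EDGE_ID = "edge-1"                              # site 1 ESG (illustrative)
UDLR_FORWARDING_IP = "172.16.10.2"              # UDLR address on transit LS
SUMMARY_ROUTE = "10.20.0.0/16"                  # site 1 universal networks

body = f"""<staticRouting>
  <staticRoutes>
    <route>
      <network>{SUMMARY_ROUTE}</network>
      <nextHop>{UDLR_FORWARDING_IP}</nextHop>
    </route>
  </staticRoutes>
</staticRouting>"""

resp = requests.put(
    f"{NSX_MGR}/api/4.0/edges/{EDGE_ID}/routing/config/static",
    data=body,
    headers={"Content-Type": "application/xml"},
    auth=("admin", "changeme"),
    verify=False,  # lab only; use a proper CA in production
)
resp.raise_for_status()
print(f"Summary route {SUMMARY_ROUTE} -> {UDLR_FORWARDING_IP} configured")
```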
Having Multiple Environments?
It is a common case to have DMZ, production, and staging environments that need to be completely logically, or sometimes physically, isolated. If you have such a use case, use a single transport zone and deploy a separate universal DLR for each segment: one each for PROD, DMZ, and staging. The routing design described above stays the same for all of them.
Global Load Balancer
Why is a Global Load Balancer Required?
We cannot advertise the VXLAN routes from both sites, at least not to the outside world or external entities. To solve this we can use global load balancing features like F5 GTM with LTM or NetScaler GSLB; both work perfectly fine. Since the GTM or GSLB has its own separate site IP in each site, it is easy for external users to access the services: the external or public IP is NATted to the actual GTM/GSLB site VIP configured on the F5 or NetScaler.
Knowing why a global load balancer is required, let's move on to how to deploy it and load balance applications when local egress is enabled.
Deployment topology
As said earlier, F5 and other vendors have done a great job of detailing this, so I won't repeat it. But I would recommend placing the F5 or NetScaler as per topology 1 described in this guide. It is very simple: keep the F5 or NetScaler outside your NSX environment on a VLAN, which is the common case anyway, since we use these devices to load balance other VLAN-based infrastructure alongside NSX.
Scenario 1: Cross Site Load Balancing
Imagine you have two applications that need to be published through the F5 or NetScaler. Never load balance across the sites, even for an active-active or active-passive application load, if local egress is enabled on the NSX side. The F5 or NetScaler in site 1 can reach a VM in site 2, but that VM does not know the F5 or NetScaler in site 1 exists, as the path is masked by local egress.
Some people work around this by NATting the site 2 VM IP to something routable and load balancing that, but doing so defeats the purpose of an active-active DC. So don't do cross-site load balancing in an active-active scenario if you have GSLB or GTM licenses.
People do use the NATting approach described above when they don't have GTM or GSLB licenses, and it works; we can't tell someone to buy GTM or GSLB if they don't have them. But it is better to have them.
Scenario 2: Active-Active Websites or Services
This is the actual purpose of an active-active DC: publishing applications or websites from both sites. Load balance the VMs for app1 and app2 within their respective sites, and configure the GTM or GSLB across sites as active-active based on your organization's requirements, such as geo-location-based or client-subnet-based distribution.
Most critical workloads can run in this scenario, but it depends on the nature of the application and its database.
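To make the client-subnet-based idea concrete, here is a toy sketch of topology-style GSLB resolution. Real F5 GTM or NetScaler GSLB policy engines are far richer; the site names, VIPs, and subnets here are illustrative only.

```python
# Toy topology-based GSLB resolution: answer with the site VIP "closest"
# to the client subnet, falling back to any healthy site.
import ipaddress

SITE_VIPS = {"site1": "203.0.113.10", "site2": "198.51.100.10"}

# Client subnet -> preferred site (topology records).
TOPOLOGY = {
    ipaddress.ip_network("10.20.0.0/16"): "site1",
    ipaddress.ip_network("10.30.0.0/16"): "site2",
}

def resolve(client_ip, healthy_sites):
    """Return the VIP the GSLB would answer with for this client."""
    client = ipaddress.ip_address(client_ip)
    # Prefer the topology match while that site's VIP is healthy...
    for subnet, site in TOPOLOGY.items():
        if client in subnet and site in healthy_sites:
            return SITE_VIPS[site]
    # ...otherwise fall back to any healthy site (active-active failover).
    return SITE_VIPS[sorted(healthy_sites)[0]]

print(resolve("10.20.5.9", {"site1", "site2"}))  # topology match -> site 1 VIP
print(resolve("10.20.5.9", {"site2"}))           # site 1 down -> site 2 VIP
```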
Scenario 3: Active-Passive Websites or Services
This is another scenario for GTM/GSLB. Don't get me wrong when I say active-passive: your data center is active-active, but your applications work as active-passive. I have seen many cases where applications or databases have limitations, due to legacy configurations and so on, that prevent them from working active-active. Non-critical workloads can be configured this way if meeting the application or DB requirements for active-active would cost too much.
To work around such legacy issues, use site 1 as the active site for some applications and site 2 as the active site for others, with the opposite site acting as the failover site in case of a site-level disaster.
So, typically, the VMs of an application are load balanced within their own site: for app1, say, site 1 is active and site 2 is passive; for app2, site 2 is active and site 1 is passive (see the sketch after these notes).
- Please note that your data center infrastructure is active-active, but your applications can be configured this way if limitations exist in the apps or DBs.
- The same VMs can be replicated to the other site and brought up with SRM (note that SRM works one way; it requires some design tweaks as well).
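The per-application active/passive mapping can be sketched the same way; again, this is a toy illustration of the GTM/GSLB decision, with made-up names and VIPs.

```python
# Toy per-application active/passive mapping for the GTM/GSLB answer.
APPS = {
    "app1.example.com": {"active": "site1", "passive": "site2"},
    "app2.example.com": {"active": "site2", "passive": "site1"},
}
SITE_VIPS = {"site1": "203.0.113.10", "site2": "198.51.100.10"}

def answer(fqdn, healthy_sites):
    """Answer with the active site's VIP, failing over when it is down."""
    app = APPS[fqdn]
    site = app["active"] if app["active"] in healthy_sites else app["passive"]
    return SITE_VIPS[site]

print(answer("app1.example.com", {"site1", "site2"}))  # site 1 VIP (active)
print(answer("app1.example.com", {"site2"}))           # failover to site 2
```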
Installing and Configuring NSX
Please follow the steps below to implement active-active NSX.
- Install vCenter Servers, one SSO domain and two sites with at least one vCenter each, by following this post.
- Install the NSX components by following the NSX installation series; it is the same on both sites.
- Add the service account used as the NSX enterprise administrator on both NSX Managers.
- Promote the site 1 NSX Manager to primary and add the secondary NSX Manager from the primary.
- Deploy the edges and DLR and configure routing by following parts 4 and 5 of this series.
- Configure the routing as described in the design above.
- Configure the F5 GTM/LTM or NetScaler GSLB following the design described above (a verification sketch follows these steps).
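After the promotion step, it is worth verifying that the universal sync roles ended up as expected. A hedged sketch against the NSX-V REST API (the endpoint is from the NSX-V API guide; the hostnames and credentials are illustrative; expect PRIMARY on site 1 and SECONDARY on site 2):

```python
# Hedged sketch: query each NSX Manager's universal-sync role.
import requests

for mgr in ("nsxmgr-site1.example.local", "nsxmgr-site2.example.local"):
    resp = requests.get(
        f"https://{mgr}/api/2.0/universalsync/configuration/role",
        auth=("admin", "changeme"),
        verify=False,  # lab only; use a proper CA in production
    )
    resp.raise_for_status()
    print(mgr, "->", resp.text)  # role payload: PRIMARY / SECONDARY / ...
```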
Conclusion
To conclude: for utilizing the DC infrastructure across all sites, NSX with a global load balancer is the choice. It may be an additional cost on day 1, but in the long run organizations save a lot of money on their infrastructure, especially during an infrastructure tech refresh. Even with a standalone DR, people buy almost 60% of production capacity again for DR: backup, firewalls, core and L2 switches, tape libraries, and so on.
Introducing NSX, complemented by a global load balancer, helps achieve an active-active data center; beyond that, it saves a lot on the infrastructure investment and removes the long downtimes and the dependency on a single site.
Hope this post is helpful; please share your comments and feedback if you would like to add to it.
Good post.
Suggestion – can you do a similar post with NSX-T instead? That’s the direction VMware are taking their SDN in.
Sure, will plan it in the coming days.
Were you able to review "Active/Active" with "local egress" on NSX-T?
Dear
Active-active is kind of possible with NSX-T 2.4, but it is not like NSX-V. There is no concept of local egress in NSX-T, as it works in an entirely different fashion from NSX-V. I am working on the post; hopefully the feature will mature in the coming NSX-T releases.
Thanks,
Hi Siva,
So without local egress in NSX-T, in a multi-site environment with DC1 and DC2, will all VM traffic, regardless of whether the VM sits in DC1 or DC2, route through a single site (in this case DC1) to exit out to the internet? If that is the case, is NSX-V routing for multi-site DCs better, since its local egress lets traffic from a VM route via the closest DC?
Dear
Yes, at this point NSX-T is not fully capable of delivering local egress. Some major enhancements are coming in 2.5, which is arriving soon; stay tuned.
Thanks
Siva
Nice one
Thank you for the information.
Please, is VMware SRM for active-active or for active-passive?
Best regards.
Dear
SRM is for active-passive.
Thanks,
Thanks Siva for posting this article; I am looking for a similar setup with NSX-T.
Please let me know if we can do an active-active setup for multi-site without a DR solution, where two NSX-T Manager nodes are deployed on one site and one node on the other site, all clustered together.
I am planning to do it soon.
Hi Siva,
Just wanted to find out if you had made any progress on this update utilizing NSX-T? Thanks.
Thanks for this great article. It seems that GTM/GSLB, NSX-T 3.2 federation, and vSAN stretched clusters together would make an active-active design with 0 RPO and 0 RTO. Looking for a similar design combining these three technologies.