VMware NSX – Collapsed DMZ Detailed Design Guide
Most small and medium-sized customers do not have enough physical equipment to give the DMZ its own network and security devices, whether physically or logically separate. With a collapsed DMZ design this can still be achieved: NSX separates the DMZ logically. Most of the time the web workload is hosted in the DMZ while the app and DB tiers stay internal, and for the sake of a few web VMs it is not necessary to host them on dedicated hardware. They can instead be placed on a separate NSX logical switch to which the traditional security measures can be applied, which is discussed in detail in this post. Using NSX features, this can be achieved without separate hardware (switches/firewalls) for the DMZ.
NSX performs routing, switching, and firewalling at the kernel level or at the virtual machine vNIC level. This post addresses a common collapsed DMZ design: hosting production and DMZ workloads on the same underlying hardware while making use of all the SDDC features NSX offers.
It also aims to give a complete view of an SDDC and its requirements, with detailed physical and connectivity designs. Please note that, to keep things simple, this design covers a single site only. It can be used as a low-level design for an SDDC to save you time and effort.
Contents of the Post
Network Virtualization Architecture
This is the high-level network logical design, with one cluster hosting the shared production workload, the NSX components and the DMZ workload. Don't be put off by the diagram; go through all the design diagrams and decisions to get the complete view.
NSX data plane: The data plane handles the workload data only. The data is carried over designated transport networks in the physical network. The NSX logical switch, distributed routing, and distributed firewall are also implemented in the data plane.
NSX control plane: The control plane handles network virtualization control messages. Control messages are used to set up networking attributes on NSX logical switch instances, and to configure and manage distributed routing and distributed firewall components on each ESXi host. Carry control plane communication over secure physical networks (VLANs) that are isolated from the transport networks used for the data plane.
NSX management plane: Network virtualization orchestration occurs in the management plane. In this layer, cloud management platforms such as vRealize Automation can request, consume, and destroy networking resources for virtual workloads. The cloud management platform directs requests to vCenter Server to create and manage virtual machines, and to NSX Manager to consume networking resources.
NSX for vSphere Requirements
Below are the components and their compute requirements.
Server Component | Quantity | Location | vCPU | RAM | Storage |
Platform Services Controllers | 2 | Production-Mgmt Cluster | 4 | 12 GB | 290 GB |
vCenter Server with Update Manager | 1 | Production-Mgmt Cluster | 4 | 16 GB | 290 GB |
NSX Manager | 1 | Production-Mgmt Cluster | 4 | 16 GB | 60 GB |
NSX Controllers | 3 | Production-Mgmt Cluster | 4 | 4 GB | 20 GB |
Edge gateways for production | 4 | Production-Mgmt Cluster | 2 | 2 GB | 512 MB |
Production DLR control VMs (A/S) | 2 | Production-Mgmt Cluster | 1 | 512 MB | 512 MB |
IP Subnet Requirements
The VLANs below, for management and VTEPs, will be created on the physical L3 device in the data center.
- 10.20.10.0/24 – vCenter, NSX and Controllers
- 10.20.20.0/24 – ESXi management
- 10.20.30.0/24 – vMotion
- 10.20.40.0/24 – Production VTEP VLAN
The VXLAN subnets below will be created on NSX, with the NSX DLR acting as gateway.
- 172.16.0.0/16 – Production and DMZ VXLANs
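As a quick sanity check of this addressing plan, here is a minimal Python sketch using only the standard library. The per-tier /24 carve-outs from 172.16.0.0/16 are illustrative assumptions, not part of the plan above.

```python
import ipaddress

# Management/VTEP VLAN subnets terminated on the physical L3 device
vlan_subnets = {
    "vCenter/NSX/Controllers": ipaddress.ip_network("10.20.10.0/24"),
    "ESXi Mgmt":               ipaddress.ip_network("10.20.20.0/24"),
    "vMotion":                 ipaddress.ip_network("10.20.30.0/24"),
    "Production VTEP":         ipaddress.ip_network("10.20.40.0/24"),
}

# VXLAN supernet routed by the DLR
vxlan_supernet = ipaddress.ip_network("172.16.0.0/16")

# Hypothetical per-tier carve-outs (names and sizes are assumptions)
web_tier, app_tier, db_tier = list(vxlan_supernet.subnets(new_prefix=24))[:3]

# Verify the underlay VLAN subnets never overlap the VXLAN overlay space
for name, net in vlan_subnets.items():
    assert not net.overlaps(vxlan_supernet), f"{name} overlaps the VXLAN range"

print("Underlay and overlay ranges are disjoint")
print("Example tier subnets:", web_tier, app_tier, db_tier)
```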
ESXi Host Requirements:
- Hardware is compatible with the targeted vSphere version (check the VMware Compatibility Guide here).
- A minimum of 2 CPUs with 12 or more cores each (8 cores will also work, but 22-core CPUs are now available on the market).
- A minimum of 4 x 10 GbE NICs; if vSAN is also part of the design, a minimum of 6 x 10 GbE NICs (use 25 GbE or 40 GbE links if possible).
- A minimum of 128 GB RAM in each host (these days hosts ship with as much as 2.5 TB of RAM).
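A small sketch of checking a host specification against these minimums; the function and the example host values are my own illustration, not from any VMware tooling.

```python
# Minimums taken from the requirements above
MIN_SOCKETS, MIN_CORES_PER_SOCKET = 2, 12
MIN_RAM_GB = 128

def required_10g_nics(with_vsan: bool) -> int:
    """4 x 10 GbE normally, 6 x 10 GbE when vSAN is part of the design."""
    return 6 if with_vsan else 4

def host_meets_minimums(sockets, cores_per_socket, nics_10g, ram_gb, with_vsan=False):
    return (sockets >= MIN_SOCKETS
            and cores_per_socket >= MIN_CORES_PER_SOCKET
            and nics_10g >= required_10g_nics(with_vsan)
            and ram_gb >= MIN_RAM_GB)

# Example host (hypothetical values)
print(host_meets_minimums(sockets=2, cores_per_socket=16, nics_10g=4, ram_gb=256))   # True
print(host_meets_minimums(sockets=2, cores_per_socket=16, nics_10g=4, ram_gb=256,
                          with_vsan=True))                                           # False: needs 6 NICs
```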
Physical Design
Below is the physical ESXi host design. It is not mandatory to keep all production and DMZ hosts in separate racks; that depends on requirements and network connectivity. There is no physical separation for DMZ workloads: DMZ workloads are placed on the web tier or another logical switch, and micro-segmentation is used to secure the complete environment. A minimum of 7 hosts is required to support the shared management, edge, DMZ and production workloads in a single cluster. Some of the major physical design considerations are below:
- Configure redundant physical switches to enhance availability.
- Configure the ToR switches to provide all necessary VLANs via an 802.1Q trunk.
- NSX ECMP Edge devices establish Layer 3 routing adjacency with the first upstream Layer 3 device to provide equal cost routing for management and workload traffic.
- The upstream Layer 3 devices terminate each VLAN and provide default gateway functionality.
- NSX does not require anything special from the physical network; basic L2/L3 functionality from any hardware vendor will do.
- Configure jumbo frames (9000 MTU) on all switch ports, although 1600 is enough for NSX.
- The management vDS uplinks for both production and DMZ can be connected to the same ToR switches, but use separate VLANs as shown in the requirements. Only the edge uplinks need to be separate for production and DMZ, as they are what determine the packet flow.
vCenter Design & Cluster Design
It is recommended to have one vCenter Single Sign-On domain with two PSCs load balanced by NSX or an external load balancer; the vCenter Server then uses the load-balanced VIP of the PSCs. vCenter design considerations:
- For this design a single vCenter Server license is enough, but it is recommended to have separate vCenter Servers for the management and NSX workload clusters if you run separate clusters.
- One Single Sign-On domain with two PSCs load balanced by the NSX load balancer or an external load balancer. The NSX load balancer configuration guide is here.
- A one-to-one mapping between NSX Manager instances and vCenter Server instances exists.
If you are looking for vCenter design and implementation steps, please click here for that post. A single cluster hosts management, edge and compute, the DMZ workload and the DMZ edges.
- The collapsed cluster hosts vCenter Server, vSphere Update Manager, NSX Manager and the NSX Controllers.
- This cluster also runs the required NSX services to enable North-South routing between the SDDC tenant virtual machines and the external network, and east-west routing inside the SDDC.
- This cluster also hosts the compute workload for the SDDC tenant workloads.
- This cluster also hosts the DMZ workload.
VXLAN VTEP Design
The VXLAN network is used for Layer 2 logical switching across hosts, spanning multiple underlying Layer 3 domains. You configure VXLAN on a per-cluster basis, where you map each cluster that is to participate in NSX to a vSphere Distributed Switch (vDS). When you map a cluster to a distributed switch, each host in that cluster is enabled for logical switches. The settings chosen here will be used in creating the VMkernel interfaces.
If you need logical routing and switching, all clusters that have the NSX VIBs installed on their hosts should also have VXLAN transport parameters configured. If you plan to deploy the distributed firewall only, you do not need to configure VXLAN transport parameters.
When you configure VXLAN networking, you must provide a vSphere Distributed Switch, a VLAN ID, an MTU size, an IP addressing mechanism (DHCP or IP pool), and a NIC teaming policy. The MTU for each switch must be set to 1550 or higher; by default it is set to 1600. If the vSphere Distributed Switch MTU is larger than the VXLAN MTU, it will not be adjusted down; if it is set to a lower value, it will be adjusted up to match the VXLAN MTU. Design Decisions for VTEPs:
- Configure jumbo frames (9000 MTU) on the network switches and on the VXLAN network as well.
- Use a minimum of two VTEPs per server to balance the VTEP load; some VMs' traffic will leave through one VTEP and other VMs' traffic through the other.
- Separate VLANs will be used for the production VTEP IP pool and the DMZ VTEP IP pool.
- The unicast replication model is sufficient for small and medium deployments; for large-scale deployments with multiple PODs, hybrid mode is recommended.
- No IGMP or other multicast configuration is needed in the physical network for the unicast replication model.
- Select the load-balancing mechanism 'Load based on Source ID' (SRCID), which will create two or more VTEPs based on the number of physical uplinks on the vDS.
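To make the MTU behaviour described in the VTEP section above concrete, here is a tiny sketch of the adjustment rule (pure illustration, not NSX code).

```python
VXLAN_MIN_MTU = 1550      # minimum required to carry the VXLAN encapsulation overhead
VXLAN_DEFAULT_MTU = 1600  # default proposed when configuring VXLAN

def effective_vds_mtu(current_vds_mtu: int, vxlan_mtu: int = VXLAN_DEFAULT_MTU) -> int:
    """The vDS MTU is raised to the VXLAN MTU if it is lower, but never lowered."""
    if vxlan_mtu < VXLAN_MIN_MTU:
        raise ValueError("VXLAN MTU must be 1550 or higher")
    return max(current_vds_mtu, vxlan_mtu)

print(effective_vds_mtu(1500))   # 1600 -> vDS raised to match the VXLAN MTU
print(effective_vds_mtu(9000))   # 9000 -> a jumbo-frame vDS is left untouched
```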
Production Cluster VTEP Design
As shown above, each host will have two VTEPs configured. These are created automatically based on the teaming policy selected while configuring the VTEPs.
Transport Zone Design
A transport zone is used to define the scope of a VXLAN overlay network and can span one or more clusters within one vCenter Server domain. One or more transport zones can be configured in an NSX for vSphere solution. A transport zone is not meant to delineate a security boundary. A single local transport zone will be used for both the production and the DMZ workloads. This also helps if you are planning for DR or a secondary site: only one universal transport zone is supported, so when you move to a secondary site you can have one universal transport zone and two universal DLRs, one for production and one for the DMZ.
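For reference, a hedged sketch of creating such a transport zone through the NSX Manager REST API with Python's requests library. The endpoint, the XML payload shape, the cluster MOID and the credentials are assumptions based on the NSX for vSphere API guide; verify them against your NSX version before use.

```python
import requests

NSX_MANAGER = "https://nsxmgr.lab.local"   # hypothetical NSX Manager address
AUTH = ("admin", "changeme")               # hypothetical credentials

# Assumed NSX-v endpoint for transport zones ("network scopes"); check your API guide.
url = f"{NSX_MANAGER}/api/2.0/vdn/scopes"

# Single local transport zone, unicast replication, spanning the collapsed cluster.
payload = """
<vdnScope>
  <name>TZ-Collapsed-DMZ</name>
  <clusters>
    <cluster><cluster><objectId>domain-c10</objectId></cluster></cluster>
  </clusters>
  <controlPlaneMode>UNICAST_MODE</controlPlaneMode>
</vdnScope>
"""

resp = requests.post(url, data=payload, auth=AUTH,
                     headers={"Content-Type": "application/xml"}, verify=False)
resp.raise_for_status()
print("Transport zone created, scope ID:", resp.text)
```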
Logical Switch Design
NSX logical switches create logically abstracted segments to which tenant virtual machines can connect. A single logical switch is mapped to a unique VXLAN segment ID and is distributed across the ESXi hypervisors within a transport zone. This logical switch configuration provides support for line-rate switching in the hypervisor without creating constraints of VLAN sprawl or spanning tree issues.
Logical Switch Names | DLR | Transport Zone |
Web, App and DB tier logical switches | Production DLR | Local Transport Zone |
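In the same spirit, a hedged sketch of creating the tier logical switches in that transport zone. The virtualwires endpoint, payload, scope ID format and switch names are assumptions to be verified against your NSX version.

```python
import requests

NSX_MANAGER = "https://nsxmgr.lab.local"   # hypothetical
AUTH = ("admin", "changeme")               # hypothetical
SCOPE_ID = "vdnscope-1"                    # transport zone ID returned earlier (assumed format)

# Assumed NSX-v endpoint for logical switches inside a transport zone.
url = f"{NSX_MANAGER}/api/2.0/vdn/scopes/{SCOPE_ID}/virtualwires"

for name in ("LS-Prod-Web", "LS-Prod-App", "LS-Prod-DB", "LS-DMZ-Web"):
    payload = f"""
    <virtualWireCreateSpec>
      <name>{name}</name>
      <tenantId>collapsed-dmz</tenantId>
      <controlPlaneMode>UNICAST_MODE</controlPlaneMode>
    </virtualWireCreateSpec>
    """
    resp = requests.post(url, data=payload, auth=AUTH,
                         headers={"Content-Type": "application/xml"}, verify=False)
    resp.raise_for_status()
    print(f"Created {name}: {resp.text}")
```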
Distributed Switch Design
vSphere Distributed Switch supports several NIC teaming options. Load-based NIC teaming supports optimal use of available bandwidth and redundancy in case of a link failure. Use two 10-GbE connections for each server in combination with a pair of top of rack switches. 802.1Q network trunks can support a small number of VLANs. For example, management, storage, VXLAN, vSphere Replication, and vSphere vMotion traffic. Configure the MTU size to at least 9000 bytes (jumbo frames) on the physical switch ports and distributed switch port groups that support the following traffic types.
- vSAN
- vMotion
- VXLAN
- vSphere Replication
- NFS
Two types of QoS configuration are supported in the physical switching infrastructure.
- Layer 2 QoS, also called class of service (CoS) marking.
- Layer 3 QoS, also called Differentiated Services Code Point (DSCP) marking.
A vSphere Distributed Switch supports both CoS and DSCP marking. Users can mark the traffic based on the traffic type or packet classification. When the virtual machines are connected to the VXLAN-based logical switches or networks, the QoS values from the internal packet headers are copied to the VXLAN-encapsulated header. This enables the external physical network to prioritize the traffic based on the tags on the external header.
Physical Production vDS Design
The production cluster will have three vDS (plus a dedicated vSAN vDS when vSAN is part of the design). Detailed port group information is given below.
- vDS-MGMT-PROD: hosts management VLAN traffic, VTEP traffic and vMotion traffic.
- vDS-PROD-EDGE: used for the edge uplinks carrying production north-south traffic.
- vDS-DMZ-EDGE: used for the DMZ edge uplinks carrying north-south traffic (if you do not have spare 10 GbE NICs you can use 1 GbE for the edge port groups, but there will be a performance impact).
Port Group Design Decisions: vDS-MGMT-PROD
Port Group Name | LB Policy | Uplinks | MTU |
ESXi Mgmt | Route based on physical NIC load | vmnic0, vmnic1 | 1500 (default) |
Management | Route based on physical NIC load | vmnic0, vmnic1 | 1500 (default) |
vMotion | Route based on physical NIC load | vmnic0, vmnic1 | 9000 |
VTEP | Route based on SRC-ID | vmnic0, vmnic1 | 9000 |
vDS-vSAN
Port Group Name | LB Policy | Uplinks | MTU |
vSAN | Route based on physical NIC load | vmnic2, vmnic3 | 9000 |
vDS-PROD-EDGE
Port Group Name | LB Policy | Uplinks | MTU |
ESG-Uplink-1-vlan-xx | Route based on originating virtual port | vmnic2 | 1500 (default) |
ESG-Uplink-2-vlan-yy | Route based on originating virtual port | vmnic3 | 1500 (default) |
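The port-group tables above can be captured as data and checked against the design decisions (jumbo MTU where needed, SRC-ID teaming on the VTEP port group). The rows mirror the tables; the checking code is my own illustration.

```python
# (name, switch, teaming_policy, uplinks, mtu) rows mirroring the tables above
port_groups = [
    ("ESXi Mgmt",  "vDS-MGMT-PROD", "physical-nic-load", ["vmnic0", "vmnic1"], 1500),
    ("Management", "vDS-MGMT-PROD", "physical-nic-load", ["vmnic0", "vmnic1"], 1500),
    ("vMotion",    "vDS-MGMT-PROD", "physical-nic-load", ["vmnic0", "vmnic1"], 9000),
    ("VTEP",       "vDS-MGMT-PROD", "src-id",            ["vmnic0", "vmnic1"], 9000),
    ("vSAN",       "vDS-vSAN",      "physical-nic-load", ["vmnic2", "vmnic3"], 9000),
    ("ESG-Uplink-1-vlan-xx", "vDS-PROD-EDGE", "originating-port", ["vmnic2"], 1500),
    ("ESG-Uplink-2-vlan-yy", "vDS-PROD-EDGE", "originating-port", ["vmnic3"], 1500),
]

JUMBO_REQUIRED = {"VTEP", "vMotion", "vSAN"}

for name, switch, policy, uplinks, mtu in port_groups:
    if name in JUMBO_REQUIRED:
        assert mtu >= 9000, f"{name} on {switch} needs jumbo frames"
    if name == "VTEP":
        # SRC-ID teaming is what lets NSX create one VTEP per physical uplink
        assert policy == "src-id" and len(uplinks) >= 2, "VTEP teaming misconfigured"

print("Port-group design is consistent with the decisions above")
```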
Control Plane and Routing Design
The control plane decouples NSX for vSphere from the physical network and handles the broadcast, unknown unicast, and multicast (BUM) traffic within the logical switches. The control plane is on top of the transport zone and is inherited by all logical switches that are created within it. Distributed Logical Router: The distributed logical router (DLR) in NSX for vSphere performs routing operations in the virtualized space (between VMs, on VXLAN-backed port groups).
- DLRs are limited to 1,000 logical interfaces. If that limit is reached, you must deploy a new DLR.
Designated Instance: The designated instance is responsible for resolving ARP on a VLAN LIF. There is one designated instance per VLAN LIF. The selection of an ESXi host as a designated instance is performed automatically by the NSX Controller cluster, and that information is pushed to all other ESXi hosts. Any ARP requests sent by the distributed logical router on the same subnet are handled by the same ESXi host. In case of an ESXi host failure, the controller selects a new ESXi host as the designated instance and makes that information available to the other ESXi hosts.
User World Agent: The User World Agent (UWA) is a TCP and SSL client that enables communication between the ESXi hosts and NSX Controller nodes, and the retrieval of information from NSX Manager through interaction with the message bus agent.
Edge Services Gateway: While the DLR provides VM-to-VM (east-west) routing, the NSX Edge services gateway provides north-south connectivity by peering with the upstream top of rack switches, thereby enabling tenants to access public networks. Some important design considerations for the edge and DLR:
- ESGs that provide ECMP services require the firewall to be disabled.
- Deploy a minimum of two NSX Edge services gateways (ESGs) in an ECMP configuration for north-south routing.
- Create one or more static routes on ECMP enabled edges for subnets behind the UDLR and DLR with a higher admin cost than the dynamically learned routes.
- Hint: If any new subnets are added behind the UDLR or DLR the routes must be updated on the ECMP edges.
- Graceful restart maintains the forwarding table, which keeps forwarding packets to a down neighbor even after the BGP/OSPF timers have expired, causing loss of traffic.
- FIX: Disable graceful restart on all ECMP edges.
- Note: Graceful restart should be enabled on the DLR control VM, as it helps maintain the data path even when the control VM is down. Note that the DLR control VM is not in the data path, whereas the edge does sit in the data path.
- If the active Logical Router control virtual machine and an ECMP edge reside on the same host and that host fails, a dead path in the routing table appears until the standby Logical Router control virtual machine starts its routing process and updates the routing tables.
- FIX: To avoid this situation, create anti-affinity rules and make sure you have enough hosts to tolerate failures for the active/standby control VMs.
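As an illustration of the anti-affinity fix, here is a hedged pyVmomi sketch that keeps the active and standby DLR control VMs on different hosts. The vCenter address, credentials, cluster name and VM names are assumptions, and the spec classes should be double-checked against your pyVmomi version.

```python
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Hypothetical vCenter connection details
ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
content = si.RetrieveContent()

def find_by_name(vimtype, name):
    """Return the first inventory object of the given type with a matching name."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    try:
        return next(obj for obj in view.view if obj.name == name)
    finally:
        view.Destroy()

cluster = find_by_name(vim.ClusterComputeResource, "Production-Mgmt")    # assumed cluster name
ctrl_vms = [find_by_name(vim.VirtualMachine, n)                          # assumed VM names
            for n in ("DLR-Control-VM-0", "DLR-Control-VM-1")]

# DRS anti-affinity rule: never place both control VMs on the same host
rule = vim.cluster.AntiAffinityRuleSpec(name="separate-dlr-control-vms",
                                        enabled=True, mandatory=True, vm=ctrl_vms)
spec = vim.cluster.ConfigSpecEx(rulesSpec=[vim.cluster.RuleSpec(operation="add", info=rule)])
task = cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
print("Anti-affinity rule task submitted:", task.info.key)

Disconnect(si)
```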
Collapsed DMZ Routing Design
Below are the production routing design details.
- The DLR will act as the gateway for the production web, app and DB tier VXLANs.
- The DLR will peer with the edge gateways over OSPF, normal area ID 10.
- On the DLR, the .2 address will be used as the packet forwarding address and the .3 address as the protocol address for route peering with the edges.
- All four edges will be configured with ECMP so that they all pass traffic to the upstream router and the downstream DLR.
- Two SVIs will be configured on the ToR / nearest L3 devices; in my case both switches are active, with vPC and HSRP configured across them.
- Each edge gateway will have two uplinks, one towards each SVI, each on its own VLAN.
- A static route will be created on the edges for the subnets hosted behind the DLR, with a higher admin distance. This protects against issues with the control VM, as illustrated in the sketch below.
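To illustrate why the static routes carry a higher admin distance, a tiny sketch of route preference follows (generic routing behaviour, not NSX code; the distance values are typical defaults and chosen by me for illustration).

```python
# Candidate routes for a subnet behind the DLR, as seen on an ECMP edge.
# Lower administrative distance wins; the static entry only takes over
# when the OSPF-learned route disappears (e.g. during a control VM failure).
routes = [
    {"prefix": "172.16.10.0/24", "source": "ospf",   "admin_distance": 110, "next_hop": "DLR forwarding address"},
    {"prefix": "172.16.10.0/24", "source": "static", "admin_distance": 250, "next_hop": "DLR forwarding address"},
]

def best_route(candidates):
    return min(candidates, key=lambda r: r["admin_distance"]) if candidates else None

print(best_route(routes)["source"])                                         # ospf while peering is up
print(best_route([r for r in routes if r["source"] != "ospf"])["source"])   # static as the fallback
```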
Edge Uplink Design
- Each edge will have two uplinks, one from each port group.
- Each uplink port group will have only one physical uplink configured, with no passive uplinks.
- Each uplink port group will be tagged with a separate VLAN.
Micro Segmentation Design
The NSX Distributed Firewall is used to protect all management applications attached to application virtual networks. To secure the SDDC, only other solutions in the SDDC and approved administration IPs can communicate directly with individual components. NSX micro-segmentation helps manage all the firewall policies from a single pane of glass.
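A minimal sketch of what the micro-segmentation policy for this collapsed DMZ could look like, expressed as data with DFW-style top-down evaluation. The security-group names, ports and rules are illustrative assumptions, not the actual firewall configuration.

```python
# Hypothetical three-tier rule set for the collapsed DMZ design: allow-list
# rules first, default deny last. Adapt groups and ports to your environment.
dfw_rules = [
    {"name": "Inbound-to-DMZ-Web", "src": "any",          "dst": "SG-DMZ-Web",  "service": "TCP/443",  "action": "allow"},
    {"name": "Web-to-App",         "src": "SG-DMZ-Web",   "dst": "SG-Prod-App", "service": "TCP/8443", "action": "allow"},
    {"name": "App-to-DB",          "src": "SG-Prod-App",  "dst": "SG-Prod-DB",  "service": "TCP/3306", "action": "allow"},
    {"name": "Admin-access",       "src": "SG-Admin-IPs", "dst": "any",         "service": "TCP/22",   "action": "allow"},
    {"name": "Default-deny",       "src": "any",          "dst": "any",         "service": "any",      "action": "block"},
]

def evaluate(src, dst, service):
    """First matching rule wins, mirroring the DFW's top-down evaluation."""
    for rule in dfw_rules:
        if (rule["src"] in (src, "any") and rule["dst"] in (dst, "any")
                and rule["service"] in (service, "any")):
            return rule["name"], rule["action"]
    return "implicit", "block"

print(evaluate("SG-DMZ-Web", "SG-Prod-App", "TCP/8443"))   # ('Web-to-App', 'allow')
print(evaluate("SG-DMZ-Web", "SG-Prod-DB", "TCP/3306"))    # ('Default-deny', 'block')
```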
Deployment Flow and Implementation Guides
The NSX deployment flow is given below. If you are looking for a detailed VMware NSX installation and configuration guide, please follow this post of mine.
Backup and Recovery
Please refer to the VMware articles and docs below for backup and restore procedures; the links redirect to VMware websites. vCenter Server backup and restore:
- VMware KB article: Overview of Backup and Restore options in vCenter Server 6.x
- VMware KB article: Back up and restore vCenter Server Appliance/vCenter Server 6.0 vPostgres database
- VMware Docs: File-Based Backup and Restore of vCenter Server Appliance
NSX Manager backup and restore procedures:
- VMware Docs: Backing Up and Restoring the NSX Manager
- VMware Docs: NSX Backup and Restore
- VMware Docs: Restore an NSX Manager Backup
Hope this design post is useful. Please leave your comments and feedback, which will encourage me to do more 😉.