Chapter 11: Data Center Network Design

Learning Objectives

Pre-Study Assessment

1. Why has the spine-leaf architecture replaced the traditional three-tier data center design?

It reduces the total number of switches needed
It provides predictable single-hop latency and ECMP paths optimized for east-west traffic
It eliminates the need for any routing protocols
It supports only north-south traffic patterns more efficiently

2. What is the primary advantage of EVPN over flood-and-learn in a VXLAN fabric?

EVPN reduces the VXLAN encapsulation overhead from 50 bytes to 20 bytes
EVPN replaces BGP with a simpler OSPF-based control plane
EVPN distributes MAC/IP reachability via MP-BGP, eliminating inefficient flooding
EVPN allows VXLAN to operate without any underlay network

3. Why is symmetric IRB preferred over asymmetric IRB in large VXLAN EVPN fabrics?

Symmetric IRB is faster because it skips the routing step entirely
Symmetric IRB uses a transit L3 VNI so each leaf only needs locally attached VLANs, improving scalability
Symmetric IRB eliminates the need for VRFs in the fabric
Symmetric IRB requires fewer spine switches in the topology

4. When should you choose ACI Multi-Site over ACI Multi-Pod?

When you need seamless L2 extension within a metro area
When a single APIC cluster is sufficient for management
When strict fault domain isolation and geographic distance require independent APIC clusters per site
When the deployment has fewer than 200 leaf switches total

5. What is the role of DWDM in a data center interconnect design?

It replaces VXLAN as the overlay encapsulation protocol
It provides Layer 1 optical transport, multiplexing multiple wavelengths on a single fiber pair
It provides the control plane for MAC learning between sites
It is an alternative to spine-leaf for intra-DC connectivity

6. What is the biggest risk of extending Layer 2 between data centers without proper mitigation?

Increased north-south bandwidth consumption
A failure in one site (broadcast storm, STP miscalculation) can propagate and take down all connected sites
Loss of VXLAN encapsulation capability
Inability to use OSPF as the underlay routing protocol

7. In an active-active data center design, what prevents traffic from hairpinning across the DCI link when a VM migrates?

Static routes pointing to the nearest data center
Distributed anycast gateways with the same virtual IP and MAC at both sites
Disabling all L2 extension between sites
Using OTV instead of VXLAN for encapsulation

8. Why must FCoE traffic receive special handling in a VXLAN EVPN fabric?

FCoE uses a different UDP port than VXLAN
Fibre Channel demands lossless transport, requiring PFC and DCB with CoS-to-DSCP mapping across the routed fabric
FCoE is incompatible with spine-leaf topologies
FCoE requires dedicated spine switches separate from data traffic

9. Why is the one-arm routed model preferred for load balancer placement in EVPN fabrics?

It allows the load balancer to inspect all Layer 2 headers
It aligns with the L3-everywhere philosophy and enables optimal ECMP paths without L2 dependencies
It requires no IP address configuration on the load balancer
It eliminates the need for GSLB in multi-site deployments

10. What is the maximum RTT latency supported by ACI Multi-Pod between pods?

10 ms
50 ms
150 ms
500 ms

11. What is a key advantage of OTV over raw VLAN trunking for DCI?

OTV provides higher bandwidth than VLAN trunks
OTV natively isolates broadcast/flooding and prevents L2 loops between sites
OTV is a multi-vendor standard supported by all switch platforms
OTV eliminates the need for any IP connectivity between sites

12. Which EVPN route type is considered the workhorse for advertising host MAC and IP between VTEPs?

Type 1 (Ethernet Auto-Discovery)
Type 2 (MAC/IP Advertisement)
Type 3 (Inclusive Multicast Tag)
Type 5 (IP Prefix)

13. Why should all leaf-to-spine links in a Clos fabric use the same link speed?

Mixed speeds require additional spine switches
Equal link speeds enable proper ECMP load balancing; mismatched speeds cause uneven traffic distribution
BGP cannot advertise routes over links with different speeds
Mixed speeds prevent VXLAN encapsulation from functioning

14. What is the recommended underlay MTU for a VXLAN fabric, and why?

1500 bytes, the standard Ethernet MTU
9198 bytes (jumbo), to accommodate the 50-byte VXLAN overhead without fragmentation
4096 bytes, matching the maximum VLAN count
16000 bytes, to support the maximum VNI addressing space

15. In a combined ACI deployment, what is the common pattern for using Multi-Pod and Multi-Site together?

Multi-Site within a campus, Multi-Pod across regions
Multi-Pod within a metro area for unified management, Multi-Site across regions for fault isolation
Multi-Pod for storage traffic only, Multi-Site for compute traffic
They cannot be combined in the same deployment

11.1 Data Center Fabric Architecture

11.1.1 Spine-Leaf Topology Design and Scaling

The modern data center has shifted from the traditional three-tier design (access, aggregation, core) to the spine-leaf fabric architecture. This shift is driven by the explosion of east-west traffic from virtualization, microservices, and distributed storage. The spine-leaf design, rooted in Clos network theory from 1953, provides many parallel paths between any two endpoints.

The design consists of two layers: spine switches (the high-speed backbone) and leaf switches (host connectivity at the edge). The fundamental rules are strict: leaf switches connect only to spines, and spines connect only to leaves. Any traffic between hosts on different leaves traverses exactly one spine hop, producing consistent, predictable latency (traffic between hosts on the same leaf never leaves that leaf).

graph TD
    S1["Spine 1"]
    S2["Spine 2"]
    S3["Spine 3"]
    L1["Leaf 1"]
    L2["Leaf 2"]
    L3["Leaf 3"]
    L4["Leaf 4"]
    H1["Servers"]
    H2["Servers"]
    H3["Servers"]
    H4["Servers"]
    S1 --- L1
    S1 --- L2
    S1 --- L3
    S1 --- L4
    S2 --- L1
    S2 --- L2
    S2 --- L3
    S2 --- L4
    S3 --- L1
    S3 --- L2
    S3 --- L3
    S3 --- L4
    L1 --- H1
    L2 --- H2
    L3 --- H3
    L4 --- H4

Figure 11.1: Spine-Leaf Fabric Topology -- every leaf connects to every spine, providing ECMP paths and predictable single-hop latency

Animation: Traffic flow through a spine-leaf fabric showing ECMP path selection and linear scaling as new leaves/spines are added

Scaling: Add leaf switches for more server ports (each new leaf connects to every spine). Add spine switches for more fabric bandwidth (every leaf-to-leaf pair gains an additional ECMP path). All leaf-to-spine links must use the same speed for proper ECMP load balancing.
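The scaling rules above lend themselves to simple sizing arithmetic. The sketch below is a hypothetical helper, not a vendor tool: port counts and link speeds are illustrative assumptions, and it assumes each leaf runs exactly one uplink to every spine.

```python
# Hypothetical sizing helper for a two-tier Clos (spine-leaf) fabric.
# All port counts and speeds are illustrative assumptions.

def clos_capacity(leaves: int, spines: int, leaf_ports: int = 48,
                  uplink_speed_gbps: int = 100,
                  host_speed_gbps: int = 25) -> dict:
    """Return host-port count, ECMP fan-out, and oversubscription ratio."""
    host_ports_per_leaf = leaf_ports           # host-facing ports per leaf
    uplinks_per_leaf = spines                  # one uplink to every spine
    downlink_bw = host_ports_per_leaf * host_speed_gbps
    uplink_bw = uplinks_per_leaf * uplink_speed_gbps
    return {
        "host_ports": leaves * host_ports_per_leaf,
        "ecmp_paths_per_leaf_pair": spines,    # grows as spines are added
        "oversubscription": round(downlink_bw / uplink_bw, 2),
    }

print(clos_capacity(leaves=8, spines=4))
# 8 leaves x 48 ports = 384 host ports; 48x25G down vs 4x100G up = 3:1
```

Adding a spine to this example drops the oversubscription ratio from 3:1 toward 2.4:1 without touching any leaf, which is exactly the horizontal-scaling property the text describes.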

Underlay design: OSPF (single area, point-to-point interfaces) or eBGP (unique AS per device for fault isolation). Best practices include unnumbered interfaces and jumbo MTU (9198 bytes) to accommodate the 50-byte VXLAN overhead.
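The 50-byte figure and the jumbo-MTU recommendation follow directly from the header stack VXLAN adds. A quick back-of-the-envelope check (assuming an untagged IPv4 outer header, as is typical):

```python
# The 50-byte VXLAN overhead, header by header (IPv4 outer, no 802.1Q tag).
OUTER_ETH = 14   # outer Ethernet header
OUTER_IP = 20    # outer IPv4 header
OUTER_UDP = 8    # outer UDP header (destination port 4789)
VXLAN_HDR = 8    # VXLAN header carrying the 24-bit VNI

overhead = OUTER_ETH + OUTER_IP + OUTER_UDP + VXLAN_HDR
print(overhead)  # 50 bytes

tenant_mtu = 9000                        # jumbo MTU offered to hosts
required_underlay_mtu = tenant_mtu + overhead
print(required_underlay_mtu)             # 9050 -- fits under 9198
```

An underlay MTU of 9198 therefore leaves headroom even for a 9000-byte tenant frame plus encapsulation, which is why fragmentation never occurs in a correctly configured fabric.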

Key Points: Spine-Leaf Topology

11.1.2 VXLAN EVPN Fabric Design

VXLAN encapsulates Layer 2 Ethernet frames inside Layer 3 UDP packets (port 4789), allowing L2 segments to stretch across the routed underlay. Each virtual network is identified by a 24-bit VNI, supporting approximately 16 million segments (vs. 4,096 VLANs in 802.1Q). VTEPs on leaf switches handle encapsulation/decapsulation, adding approximately 50 bytes of overhead.
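The 24-bit VNI sits inside an 8-byte VXLAN header (RFC 7348: an 8-bit flags field with the I-bit set, reserved bits, the VNI, and a final reserved byte). A minimal sketch of packing that header, illustrating both the 8-byte size and the 16M-segment limit:

```python
import struct

# Minimal sketch of the 8-byte VXLAN header per RFC 7348:
# flags (I-bit set), 24 reserved bits, 24-bit VNI, 8 reserved bits.
def vxlan_header(vni: int) -> bytes:
    if not 0 <= vni < 2**24:
        raise ValueError("VNI is a 24-bit field")
    flags = 0x08                       # I flag: the VNI field is valid
    # Pack flags, 3 reserved bytes, then VNI shifted into the top 24 bits
    return struct.pack("!B3xI", flags, vni << 8)

hdr = vxlan_header(10001)
print(len(hdr))    # 8 -- the VXLAN portion of the ~50-byte overhead
print(2**24)       # 16777216 possible VNIs vs 4096 802.1Q VLANs
```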

graph TD
    subgraph "EVPN Control Plane"
        BGP["MP-BGP EVPN Route Reflector"]
    end
    subgraph "Spine Layer"
        SP1["Spine 1 - IP Underlay"]
        SP2["Spine 2 - IP Underlay"]
    end
    subgraph "Leaf / VTEP Layer"
        V1["Leaf 1 / VTEP 1 - VNI 10001, 10002"]
        V2["Leaf 2 / VTEP 2 - VNI 10001"]
        V3["Leaf 3 / VTEP 3 - VNI 10002"]
    end
    subgraph "Hosts"
        SRV1["Server A - VNI 10001"]
        SRV2["Server B - VNI 10001"]
        SRV3["Server C - VNI 10002"]
    end
    BGP -. "MAC/IP routes" .-> V1
    BGP -. "MAC/IP routes" .-> V2
    BGP -. "MAC/IP routes" .-> V3
    SP1 --- V1
    SP1 --- V2
    SP1 --- V3
    SP2 --- V1
    SP2 --- V2
    SP2 --- V3
    V1 --- SRV1
    V2 --- SRV2
    V3 --- SRV3

Figure 11.2: VXLAN EVPN Fabric -- MP-BGP distributes MAC/IP reachability between VTEPs across the routed spine-leaf underlay

Animation: EVPN Type 2 route advertisement flow -- host connects to leaf, MAC/IP advertised via MP-BGP to all VTEPs, eliminating flood-and-learn

EVPN Route Types: Type 1 (Ethernet Auto-Discovery) for multi-homing and fast convergence. Type 2 (MAC/IP Advertisement) -- the workhorse, carrying MAC, IP, and VNI info. Type 3 (Inclusive Multicast Tag) for BUM flooding trees. Type 5 (IP Prefix) for external routes into the fabric.

Asymmetric vs. Symmetric IRB: Asymmetric IRB requires every VLAN/VNI to be configured on every leaf (poor scalability). Symmetric IRB uses a transit L3 VNI per tenant VRF, so each leaf only needs its locally attached VLANs. Symmetric IRB is the recommended model and the only one that supports Type 5 routes.
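The scalability difference is easiest to see as arithmetic. The numbers below are hypothetical, chosen only to show the scaling shape: asymmetric state grows with the whole fabric, symmetric state grows with one leaf's local footprint.

```python
# Illustrative per-leaf VNI state under asymmetric vs symmetric IRB.
# All counts are made-up assumptions to show the scaling behavior.
total_l2_vnis = 2000      # L2 segments fabric-wide
local_l2_vnis = 40        # segments actually attached to this one leaf
tenant_vrfs = 10          # tenant VRFs present on this leaf

# Asymmetric: every VLAN/VNI in the fabric must exist on every leaf.
asymmetric_state = total_l2_vnis

# Symmetric: only local VNIs, plus one transit L3 VNI per tenant VRF.
symmetric_state = local_l2_vnis + tenant_vrfs

print(asymmetric_state, symmetric_state)  # 2000 vs 50
```

Doubling the fabric's segment count doubles the asymmetric figure but leaves the symmetric figure untouched, which is why symmetric IRB is the recommended model for large fabrics.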

Topology Models: Bridged Overlay (entry-level, no inter-VLAN routing in the fabric). Centrally Routed Bridging (CRB) -- routing on the spines, suited to cost-sensitive designs. Edge Routed Bridging (ERB) -- recommended for most deployments; distributes routing and bridging to the leaf switches.

Key Points: VXLAN EVPN

11.1.3 ACI Architecture and Multi-Pod/Multi-Site

Cisco ACI uses a declarative policy model managed by the APIC. Administrators define application profiles and endpoint groups (EPGs); the fabric provisions connectivity automatically. ACI runs IS-IS in its underlay with a VXLAN overlay, but endpoint learning and policy distribution use a proprietary control plane rather than standard EVPN.

Multi-Pod: 2-12 pods under a single APIC cluster via an IP-routed IPN. 50 ms RTT maximum latency. Native L2 extension. Single management domain (config errors propagate to all pods).

Multi-Site: Independent APIC clusters per site, connected via Nexus Dashboard Orchestrator. Strict fault domain isolation. Relaxed latency requirements (intercontinental). L3 interconnection preferred.

Animation: Multi-Pod vs. Multi-Site comparison showing shared APIC domain (Multi-Pod) versus independent APIC clusters with orchestrator (Multi-Site)

Key Points: ACI Multi-Pod / Multi-Site

11.2 Data Center Interconnect

11.2.1 DCI Options: Dark Fiber, DWDM, OTV, VXLAN

DCI technologies evolved through three generations:

Generation 1 (Pre-2008): Raw VLAN extension via L2 trunks, QinQ, or EoMPLS. Suffered from STP dependencies, single points of failure, and the full risk of a stretched L2 domain.

Generation 2 (2008+) -- OTV: MAC-in-IP encapsulation (42-byte header) with built-in IS-IS control plane. Native flood isolation, loop prevention, and multi-homing. Cisco proprietary, max 12 sites.

Generation 3 (2014+) -- VXLAN EVPN DCI: Extends the full fabric paradigm between sites. Standards-based (RFC 7348), control-plane MAC learning via MP-BGP, 16M VNI segments. Always pair VXLAN DCI with EVPN -- never deploy flood-and-learn across a WAN.

flowchart LR
    subgraph DC1["Data Center 1"]
        L1["Leaf / VTEP Border"]
        S1["Spine"]
        L1a["Leaf Compute"]
    end
    subgraph Transport["DCI Transport"]
        DWDM1["DWDM Mux"]
        DF["Dark Fiber"]
        DWDM2["DWDM Mux"]
    end
    subgraph DC2["Data Center 2"]
        L2["Leaf / VTEP Border"]
        S2["Spine"]
        L2a["Leaf Compute"]
    end
    L1a --- S1 --- L1
    L1 --- DWDM1 --- DF --- DWDM2
    DWDM2 --- L2
    L2 --- S2 --- L2a

Figure 11.4: DCI Architecture -- VXLAN EVPN overlay rides DWDM wavelengths over dark fiber

DWDM operates at Layer 1, multiplexing multiple optical wavelengths onto a single fiber pair. It is the transport underlay upon which overlay technologies ride -- not an alternative to OTV or VXLAN.

Key Points: DCI Technologies

11.2.2 Layer 2 Extension Risks and Mitigation

L2 extension between data centers carries significant risks: failure domain expansion (broadcast storms propagate across sites), suboptimal traffic paths (hairpinning), split-brain scenarios, and STP propagation.

Mitigations: Use OTV or EVPN with flood suppression (never raw VLANs). Deploy distributed anycast gateways to prevent hairpinning. Implement site-aware routing, BFD failover, and orchestrated MAC withdrawal for split-brain. OTV and VXLAN isolate STP domains by design.

Golden rule: Extend Layer 2 only when applications demand it, and always through technology providing flood isolation and loop prevention. Prefer L3 interconnection when possible.

Key Points: L2 Extension Risks

11.2.3 Active-Active vs. Active-Standby Data Center Design

Active-Standby: One DC handles production; the second is warm/cold standby. Simpler DCI, but the standby site is underutilized and failover takes minutes to hours.

Active-Active: Both DCs serve production simultaneously. Requires L2 extension or distributed anycast gateways, distributed default gateways, GSLB, and synchronized state for stateful services. EVPN provides active-active multihoming, MAC mobility tracking, and mass MAC withdrawal for sub-second convergence.

flowchart LR
    GSLB["Global Server Load Balancer DNS"]
    subgraph SiteA["Site A -- Active"]
        GWA["Anycast Gateway VIP: 10.1.1.1"]
        LBA["Local ADC"]
        SVRA["App Servers"]
        GWA --- LBA --- SVRA
    end
    subgraph DCI["DCI Link"]
        EVPN_DCI["VXLAN EVPN MAC Mobility"]
    end
    subgraph SiteB["Site B -- Active"]
        GWB["Anycast Gateway VIP: 10.1.1.1"]
        LBB["Local ADC"]
        SVRB["App Servers"]
        GWB --- LBB --- SVRB
    end
    GSLB -.-> SiteA
    GSLB -.-> SiteB
    SiteA --- EVPN_DCI --- SiteB

Figure 11.5: Active-Active Data Center Design with distributed anycast gateways and GSLB

Animation: Active-active failover sequence -- site failure triggers mass MAC withdrawal, GSLB redirects traffic, anycast gateway at surviving site handles all requests

Key Points: Active-Active vs. Active-Standby

11.3 Data Center Services Design

11.3.1 Load Balancing and Application Delivery

One-arm (routed): The ADC connects to a single leaf; traffic is source-NAT'd so return traffic flows back through the ADC. Simple, no L2 dependency, but SNAT can hide the client IP from the servers.

Two-arm (inline): ADC bridges between client/server VLANs. Full visibility but creates L2 dependency and potential bottleneck.

DSR (Direct Server Return): ADC handles inbound only; servers respond directly. High throughput but complex troubleshooting.

In EVPN fabrics, one-arm routed is preferred -- it aligns with L3-everywhere and enables ECMP. For multi-site, GSLB at the DNS layer directs users to the optimal site.
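The client-IP caveat of the one-arm model comes from the source-NAT rewrite itself. A toy sketch (all addresses and the port choice are made up for illustration) shows why the servers see the ADC rather than the client:

```python
# Toy illustration of one-arm source NAT: the ADC rewrites the source
# of the server-bound connection to its own address, so return traffic
# comes back through the ADC -- but the real client IP is hidden.
# All addresses and ports here are hypothetical.

def snat(flow: tuple, adc_ip: str = "10.9.9.10",
         adc_port: int = 40001) -> tuple:
    """flow = (src_ip, src_port, dst_ip, dst_port); returns rewritten flow."""
    src_ip, src_port, dst_ip, dst_port = flow
    # Destination (the selected server) is preserved; source becomes the ADC.
    return (adc_ip, adc_port, dst_ip, dst_port)

client_flow = ("203.0.113.7", 51514, "10.1.1.100", 443)
print(snat(client_flow))  # ('10.9.9.10', 40001, '10.1.1.100', 443)
```

In practice the original client address is typically restored at the application layer (e.g. an inserted HTTP header), since the network layer no longer carries it.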

11.3.2 Storage Network Integration (FCoE, iSCSI)

FCoE: Encapsulates Fibre Channel in Ethernet. Requires lossless transport via Priority Flow Control (PFC) and Data Center Bridging (DCB). In VXLAN fabrics, CoS must be mapped to DSCP at leaf ingress. High complexity but lowest latency.

iSCSI: Transports SCSI over standard TCP/IP. Natively compatible with any IP network including VXLAN. Simpler and cheaper, but higher latency due to TCP overhead. QoS marking still recommended.

Guidance: For new VXLAN EVPN deployments, iSCSI is often simpler and more cost-effective. FCoE remains relevant for existing FC investments or lowest-latency requirements.

11.3.3 Compute and Network Convergence

Converged/HCI environments collapse compute, storage, and networking into integrated nodes. Key design considerations: bandwidth planning for aggregate traffic on shared links, QoS segmentation across traffic types, VRF-based multi-tenancy with L3 VNIs, and automation (Ansible, Terraform, APIC) for fabric-wide consistency.

Key Points: Data Center Services

Post-Study Assessment

1. Why has the spine-leaf architecture replaced the traditional three-tier data center design?

It reduces the total number of switches needed
It provides predictable single-hop latency and ECMP paths optimized for east-west traffic
It eliminates the need for any routing protocols
It supports only north-south traffic patterns more efficiently

2. What is the primary advantage of EVPN over flood-and-learn in a VXLAN fabric?

EVPN reduces the VXLAN encapsulation overhead from 50 bytes to 20 bytes
EVPN replaces BGP with a simpler OSPF-based control plane
EVPN distributes MAC/IP reachability via MP-BGP, eliminating inefficient flooding
EVPN allows VXLAN to operate without any underlay network

3. Why is symmetric IRB preferred over asymmetric IRB in large VXLAN EVPN fabrics?

Symmetric IRB is faster because it skips the routing step entirely
Symmetric IRB uses a transit L3 VNI so each leaf only needs locally attached VLANs, improving scalability
Symmetric IRB eliminates the need for VRFs in the fabric
Symmetric IRB requires fewer spine switches in the topology

4. When should you choose ACI Multi-Site over ACI Multi-Pod?

When you need seamless L2 extension within a metro area
When a single APIC cluster is sufficient for management
When strict fault domain isolation and geographic distance require independent APIC clusters per site
When the deployment has fewer than 200 leaf switches total

5. What is the role of DWDM in a data center interconnect design?

It replaces VXLAN as the overlay encapsulation protocol
It provides Layer 1 optical transport, multiplexing multiple wavelengths on a single fiber pair
It provides the control plane for MAC learning between sites
It is an alternative to spine-leaf for intra-DC connectivity

6. What is the biggest risk of extending Layer 2 between data centers without proper mitigation?

Increased north-south bandwidth consumption
A failure in one site (broadcast storm, STP miscalculation) can propagate and take down all connected sites
Loss of VXLAN encapsulation capability
Inability to use OSPF as the underlay routing protocol

7. In an active-active data center design, what prevents traffic from hairpinning across the DCI link when a VM migrates?

Static routes pointing to the nearest data center
Distributed anycast gateways with the same virtual IP and MAC at both sites
Disabling all L2 extension between sites
Using OTV instead of VXLAN for encapsulation

8. Why must FCoE traffic receive special handling in a VXLAN EVPN fabric?

FCoE uses a different UDP port than VXLAN
Fibre Channel demands lossless transport, requiring PFC and DCB with CoS-to-DSCP mapping across the routed fabric
FCoE is incompatible with spine-leaf topologies
FCoE requires dedicated spine switches separate from data traffic

9. Why is the one-arm routed model preferred for load balancer placement in EVPN fabrics?

It allows the load balancer to inspect all Layer 2 headers
It aligns with the L3-everywhere philosophy and enables optimal ECMP paths without L2 dependencies
It requires no IP address configuration on the load balancer
It eliminates the need for GSLB in multi-site deployments

10. What is the maximum RTT latency supported by ACI Multi-Pod between pods?

10 ms
50 ms
150 ms
500 ms

11. What is a key advantage of OTV over raw VLAN trunking for DCI?

OTV provides higher bandwidth than VLAN trunks
OTV natively isolates broadcast/flooding and prevents L2 loops between sites
OTV is a multi-vendor standard supported by all switch platforms
OTV eliminates the need for any IP connectivity between sites

12. Which EVPN route type is considered the workhorse for advertising host MAC and IP between VTEPs?

Type 1 (Ethernet Auto-Discovery)
Type 2 (MAC/IP Advertisement)
Type 3 (Inclusive Multicast Tag)
Type 5 (IP Prefix)

13. Why should all leaf-to-spine links in a Clos fabric use the same link speed?

Mixed speeds require additional spine switches
Equal link speeds enable proper ECMP load balancing; mismatched speeds cause uneven traffic distribution
BGP cannot advertise routes over links with different speeds
Mixed speeds prevent VXLAN encapsulation from functioning

14. What is the recommended underlay MTU for a VXLAN fabric, and why?

1500 bytes, the standard Ethernet MTU
9198 bytes (jumbo), to accommodate the 50-byte VXLAN overhead without fragmentation
4096 bytes, matching the maximum VLAN count
16000 bytes, to support the maximum VNI addressing space

15. In a combined ACI deployment, what is the common pattern for using Multi-Pod and Multi-Site together?

Multi-Site within a campus, Multi-Pod across regions
Multi-Pod within a metro area for unified management, Multi-Site across regions for fault isolation
Multi-Pod for storage traffic only, Multi-Site for compute traffic
They cannot be combined in the same deployment

Your Progress

Answer Explanations