Networking Guide

UCS Fabric Interconnect + Arista Leaf Network Design for AIM

In disaggregated AIM environments, the Ethernet fabric between Cisco UCS Fabric Interconnects and Pure FlashArray carries four distinct traffic classes that require purpose-built network design. When this layer is misconfigured, storage performance degrades and VMs fail to migrate — but symptoms appear in the application layer.

This guide provides the complete network design framework for the UCS-to-Arista-to-Pure path that UCS, Pure, and Nutanix documentation each address only partially.


Key Takeaways

  • AIM environments require dedicated VLANs for NVMe/TCP storage traffic with mandatory 9000 MTU end-to-end — mixing storage I/O with VM data traffic on a single VLAN creates variable latency and makes performance problems difficult to diagnose.
  • Arista switches in AIM environments must support deep packet buffers because NVMe/TCP storage traffic is bursty — standard shallow-buffer switches drop packets during microbursts, causing TCP retransmissions that appear as storage latency spikes.
  • The four traffic classes in disaggregated AIM are NVMe/TCP storage traffic, AHV live migration, VM data traffic, and management traffic — each requires specific MTU, QoS, and VLAN configuration.
  • MLAG configuration between Arista leaf pairs allows UCS Fabric Interconnects to see a single LACP port-channel partner while providing redundancy across both switches and surviving single leaf failures.
  • End-to-end path validation must include MTU consistency testing with 9000-byte pings, NVMe/TCP session establishment, storage throughput baselining, and MLAG failover testing before any workloads are placed.

The Network Problem in AIM Environments

The network between UCS and Pure is not an afterthought. In a disaggregated AIM environment, the Ethernet fabric between Cisco UCS Fabric Interconnects and Pure FlashArray carries four distinct traffic classes that each have specific requirements for performance and reliability.

These traffic classes are NVMe/TCP storage traffic (the I/O path for every VM's disk reads and writes), AHV live migration traffic (VM movement between hosts), VM data traffic (production workload east-west and north-south), and management traffic (AHV host management, Prism, IPMI/CIMC, Intersight).

When the network between UCS and Pure is misconfigured — wrong MTU, incorrect QoS, inadequate bandwidth, missing VLANs, suboptimal topology — storage performance degrades, VMs fail to migrate, and administrators see symptoms in the application layer that appear to be storage problems but originate in the network path.

The integration gap

This is the integration problem that no single vendor's documentation fully addresses: UCS documentation covers the UCS side, Pure documentation covers the array side, and Nutanix documentation handles only the AHV level. IVI designs and implements the full path with operational ownership of the complete fabric.

VLAN Segmentation Design

AIM environments require dedicated VLANs for storage traffic. This is a hard requirement, not optional. Mixing storage I/O with VM data traffic on a single VLAN creates variable latency and makes storage performance problems difficult to diagnose.

MTU consistency requirement

The 9000 MTU for NVMe/TCP is a hard requirement documented in both Nutanix and Pure deployment guides. The MTU must be consistent end-to-end: UCS vNIC MTU, UCS FI uplink MTU, Arista leaf port MTU (access or trunk), and Pure FlashArray NVMe/TCP interface MTU must all be set to 9000.
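As a sanity check, the end-to-end requirement can be expressed as a small script. The hop names and MTU values below are illustrative placeholders, not a vendor API:

```python
# Minimal sketch: verify every hop in the NVMe/TCP path reports the
# required 9000 MTU. Hop names and values here are illustrative.
REQUIRED_MTU = 9000

def mtu_mismatches(path_mtus: dict[str, int]) -> list[str]:
    """Return the names of hops whose MTU deviates from the requirement."""
    return [hop for hop, mtu in path_mtus.items() if mtu != REQUIRED_MTU]

path = {
    "ucs_vnic": 9000,
    "fi_uplink": 9000,
    "arista_leaf_trunk": 9000,
    "pure_nvme_tcp_port": 1500,  # deliberately misconfigured hop
}

print(mtu_mismatches(path))  # -> ['pure_nvme_tcp_port']
```

In practice the input values would be collected from each platform's management interface; the point is that a single 1500 entry anywhere fails the check.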

Management VLAN

Typically deployed in the 10.x.x.x management subnet, the management VLAN carries AHV host management, Prism Element, IPMI/CIMC, Intersight Device Connector, and Pure management interface traffic. This VLAN uses standard 1500 MTU, standard low-priority QoS, and routes through the UCS FI to Arista leaf to L3 gateway to reach the management network.

NVMe/TCP Storage VLAN

The storage VLAN is dedicated and isolated, carrying only NVMe/TCP initiator-to-target traffic between UCS compute nodes and Pure FlashArray NVMe/TCP target ports. This VLAN requires 9000 MTU as a mandatory configuration — NVMe/TCP protocol is sensitive to MTU inconsistency, and any hop in the path that drops to 1500 MTU causes fragmentation, retransmission, and severe performance degradation.

QoS configuration should use high priority, with DSCP CS4 or EF marking honored from the UCS FI through the Arista leaf to the Pure target ports. The storage VLAN should carry no non-storage traffic and typically has no L3 routing to other VLANs, since storage communication is host-to-array only. If L3 routing is required, ensure jumbo frames are honored at the routed interface.
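For reference, CS4 and EF map to fixed DSCP code points, and the TOS byte that sockets expose carries the DSCP value in its upper six bits. A minimal sketch (the socket marking is illustrative; in a real deployment the NVMe/TCP initiator stack or vNIC QoS policy applies the marking, not an application):

```python
import socket

# DSCP code points (RFC 2474 / RFC 3246); the IPv4 TOS byte carries the
# DSCP value in its upper six bits, so the socket option takes dscp << 2.
DSCP_CS4 = 32   # Class Selector 4
DSCP_EF = 46    # Expedited Forwarding

def tos_for_dscp(dscp: int) -> int:
    return dscp << 2

# Illustrative only: mark a TCP socket with CS4 so upstream switches
# can classify the flow into the storage queue.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_for_dscp(DSCP_CS4))
sock.close()

print(tos_for_dscp(DSCP_CS4), tos_for_dscp(DSCP_EF))  # -> 128 184
```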

AHV Live Migration VLAN

This VLAN carries AHV live migration traffic for VM memory copy between hosts. The recommended MTU is 9000, as large memory pages during migration benefit from larger frames, though some environments use 1500 with acceptable performance for moderate migration rates. QoS should be medium priority, below storage but above best-effort VM traffic. Live migration can be rate-limited on the Arista side to prevent saturating storage or VM data VLANs.

VM Data VLANs

VM data VLANs carry production VM traffic, typically implemented as multiple VLANs (one per application or network segment) trunked through the infrastructure. These VLANs use standard 1500 MTU unless the guest OS requires jumbo frames, which is uncommon for VM workloads (unlike storage traffic). QoS is typically best-effort or workload-specific marking, and the path runs from the UCS trunk port defined per service profile template through the Arista leaf trunk port.

UCS Fabric Interconnect Configuration

UCS Fabric Interconnect configuration for AIM environments requires specific attention to model selection, uplink design, high availability, and server-facing configuration to handle the four traffic classes effectively.

FI Model Selection

The UCS 6536 FI is the current generation recommendation for X-Series environments, providing 40/100G server-facing ports and 100G uplink ports. The UCS 6454 FI is the previous generation that is still deployed in existing environments, offering 10/25G server-facing ports and 40/100G uplinks.

Each FI connects to a dedicated Arista leaf switch pair, with FI-A connecting to the Arista Leaf-A pair and FI-B connecting to the Arista Leaf-B pair. Uplinks use port-channel with LACP from FI to Arista leaf for bandwidth aggregation and link-level redundancy.

Uplink speed should be 25G or 100G depending on environment scale, with 100G recommended for environments with significant storage I/O since NVMe/TCP can saturate 25G links in dense VM deployments. The number of uplinks per FI is typically 2-4 per Arista leaf for aggregation, scaled based on VM count multiplied by average I/O per VM plus AHV migration rate.
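The sizing rule above is simple arithmetic; the sketch below applies it with illustrative numbers (every input is an assumption, not a recommendation):

```python
# Back-of-envelope uplink sizing per FI, using the rule of thumb above:
# VM count x average storage I/O per VM, plus a migration budget, kept
# under a utilization headroom threshold.

def required_uplink_gbps(vm_count, avg_io_mbps_per_vm, migration_gbps,
                         headroom=0.7):
    """Gbps needed so steady-state load stays under the headroom threshold."""
    storage_gbps = vm_count * avg_io_mbps_per_vm / 1000
    return (storage_gbps + migration_gbps) / headroom

# e.g. 500 VMs at ~40 Mbps of storage I/O each, plus a 5 Gbps migration
# budget (illustrative numbers only)
need = required_uplink_gbps(500, 40, 5)
print(round(need, 1))  # -> 35.7, so 2x25G is tight; 100G leaves room to grow
```

The 70% headroom mirrors the alerting threshold used later in Aegis PM monitoring, so the link is sized to stay below the point where alerts would fire.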

High Availability Design

Dual FI design is mandatory for production AIM environments, as single FI creates a single point of failure for all VMs on that UCS domain. FI-A and FI-B provide redundant I/O paths for each server, with each UCS server having vNICs with primary path on FI-A and secondary on FI-B, balanced across the FI pair. FI failover is transparent to AHV and Nutanix, as the Cisco VIC adapter handles path switching at the hardware level.

Server-facing Configuration

Server-facing ports typically run at 10G or 25G on the FI. UCS VIC adapters (Cisco VIC 1400/1600 series) provide vNIC virtualization, presenting multiple vNICs from a single physical adapter, each mapped to a specific VLAN and QoS policy. For AIM compute-only nodes, a minimum of four vNICs is typically configured, for management, storage, migration, and VM data, though migration may be combined with VM data if a separate VLAN is not warranted.

Arista Leaf Configuration for AIM

Arista switches in AIM environments must support deep packet buffers because NVMe/TCP storage traffic is bursty. Standard shallow-buffer switches drop packets during microbursts, and dropped packets on storage paths cause TCP retransmissions that appear as storage latency spikes in VMs. Deep-buffer switches absorb bursts without loss.

Switch Selection

Recommended Arista platforms for the AIM leaf position:

  • Arista 7050X3 series: standard buffer; acceptable for environments with moderate storage I/O.
  • Arista 7060X5 series: larger buffer; preferred for high-density NVMe/TCP environments or significant storage IOPS requirements.
  • Arista 7280R3 series: very large buffer; for the highest-demand environments.

Arista Port Configuration for FI Uplinks

Port configuration connecting to the UCS FI uses switchport mode trunk to carry all AIM VLANs, with MTU set to 9214 on the trunk port (above the 9000-byte IP MTU, to accommodate Ethernet and VLAN header overhead). Spanning Tree should be configured as a portfast/edge trunk, since FI uplinks should not pass through STP listening and learning states. LLDP should be enabled so the Arista switch learns the FI as a neighbor for topology validation.
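An EOS-style sketch of such an FI-facing uplink follows. Interface names, VLAN IDs, and channel numbers are placeholders, and the exact edge-port spanning-tree syntax varies by EOS release, so verify against your platform's documentation:

```
! Illustrative FI-facing uplink on an Arista leaf (placeholder names,
! VLANs, and channel numbers; verify syntax for your EOS release).
interface Ethernet49/1
   description uplink-to-UCS-FI-A
   channel-group 100 mode active
interface Port-Channel100
   description po100-to-UCS-FI-A
   switchport mode trunk
   switchport trunk allowed vlan 100,200,300,400
   mtu 9214
   spanning-tree portfast trunk
```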

MLAG Design for FI Uplinks

When each FI connects to a pair of Arista leaf switches for redundancy, the Arista pair uses MLAG (Multi-chassis Link Aggregation) to appear as a single logical switch to the FI while providing redundancy. Two Arista leaf switches are configured as MLAG peers connected via a dedicated MLAG peer link. The FI sees a single LACP port-channel partner (the MLAG pair), resulting in FI uplinks having bandwidth across both Arista switches while surviving single Arista leaf failure and single uplink failure.
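An illustrative EOS-style MLAG peer configuration is sketched below. Addresses, names, and IDs are placeholders; the second peer receives the mirror-image addressing, and syntax should be verified against your EOS release:

```
! Illustrative MLAG peer configuration (placeholder addresses and IDs;
! apply the mirror image on the second peer).
no spanning-tree vlan-id 4094
vlan 4094
   trunk group MLAG-PEER
interface Port-Channel10
   description mlag-peer-link
   switchport mode trunk
   switchport trunk group MLAG-PEER
interface Vlan4094
   ip address 172.16.0.1/30
   no autostate
mlag configuration
   domain-id AIM-LEAF-PAIR
   local-interface Vlan4094
   peer-address 172.16.0.2
   peer-link Port-Channel10
```

With this in place, the FI-facing port-channels on both peers share an MLAG ID, so the FI negotiates one LACP bundle spanning both switches.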

Arista QoS for Storage Traffic

QoS configuration should trust DSCP on ports receiving FI uplinks using the Arista "trust dscp" policy. Storage-class DSCP markings (CS4 or EF from the vNIC QoS policy) must be honored through the Arista data path to the Pure FlashArray target ports. Queue mapping should route storage DSCP to a high-priority queue, typically queue 5 or 7 depending on the Arista QoS profile.

Pure FlashArray Port Configuration on Arista

Pure FlashArray NVMe/TCP ports connect directly to the Arista leaf switches, using an access port (or trunk) in the storage VLAN; an access port is typical when the storage VLAN is dedicated. MTU should be set to 9214, consistent with the FI uplink MTU. Spanning Tree should be configured as portfast, since Pure is an edge device, not a bridge. QoS should trust DSCP from the Pure NVMe/TCP target ports so storage traffic receives symmetric treatment in both directions.

End-to-End Path Validation

Before any Nutanix workloads are placed on the AIM environment, IVI validates the complete path to ensure proper configuration and performance. This validation process covers MTU consistency, NVMe/TCP connectivity, storage performance, redundancy, and live migration functionality.

MTU Consistency Testing

Send a jumbo ping with the DF (don't-fragment) bit set, sized so the full IP packet is 9000 bytes, from a UCS compute node through the Arista fabric to a Pure FlashArray target IP. A passing test shows the ping succeeds without fragmentation. A failure indicates an intermediate hop at 1500 MTU that must be identified and corrected.
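On a Linux host this reduces to a single ping invocation. Note the ICMP payload is 8972 bytes, not 9000: the 9000-byte packet includes 20 bytes of IPv4 header and 8 bytes of ICMP header. The target address below is a placeholder:

```python
# ICMP payload size for a full 9000-byte packet:
# 9000 (IP MTU) - 20 (IPv4 header) - 8 (ICMP header) = 8972 bytes.
MTU = 9000
payload = MTU - 20 - 8

# Linux ping flags: -M do prohibits fragmentation (sets DF), -s sets the
# payload size, -c limits the count. 192.0.2.10 is a placeholder target.
cmd = ["ping", "-M", "do", "-s", str(payload), "-c", "3", "192.0.2.10"]
print(" ".join(cmd))  # run this from each UCS compute node / AHV host
# A "message too long" error or 100% loss means a hop is below 9000 MTU.
```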

NVMe/TCP Path Discovery and Session Establishment

All Pure NVMe/TCP targets must be discovered and connected from all compute nodes before VMs are placed on the environment. This ensures the storage fabric is fully functional before production workloads depend on it.
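With the standard nvme-cli tool, discovery and connection look roughly like the following; the target IPs and subsystem NQN are placeholders:

```python
# Build the nvme-cli commands run from each AHV host. Target IPs and the
# subsystem NQN are placeholders for the environment's actual values.
NVME_TCP_PORT = "4420"  # IANA-assigned NVMe/TCP port

def discovery_cmds(target_ips):
    return [f"nvme discover -t tcp -a {ip} -s {NVME_TCP_PORT}"
            for ip in target_ips]

def connect_cmd(nqn, ip):
    return f"nvme connect -t tcp -n {nqn} -a {ip} -s {NVME_TCP_PORT}"

targets = ["192.0.2.10", "192.0.2.11"]  # placeholder Pure target IPs
for line in discovery_cmds(targets):
    print(line)
print(connect_cmd("nqn.example:placeholder-subsystem", "192.0.2.10"))
```

After connecting, comparing the session count on each host against the expected target-port count confirms full path coverage before VMs are placed.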

Storage Throughput Baseline

Run an IOMeter or FIO benchmark directly from an AHV host, before Nutanix AOS is using the path, to establish baseline IOPS and latency. This baseline becomes the reference for storage performance alerting thresholds in Aegis PM monitoring.
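A baseline run might be driven by an fio invocation along these lines; the parameters are a generic starting point rather than a tuned profile, and the device path is a placeholder:

```python
# Assemble a generic fio baseline command. All parameters are
# illustrative defaults; the device path is a placeholder.
def fio_baseline(device, runtime_s=60):
    return [
        "fio", "--name=aim-baseline",
        f"--filename={device}",
        "--ioengine=libaio", "--direct=1",
        "--rw=randread", "--bs=4k",
        "--iodepth=32", "--numjobs=4",
        "--time_based", f"--runtime={runtime_s}",
        "--group_reporting",
    ]

print(" ".join(fio_baseline("/dev/nvme0n1")))
```

Repeating the same job file after any fabric change (firmware, QoS, uplink count) gives a like-for-like comparison against the original baseline.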

MLAG/Port-channel Failover Test

Pull one physical uplink from each FI to test redundancy. Storage traffic should continue, latency spike should be transient (less than 100ms), and after recovery, traffic should rebalance properly across available links.

AHV Live Migration Test

Migrate a test VM across hosts while monitoring storage I/O to confirm migration does not cause storage latency degradation. This validates that migration traffic and storage traffic are properly segmented and prioritized.

Ongoing Monitoring with Aegis PM

Aegis PM provides operational visibility for the fabric layer with specific monitoring designed for AIM environments. The monitoring covers FI uplink utilization with alerts before saturation (typically 70% threshold), Arista port error rates on FI-facing and Pure-facing ports, and storage VLAN presence on all expected trunk ports using custom modules.

Additional monitoring includes deep buffer utilization on storage-path Arista ports using custom eAPI modules, MLAG peer state with alerts if MLAG sync is degraded, and NVMe/TCP session count cross-platform monitoring comparing Pure sessions versus expected host count.
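A custom eAPI module typically POSTs a JSON-RPC request to the switch's /command-api endpoint. The sketch below builds the request for `show mlag` and evaluates a simple health predicate; the result field names should be verified against your EOS release before alerting on them:

```python
import json

# Build the JSON-RPC payload Arista eAPI expects; a monitoring module
# would POST this over HTTPS to https://<leaf>/command-api.
def eapi_payload(commands, req_id="aegis-pm-1"):
    return {
        "jsonrpc": "2.0",
        "method": "runCmds",
        "params": {"version": 1, "cmds": commands, "format": "json"},
        "id": req_id,
    }

def mlag_is_healthy(show_mlag_result: dict) -> bool:
    # Assumed field names from the "show mlag" JSON model; verify against
    # your EOS release before using them as alert conditions.
    return (show_mlag_result.get("state") == "active"
            and show_mlag_result.get("negStatus") == "connected")

print(json.dumps(eapi_payload(["show mlag"]), indent=2))
```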

Reference documentation

See AIM-01 for full Arista and UCS FI monitoring configuration detail and specific implementation guidance for Aegis PM in AIM environments.

Who This Service Is For

This network design service is specifically relevant for AIM clients building new disaggregated environments (Nutanix AHV + Pure + UCS) who need the network layer designed and validated as part of the overall engagement — not handed to a separate network team to figure out independently.

The service addresses organizations where the network team and the compute/storage team are separate, and nobody has operational ownership of the intersection between these domains. It also serves existing AIM environments experiencing unexplained storage performance variability, which is often caused by MTU or QoS misconfiguration at this layer.

This is not a standalone service for general Arista data center fabric design (that's a separate IVI capability), SD-WAN or WAN connectivity (completely separate scope), or campus networking (separate IVI practice). This service is specifically the DC fabric design scoped to the UCS FI ↔ Arista leaf interface for AIM storage and compute traffic.


Frequently Asked Questions

Why is 9000 MTU mandatory for NVMe/TCP in AIM environments?

NVMe/TCP protocol is sensitive to MTU inconsistency. Any hop in the path that drops to 1500 MTU causes fragmentation and retransmission, resulting in severe performance degradation that appears as storage latency spikes in VMs. The 9000 MTU requirement is documented in both Nutanix and Pure deployment guides.

Can storage traffic share VLANs with VM data traffic?

No, this is a hard requirement for AIM environments. Mixing storage I/O with VM data traffic on a single VLAN creates variable latency and makes storage performance problems difficult to diagnose. Dedicated VLANs for each traffic class are mandatory.

What happens if MLAG configuration fails between Arista leaf switches?

MLAG failure would cause the UCS Fabric Interconnects to see inconsistent LACP behavior, potentially leading to traffic blackholing or suboptimal load balancing. Aegis PM monitors MLAG peer state and alerts if synchronization is degraded.

How do you size uplink bandwidth between UCS FI and Arista leaf?

Uplink sizing is based on VM count multiplied by average I/O per VM plus AHV migration rate. 100G uplinks are recommended for environments with significant storage I/O, as NVMe/TCP can saturate 25G links in dense VM deployments.

Why do Arista switches need deep buffers for AIM environments?

NVMe/TCP storage traffic is bursty. Standard shallow-buffer switches drop packets during microbursts, causing TCP retransmissions that appear as storage latency spikes in VMs. Deep-buffer switches absorb burst traffic without packet loss.

Is this design applicable to hyperconverged Nutanix deployments?

No, this design is specifically for disaggregated AIM environments where compute (UCS), storage (Pure), and hypervisor (Nutanix AHV) are separate components. Hyperconverged Nutanix deployments have different network requirements and topologies.

Need AIM network design and validation?

IVI's network architects design and validate the complete UCS-to-Arista-to-Pure fabric path for disaggregated AIM environments, with operational ownership of the full integration and ongoing monitoring through Aegis PM.
