Inside Arista’s Distributed Etherlink Switch (DES)

Introduction: AI Changes Everything
AI isn’t just another application running over the same old data center fabric. The explosive growth of large-scale model training and high-throughput inference has pushed traditional networks past their limits. While GPU and accelerator technologies have advanced at a blistering pace, the network has emerged as the true performance bottleneck for AI workloads.
Arista Networks’ Distributed Etherlink Switch (DES) is engineered to change that equation. By reimagining the data center fabric as a scalable, lossless, and fully scheduled network that behaves like a single giant switch, DES offers the deterministic performance that modern AI clusters demand.
Why Traditional Networks Fall Short for AI
The challenges of networking for AI begin with the nature of the workloads themselves. Unlike enterprise traffic that involves numerous small, bursty flows, AI workloads produce a small set of massive, long-lived “elephant flows.” These flows originate from collective communication operations like All-Reduce and All-to-All, in which every accelerator exchanges state with every other accelerator.
This is fundamentally different from the north-south traffic model that traditional enterprise networks were designed to handle. The AI east-west traffic patterns create extreme lateral data movement across the fabric, and because these flows are synchronized, any delay in one path stalls the entire operation. This is why in AI networking, the metric that matters is not just average latency, but tail latency—the time taken for the slowest packet to arrive.
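To make the traffic pattern concrete, here is a minimal Python sketch of the communication schedule behind a ring All-Reduce, one common way this collective is implemented (illustrative only, not Arista or NCCL code; the worker count and tensor size are arbitrary). The takeaway: every worker sustains long-lived point-to-point flows totaling roughly twice the tensor size per iteration.

```python
# Ring All-Reduce schedule: with N workers, each worker sends 2*(N-1)
# chunks of size S/N, so per-worker traffic approaches 2*S regardless of N.

def ring_all_reduce_schedule(num_workers: int):
    """Return (step, sender, receiver) tuples for a ring All-Reduce:
    a reduce-scatter phase followed by an all-gather phase."""
    transfers = []
    for step in range(2 * (num_workers - 1)):   # 2*(N-1) ring steps total
        for rank in range(num_workers):
            transfers.append((step, rank, (rank + 1) % num_workers))
    return transfers

if __name__ == "__main__":
    n, tensor_gb = 8, 4.0
    per_worker_gb = 2 * (n - 1) / n * tensor_gb
    print(f"{len(ring_all_reduce_schedule(n))} point-to-point transfers per iteration")
    print(f"~{per_worker_gb:.2f} GB sent by each worker for a {tensor_gb} GB tensor")
```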
The High Cost of Tail Latency
In synchronized training jobs, thousands of GPUs work in parallel. A single delay in communication ripples outward, forcing every other node to wait. While average latencies might look acceptable, a brief spike in congestion or packet loss can stall entire training jobs, extending Job Completion Time (JCT) and leaving expensive GPUs idle.
Industry data suggests that networking overhead can consume 30-50% of AI workload runtime. In practical terms, this means that for every hour a cluster runs, up to half of that time may be spent simply waiting for data to traverse the network. Given the high capital and operational expense of modern accelerator farms, the cost of this inefficiency is staggering.
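A short simulation makes the point. In the sketch below (the log-normal per-worker latencies are synthetic assumptions, not measured data), each synchronized step completes only when the slowest worker finishes, so mean step time climbs with cluster size even though each individual worker's average never changes.

```python
import random

# Each worker draws a synthetic communication time; a synchronized step
# completes only when the slowest worker does, so step time = max(...).
def step_time_ms(num_workers: int) -> float:
    return max(random.lognormvariate(0, 0.5) for _ in range(num_workers))

if __name__ == "__main__":
    random.seed(42)
    for n in (8, 256, 4096):
        steps = [step_time_ms(n) for _ in range(200)]
        mean = sum(steps) / len(steps)
        # A single worker's mean stays ~1.13 ms throughout; only the tail grows.
        print(f"{n:>5} workers: mean synchronized step time {mean:.2f} ms")
```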
AI’s Demand for Lossless Transport
In traditional enterprise networking, some packet loss is acceptable. TCP, for example, handles loss by retransmitting dropped packets. AI workloads, however, are built on distributed training libraries that assume a lossless fabric. A single dropped packet might require resending large data sets, or worse, force a full restart of the training job from a checkpoint.
Therefore, networks designed for AI must guarantee zero packet loss under even the harshest traffic conditions. Achieving this requires much more than fast links; it demands deep architectural changes in how traffic is managed and how congestion is handled.
Arista’s Etherlink Strategy: A Unified Fabric
Recognizing these challenges, Arista has built the Etherlink portfolio, a holistic architectural strategy that aims to unify all parts of the AI data center—from front-end inference networks to back-end training fabrics—on a single, standards-based Ethernet infrastructure.
Historically, many organizations turned to proprietary technologies like InfiniBand for AI and HPC back-end networks, while keeping the front-end on Ethernet. This created islands of specialized infrastructure, requiring complex gateways and specialized expertise. Arista’s vision, in contrast, eliminates this divide, allowing organizations to leverage existing Ethernet skillsets, monitoring tools, and operational processes to support AI.
Introducing the Distributed Etherlink Switch (DES)
At the pinnacle of the Etherlink strategy sits the Arista 7700R4 Distributed Etherlink Switch (DES). DES represents a fundamental rethinking of how a network fabric should be built for AI.
While physically composed of multiple interconnected leaf and spine switches, DES operates as a single, logical switch. To connected servers and applications, it appears as one massive, flat network where any node can reach any other node in effectively a single hop.
This is significant because it removes the need for complex routing tables, overlays, or multi-hop architectures that introduce variability and delay. Instead, DES delivers deterministic, predictable performance regardless of cluster size.
How DES Works: Disaggregating the Chassis
Traditional modular switches are constrained by physical backplanes and finite chassis sizes. DES breaks free from these limits by disaggregating the chassis into distributed components.
The system consists of:
7700R4C-38PE Leaf Switches: These connect directly to servers and accelerators.
7720R4-128PE Spine Switches: These form the fabric’s core interconnect.
Physically, these components are linked via high-speed 800G connections in a leaf-spine topology. But logically, the fabric acts as a single switch, with consistent latency and throughput across the entire system.
Core Innovations Powering DES
Virtual Output Queuing (VOQ)
One of DES’s defining technologies is Virtual Output Queuing. Traditional switches can suffer from head-of-line blocking, where congestion on one egress port blocks unrelated traffic. VOQ solves this by creating a separate queue for every possible egress destination. This guarantees that congestion in one area doesn’t spill over to others, maintaining fairness and consistent throughput.
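The idea is easy to see in a toy model. The sketch below is a conceptual Python illustration, not Arista's silicon: each ingress keeps one queue per egress port, so a packet stuck behind a congested port never delays packets bound for an idle one.

```python
from collections import defaultdict, deque

class VOQIngress:
    """Toy ingress port with one virtual output queue per egress port."""
    def __init__(self):
        # One FIFO per egress destination instead of a single shared FIFO.
        self.queues = defaultdict(deque)

    def enqueue(self, packet, egress_port: int):
        self.queues[egress_port].append(packet)

    def dequeue_for(self, ready_egress_ports):
        """Serve only egress ports that can accept traffic; others wait
        in their own queue without blocking anyone else."""
        out = []
        for port in ready_egress_ports:
            if self.queues[port]:
                out.append((port, self.queues[port].popleft()))
        return out

if __name__ == "__main__":
    ing = VOQIngress()
    ing.enqueue("pkt-A", egress_port=1)   # port 1 is congested
    ing.enqueue("pkt-B", egress_port=2)   # port 2 is idle
    # With one shared FIFO, pkt-B would wait behind pkt-A; with VOQ it drains.
    print(ing.dequeue_for(ready_egress_ports=[2]))  # [(2, 'pkt-B')]
```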
Cell-Based Traffic Spraying
DES uses a technique known as cell-based spraying, breaking packets into uniform, fixed-size cells. These cells are distributed evenly across all available fabric links. Unlike traditional load balancing that relies on hashing packet headers—a method prone to collisions when dealing with AI’s low-entropy flows—cell spraying ensures traffic is perfectly balanced across the fabric. This prevents hotspots and guarantees full utilization of network resources.
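The difference is easy to demonstrate. In the sketch below, a toy byte-sum hash stands in for a real ECMP header hash, and the cell size and link count are assumptions: two elephant flows with similar headers collide onto one link under hashing, while cell spraying spreads the same bytes evenly across all four links by construction.

```python
CELL_BYTES = 256    # assumed cell size, for illustration only
NUM_LINKS = 4

def toy_hash(flow_id: str) -> int:
    # Stand-in for a real 5-tuple header hash; deterministic on purpose.
    return sum(flow_id.encode())

def ecmp_load(flows):
    """Hash-based load balancing: the whole flow is pinned to one link."""
    load = [0] * NUM_LINKS
    for flow_id, size in flows:
        load[toy_hash(flow_id) % NUM_LINKS] += size
    return load

def cell_spray_load(flows):
    """Cell spraying: every packet is sliced into fixed-size cells that
    are round-robined across all fabric links."""
    load = [0] * NUM_LINKS
    link = 0
    for _, size in flows:
        for _ in range(0, size, CELL_BYTES):
            load[link] += CELL_BYTES
            link = (link + 1) % NUM_LINKS
    return load

if __name__ == "__main__":
    # Two low-entropy elephant flows whose headers happen to hash alike.
    flows = [("flow-a", 1_000_000), ("flow-e", 1_000_000)]
    print("ECMP link load:      ", ecmp_load(flows))       # one hot link
    print("Cell-spray link load:", cell_spray_load(flows)) # near-even
```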
Distributed Scheduling
Managing traffic across a distributed fabric requires coordination. DES employs a massively parallel, hardware-based scheduler that operates with an end-to-end view of the network. It uses a credit-based system to ensure cells are only sent when buffer space is available at their destination, preventing packet drops and maintaining lossless performance.
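The mechanism can be sketched in a few lines. The code below is a conceptual model of credit-based flow control, not the 7700R4 scheduler itself: an ingress launches a cell into the fabric only after the egress has reserved buffer space for it, so a cell can be delayed but never dropped.

```python
from collections import deque

class EgressPort:
    """Grants credits only while it has buffer slots left to reserve."""
    def __init__(self, buffer_cells: int):
        self.free_slots = buffer_cells

    def grant_credit(self) -> bool:
        if self.free_slots > 0:
            self.free_slots -= 1      # slot reserved before the cell departs
            return True
        return False

def transmit(ingress_queue: deque, egress: EgressPort):
    """Send cells only against granted credits; never drop, only wait."""
    sent = []
    while ingress_queue and egress.grant_credit():
        sent.append(ingress_queue.popleft())
    return sent

if __name__ == "__main__":
    cells = deque(f"cell-{i}" for i in range(6))
    egress = EgressPort(buffer_cells=4)
    sent = transmit(cells, egress)
    print(f"sent {len(sent)} cells; {len(cells)} held at ingress, 0 dropped")
```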
Deep Buffers
DES leaf switches come equipped with substantial buffers (16 GB per device), allowing the network to absorb the sudden bursts of traffic typical of AI workloads. This buffering capacity is critical to maintaining lossless performance during incast scenarios where many devices simultaneously send data to a single endpoint.
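Some back-of-the-envelope arithmetic shows why 16 GB matters. The sketch below takes the 16 GB figure from the text and assumes, as an upper bound, that the whole buffer is available to a single congested queue; real buffer allocation is more nuanced, but the order of magnitude is the point.

```python
BUFFER_BYTES = 16 * 10**9      # 16 GB per leaf, per the text
LINK_BPS = 800 * 10**9         # 800 Gb/s ports

def absorb_ms(num_senders: int) -> float:
    """How long the buffer lasts when N senders overload one egress port."""
    fill_rate_bps = (num_senders - 1) * LINK_BPS   # arrival rate minus drain
    return BUFFER_BYTES * 8 / fill_rate_bps * 1000

if __name__ == "__main__":
    for n in (2, 4, 8):
        print(f"{n}-to-1 incast: buffer absorbs ~{absorb_ms(n):.0f} ms at line rate")
```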
Operational Simplicity: The DES Advantage
Beyond pure performance, DES dramatically simplifies operations. Because it appears as a single logical switch:
Routing protocols are greatly simplified or unnecessary.
There’s no need for complex overlay networks to stitch multiple fabrics together.
Uniform cabling reduces costs and power consumption, particularly through the use of short-reach optics.
The result is a fabric that’s not only high performing but also easy to deploy, manage, and scale.
Scaling for Tomorrow’s AI
DES can scale incrementally. For smaller deployments, a simple two-tier architecture is sufficient, supporting over 4,600 hosts at 800G. For hyperscale environments, a third “super-spine” tier enables fabrics supporting more than 27,000 nodes and over 22 petabits per second of bandwidth.
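For intuition, host counts like these follow from generic two-tier Clos arithmetic. The sketch below uses hypothetical radices chosen only to land near the 4,600 figure quoted above (the actual 7700R4C-38PE and 7720R4-128PE port allocations may differ); the formula, not the numbers, is the takeaway.

```python
def two_tier_hosts(num_spines: int, spine_ports: int,
                   leaf_uplinks: int, leaf_downlinks: int) -> int:
    """Generic two-tier Clos sizing: spine capacity caps the leaf count,
    and each leaf's downlinks attach hosts."""
    fabric_links = num_spines * spine_ports
    max_leaves = fabric_links // leaf_uplinks
    return max_leaves * leaf_downlinks

if __name__ == "__main__":
    # Hypothetical example: 36 spines x 128 ports, leaves with 36 uplinks
    # and 36 host-facing 800G ports.
    print(two_tier_hosts(num_spines=36, spine_ports=128,
                         leaf_uplinks=36, leaf_downlinks=36))  # 4608 hosts
```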
Crucially, DES’s architecture allows for pay-as-you-grow scaling. Organizations can start small and expand as their AI needs increase, without forklift upgrades or disruptive architecture changes.
DES in Action: Meta’s Hyperscale Validation
The real-world validation of DES comes from hyperscale operators like Meta. In a landmark project, Meta built two parallel clusters to train its Llama 3 model: one on InfiniBand and one on Arista’s Ethernet fabric. Meta engineers publicly confirmed that, with careful tuning, the Ethernet fabric matched InfiniBand in performance while maintaining the benefits of an open, standards-based environment.
Such deployments prove that Ethernet, when architected correctly, can meet and exceed the performance demands of the world’s largest AI clusters.
Preparing for the Future: Ultra Ethernet
Arista is also a founding member of the Ultra Ethernet Consortium (UEC), which aims to evolve Ethernet for AI and HPC workloads. The consortium’s goals include:
Designing a modern RDMA transport protocol beyond RoCEv2.
Enhancing congestion control to simplify deployment.
Implementing advanced multipath forwarding to fully utilize fabric bandwidth.
DES hardware is engineered to be forward-compatible with UEC standards, ensuring that today’s investments remain relevant as Ethernet evolves.
Intelligent Visibility: Turning Technology Into Outcomes
Deploying DES is not plug-and-play. Optimizing a high-performance AI fabric requires:
Tuning QoS and congestion management.
Balancing flows for uniform utilization.
Ensuring seamless integration with observability tools like Arista’s CloudVision.
This is where Intelligent Visibility steps in. As a trusted Arista partner, Intelligent Visibility bridges the gap between cutting-edge hardware and real-world results. Our services cover:
Design and deployment of DES-based networks.
Advanced tuning and optimization for AI workloads.
Co-managed services to keep fabrics running at peak efficiency.
By partnering with specialists, enterprises can be confident that their DES investment translates into measurable business value.
Why DES Matters
For data centers entering the AI era, the network has become the new bottleneck—and the new frontier of performance engineering. Arista’s DES offers an architecture that:
Delivers predictable, lossless performance.
Scales to the largest AI clusters in the world.
Simplifies operations while providing future-proof flexibility.
As Meta’s success demonstrates, Ethernet can compete head-to-head with proprietary fabrics like InfiniBand. DES is the key to making that possible.
Arista’s DES fabric isn’t just big; it’s engineered for AI’s relentless scale and precision. For enterprises serious about AI, it represents the future of networking.
Frequently Asked Questions
How is DES different from a traditional leaf-spine network?
Traditional leaf-spine networks rely on multi-hop routing, load balancing via ECMP, and often complex overlays to achieve scalability. In contrast, DES operates as a single logical switch—even though it’s physically composed of distributed leaf and spine switches. It uses VOQ, cell-based traffic spraying, and distributed scheduling to deliver deterministic, lossless performance without the operational complexity of multi-tier routing.
Why is cell-based spraying so important for AI traffic?
AI workloads produce a small number of massive flows, which often collide under traditional hash-based load balancing. DES solves this by breaking packets into small, fixed-size cells and distributing them evenly across all fabric paths. This eliminates congestion hotspots, ensures uniform link utilization, and prevents tail latency spikes that can stall AI training jobs.
Can DES handle both AI workloads and general-purpose data center traffic?
Yes. One of DES’s advantages is its flexibility. While it’s optimized for high-performance AI clusters, its deterministic fabric, deep buffering, and high port densities make it suitable for mixed workloads—including storage, traditional compute, and even cloud services. However, its real value shines when handling synchronized, east-west AI traffic.
How does DES ensure lossless operation under heavy loads?
DES uses a combination of Virtual Output Queuing (VOQ) to prevent head-of-line blocking, cell spraying to avoid congestion, deep buffering to absorb microbursts, and distributed hardware scheduling to manage traffic fairly. These mechanisms work together to keep the fabric lossless—even under the intense incast traffic common in AI clusters.
Is DES future-proof given the rise of Ultra Ethernet Consortium (UEC) standards?
Yes. Arista has designed DES hardware to be forward-compatible with UEC specifications as they emerge. While today’s DES runs on RoCEv2 and other Ethernet standards, it’s built to evolve through software updates and integration with next-generation silicon, ensuring your investment is protected as Ethernet adapts to the demands of AI and HPC workloads.