Dell AMD Instinct Series GPU Cluster with Dell Networking

Accelerate innovation. With AMD Instinct GPUs and Broadcom Ethernet, the Dell PowerEdge XE9680 powers faster model training, efficient scaling, and simplified operations for AI at scale.

Infrastructure Performance at Scale

The combination of AMD’s latest Instinct Series accelerators with Dell’s enterprise-grade infrastructure represents a compelling alternative to traditional GPU architectures, particularly for organizations prioritizing cost optimization. This testing initiative evaluates the Dell PowerEdge XE9680 platform equipped with AMD Instinct Series GPUs, interconnected through Broadcom Thor2 network controllers and Dell PowerSwitch fabric infrastructure.


Our testing across single-GPU, single-node, and multi-node configurations of up to 8 nodes (64 GPUs) demonstrates that AMD Instinct Series accelerators deliver competitive AI performance while providing significant cost advantages. The integration of Broadcom Thor2 NICs and Dell PowerSwitch Z9864F switches, powered by Broadcom Tomahawk 5 ASICs, transforms Ethernet into a high-performance GPU interconnect. With hardware offloads for collective operations, congestion-aware traffic shaping, and near-line-rate bandwidth efficiency, Broadcom technologies eliminate networking bottlenecks and establish a fabric that scales predictably as clusters grow.
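As a rough illustration of what "scales predictably" means in practice, the sketch below computes scaling efficiency: the measured speedup divided by the ideal linear speedup as GPUs are added. The function names are hypothetical, and the throughput figures are placeholders chosen for the example, not results from this testing. The only value taken from the text is 8 GPUs per node (8 nodes, 64 GPUs).

```python
# Minimal sketch of a scaling-efficiency calculation.
# Assumption: throughput is any "higher is better" metric (e.g. samples/sec).

GPUS_PER_NODE = 8  # from the text: 8 nodes correspond to 64 GPUs


def gpu_count(nodes: int) -> int:
    """Total GPUs in a cluster of `nodes` servers."""
    return nodes * GPUS_PER_NODE


def scaling_efficiency(base_throughput: float, base_gpus: int,
                       throughput: float, gpus: int) -> float:
    """Measured speedup divided by ideal (linear) speedup; 1.0 is perfect scaling."""
    ideal_speedup = gpus / base_gpus
    measured_speedup = throughput / base_throughput
    return measured_speedup / ideal_speedup


# Placeholder throughputs keyed by node count (illustrative only, NOT measured data).
placeholder_throughput = {1: 100.0, 2: 190.0, 4: 360.0, 8: 680.0}

base = placeholder_throughput[1]
for nodes, tput in placeholder_throughput.items():
    eff = scaling_efficiency(base, gpu_count(1), tput, gpu_count(nodes))
    print(f"{nodes} node(s), {gpu_count(nodes)} GPUs: efficiency {eff:.2f}")
```

An efficiency that stays close to 1.0 as the node count rises indicates that the interconnect is not limiting the workload. A sharp drop at higher node counts would point to a communication bottleneck.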


Importantly, this solution uses Ethernet as the GPU interconnect fabric. Unlike proprietary alternatives, Ethernet provides an open, standards-based path that accelerates deployment, reduces operational complexity, and aligns with existing enterprise infrastructure. Its ubiquity directly supports total cost of ownership benefits by lowering acquisition costs, streamlining maintenance, and simplifying staff training. Combined with Broadcom’s silicon roadmap to 800GbE and beyond, Ethernet positions these clusters to keep pace with evolving AI training and inference workloads.


Beyond compute and interconnect, Ethernet also underpins the storage fabric in these deployments, providing high-performance, easily shared access to training and inference datasets. By standardizing on Ethernet for both GPU interconnect and storage connectivity, organizations eliminate the need for separate storage networks, reducing complexity and operational overhead. Broadcom’s advanced Ethernet technologies provide predictable throughput and low-latency access to parallel file systems, enabling data to flow seamlessly between compute nodes and storage servers. This shared, lossless fabric simplifies management, accelerates deployment of new storage resources, and lets storage scale in step with compute. It does so while drawing on the same Ethernet expertise, monitoring, and operational tooling already in place.

Research commissioned by: Dell Technologies