How PCI Express Gives AI Accelerators a Super-Fast Jolt of Throughput

Madhumita Sanyal

Sep 08, 2023 / 5 min read

Every time you get a purchase recommendation from an e-commerce site, receive real-time traffic updates from your highly automated vehicle, or play an online video game, you’re benefiting from artificial intelligence (AI) accelerators. A high-performance parallel computation machine, an AI accelerator is designed to efficiently process AI workloads like neural networks—and deliver near-real-time insights that enable an array of applications.

For an AI accelerator to do its job effectively, data that moves between it (as a device) and CPUs and GPUs (the hosts) must do so swiftly and with very little latency. A key to making this happen? The PCI Express® (PCIe®) high-speed interface.

With every generation, made available roughly every three years, PCIe delivers double the bandwidth—just what our data-driven digital world demands. The latest version of the specification, PCIe 6.0, provides the following (a quick per-generation throughput sketch follows this list):

  • 64 GT/s per pin data transfer rate
  • A new low-power state for greater power efficiency
  • Cost-effective performance
  • High-performance integrity and data encryption (IDE)
  • Backwards compatibility with previous generations
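To put that doubling cadence in numbers, here is a minimal Python sketch that estimates usable x16 throughput per generation. The transfer rates come from the published specifications; the encoding efficiencies are nominal line-code overheads only (the PCIe 6.0 flit efficiency in particular is an approximation), and packet-level protocol overhead is ignored, so treat the outputs as ballpark figures:

    # Approximate usable throughput per PCIe generation for a x16 link.
    # Encoding efficiencies model line-code overhead only: 8b/10b for
    # Gens 1-2, 128b/130b for Gens 3-5, and an approximate 256-byte
    # flit efficiency for Gen 6 (PAM4 signaling).
    GENERATIONS = {
        # generation: (transfer rate in GT/s, encoding efficiency)
        1: (2.5, 8 / 10),
        2: (5.0, 8 / 10),
        3: (8.0, 128 / 130),
        4: (16.0, 128 / 130),
        5: (32.0, 128 / 130),
        6: (64.0, 236 / 256),  # ~236 of 256 flit bytes carry TLP data (approximate)
    }

    def throughput_gb_per_s(gen: int, lanes: int = 16) -> float:
        """Unidirectional bandwidth in GB/s: one transfer moves one bit per lane."""
        rate_gt, efficiency = GENERATIONS[gen]
        return rate_gt * efficiency * lanes / 8

    for gen in GENERATIONS:
        print(f"PCIe {gen}.0 x16: ~{throughput_gb_per_s(gen):.0f} GB/s per direction")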

While PCIe might traditionally be associated with slots on PCs that enable connectivity with peripheral devices such as graphics cards and scanners, it is so much more thanks to its increasing bandwidth. Read on to learn more about how PCIe supports the demanding requirements of AI accelerators.


AI Is Everywhere—and So Is PCIe!

AI is fast becoming pervasive in silicon: more than 40% of chipsets are projected to include AI hardware by 2030, according to GlobalData. The complexity of AI and machine learning (ML) workloads continues to grow. In fact, AI and ML training models double in size every few months. To be effective, AI systems must be able to move large datasets through AI development pipelines without sacrificing performance or latency. Consider these examples of bandwidth-intensive workloads (a rough bandwidth estimate for two of them follows the list):

  • High-definition 4K and 8K video that requires more compute and memory
  • High resolution and high dynamic range, which enable machine vision and real-time perception
  • Multiple camera arrays and 4D sensing, which enable depth and motion inference
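To get a feel for why these workloads strain the interconnect, here is a back-of-the-envelope Python estimate of raw pixel bandwidth. The resolutions, frame rates, and bytes-per-pixel below are illustrative assumptions rather than values from any particular camera or codec:

    # Rough raw (uncompressed) pixel bandwidth for video/camera workloads.
    def stream_gb_per_s(width, height, bytes_per_pixel, fps, cameras=1):
        """Raw pixel bandwidth in GB/s for `cameras` identical streams."""
        return width * height * bytes_per_pixel * fps * cameras / 1e9

    # One 8K stream at 60 fps, assuming HDR pixels packed into 4 bytes
    print(f"8K/60 HDR: ~{stream_gb_per_s(7680, 4320, 4, 60):.1f} GB/s")
    # A six-camera 4K array at 30 fps for surround perception
    print(f"6x 4K/30:  ~{stream_gb_per_s(3840, 2160, 4, 30, cameras=6):.1f} GB/s")

At roughly 8 GB/s and 6 GB/s respectively, a handful of such streams can saturate a PCIe 3.0 x16 link (about 16 GB/s per direction), which is why each new PCIe generation matters for these pipelines.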

All of these trends point to the criticality of AI accelerators in delivering the parallel computation prowess needed for near-instantaneous responses in applications such as voice activation and highly automated vehicles. These high-performance machines can take the form of a single very large chip, such as Cerebras’ Wafer-Scale Engine (WSE) for deep-learning systems, or they can be GPUs, massively multi-core scalar processors, or spatial accelerators. The latter are individual chips that can be combined by the tens to hundreds, creating larger systems capable of processing large neural networks that often require hundreds of petaFLOPS of processing power.


PCIe Bridges the Gap

With their ability to handle AI and ML workloads, AI accelerators enhance the processing power of CPUs in data center servers, with PCIe acting as a bridge between the two. In its role, PCIe provides a number of benefits:

  • Maximize bandwidth as a chip-to-chip interface, whether for AI accelerators in massive compute arrays or at the edge
  • Provide expanded capacity to move data between multiple hosts and multiple devices, as PCIe slots can accommodate various types of expansion cards, including AI accelerators
  • Support parallel processing of workloads across multiple chips via multi-threading
  • Enable universal interoperability between hosts and devices, supporting seamless addition or removal of AI accelerator cards while the system is running (the sketch after this list shows one way to confirm a link’s negotiated speed and width on Linux)
  • Minimize carbon footprints through power-efficient PCIe 6.0 L0p Mode, which enables traffic to run on a reduced number of lanes to lower power
  • Provide data confidentiality, integrity, and replay protection to ensure that data on the wire is secure from observation, tampering, deletion, insertion, or replay of packets
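As referenced in the list above, here is a minimal, Linux-only Python sketch that reads a device’s negotiated link speed and width from the standard sysfs attributes, a quick sanity check that an accelerator actually trained to the rate and lane count you expect. The device address is a placeholder; substitute your own, as reported by lspci:

    # Read negotiated vs. maximum PCIe link speed/width from Linux sysfs.
    from pathlib import Path

    def link_status(bdf: str = "0000:01:00.0") -> dict:  # placeholder address
        dev = Path("/sys/bus/pci/devices") / bdf
        def read(name: str) -> str:
            return (dev / name).read_text().strip()
        return {
            "current_speed": read("current_link_speed"),  # e.g. "16.0 GT/s PCIe"
            "current_width": read("current_link_width"),  # e.g. "16"
            "max_speed": read("max_link_speed"),
            "max_width": read("max_link_width"),
        }

    if __name__ == "__main__":
        print(link_status())

A link reporting a lower current speed or width than its maximum has down-trained, whether deliberately for power savings or because of a signal problem.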

Silicon-proven PCIe physical layer (PHY) and controller IP with IDE security are key to reaping the full benefits of this secure high-speed interface, as is the expertise to help guide your designs. Power and signal integrity considerations illustrate why expert support matters. AI acceleration generally needs many high-speed lanes, and when multiple PCIe lanes switch simultaneously they draw a large surge of current, making power integrity a concern. If issues such as IR drop occur during simultaneous switching, the link cannot sustain full performance. Signal integrity is important as well: the signal transmitted between the AI accelerator and the CPU must arrive intact. Synopsys has in-house power and signal integrity experts who model these multi-lane scenarios to guide customers on where to place the PHY supporting PCIe as they design their chips and, ultimately, to achieve optimal performance.
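To make the simultaneous-switching concern concrete, here is an illustrative V = IR + L·di/dt estimate of supply droop when many lanes toggle at once. Every number below is an assumed placeholder chosen to show the shape of the problem, not a characterized value from any real power delivery network:

    # Crude supply-droop estimate for simultaneous switching:
    # static IR drop plus inductive droop (L * di/dt) across the PDN.
    def supply_droop_mv(lanes, amps_per_lane, r_ohms, l_henries, ramp_seconds):
        """Total droop in millivolts when all lanes switch together."""
        i_total = lanes * amps_per_lane                 # aggregate switching current
        ir_drop = i_total * r_ohms                      # resistive component
        l_didt = l_henries * (i_total / ramp_seconds)   # inductive component
        return (ir_drop + l_didt) * 1e3

    # 16 lanes at ~50 mA each through a 5 mOhm PDN with 10 pH loop
    # inductance, ramping in 1 ns (all assumed values):
    print(f"~{supply_droop_mv(16, 0.05, 0.005, 10e-12, 1e-9):.0f} mV droop")

Even toy numbers like these show how droop scales with lane count and edge rate, which is why PHY placement and PDN design are analyzed together.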


Synopsys, whose PCIe experts are also key contributors to PCI-SIG, the consortium that writes the PCIe specifications, is an industry leader in PCIe IP and PCIe protocol verification solutions (including verification IP). Our PCIe portfolio features components with backwards compatibility across generations.

Beyond our end-to-end PCIe IP solution, our IP portfolio also includes memory, processor, and other interface IP that are beneficial for AI accelerators. Synopsys.ai™, our full-stack, AI-driven electronic design automation (EDA) suite, includes functions that can dramatically speed the design of specialized AI accelerators. On the verification side, AI SoCs require faster hardware-assisted verification solutions at pre-silicon. Synopsys ZeBu Server 5 and HAPS® systems provide the industry’s fastest and highest capacity hardware-assisted verification (HAV) systems, facilitating all system-level validation use cases for complex SoC designs.

Where Does PCIe Go from Here?

The next generation of PCIe, PCIe 7.0, is expected to double the data rate yet again to 128 GT/s per pin, pushing the raw bandwidth of a x16 link into the blazingly fast 2.048 Tb/s realm in each direction. As AI is integrated into more devices and systems, anything that can feed its need for speed is welcome news. For AI accelerators today and in the future, the continually evolving PCIe high-speed interface looks to be a ready partner in bringing greater intelligence into our everyday lives.
