by Stephen Peltan, R&D Sr. Staff Engineer - Synopsys
PCI Express, the next generation of the PCI bus, is being widely adopted in today’s high-performance PCs, servers, and embedded applications. This high-bandwidth protocol keeps the same software interface and many of the key features of PCI, but adds a number of differences and new features. The biggest changes are the use of serial data transfers and gigahertz clock speeds, which make the protocol more complex but provide a significant improvement in data throughput.
This application note provides a very brief introduction to the emerging PCI Express protocol and explains how selecting the right digital and mixed signal IP can accelerate the implementation of this new standard in your designs.
PCI Express Overview
PCI Express provides low pin count, high reliability, and high speed data transfer at a rate of 2.5 GBits per second and up, for serial links on backplanes and printed wiring boards.
PCI Express System Example - PC Motherboard Based System
In this example of a PCI Express system, the dashed lines represent PCI Express links, the purple boxes indicate plug-in cards, and the other boxes are components found on a system card.
The black boxes represent PCI Express IP, which consists of a digital component and a mixed signal component.
The digital portion may implement one or more of the following:
- Root Complex Port (RC) initializes and manages the PCI Express fabric
- Switch Port (SW) routes data between multiple PCI Express links
- Endpoint Port (EP) is associated with an I/O device and terminates a PCI Express hierarchy
The mixed signal block is:
- PHY, which performs analog/digital conversion as well as serialization/deserialization
You can choose the bandwidth of each PCI Express link by selecting the number of lanes.
The PCI Express IP handles link initialization, error recovery, power management, data buffering, etc.
PCI Express System Example - Chip to Chip System
In this example, all circuits are on a single printed wiring board. The “complex endpoint” chip includes two independent PCI Express links. The “endpoint / root” chip includes an additional type of digital PCI Express IP, DM (dual mode), which can operate as a root port, as shown here. It can also operate as an endpoint, for example, when plugged into a PC motherboard slot.
PCI Express Protocol Stack
The base PCI Express protocol stack is common to all of the IP shown in the examples. PCI Express IP hides most of the complexity from your application logic. See the following diagram.
PCI Express Links
As shown in the diagram below, each PCI Express link:
- is dual unidirectional with no sideband signals
- is serial, with differential signaling
- includes embedded clocking
- operates at a scalable frequency (2.5 Gb/sec. initially)
- can be implemented in low-voltage silicon at 0.25 µm and beyond
Data is transferred in packets, which include an address and a variable-size data payload.
Reliability is provided by character checks, format checks, CRC code, automatic retransmission in case of error, and exchange of buffer space information, in the form of credits.
A handshake protocol to power down an idle link is included, as well as messages to handle interrupts, error reporting, and hot-plug events.
Configuration registers are available to customize link behavior.
Wide ports automatically configure to narrower ports, as required.
Quality of service and isochronous traffic are supported via optional virtual channels.
Digital IP to PHY Interface (PIPE)
Transmit and receive data, as well as status and control, are transferred between the digital and mixed signal IP on the PIPE interface. There are two standard options for transferring data across the PIPE interface:
- One byte per clock cycle per PCI Express lane. In this case, the interface and the digital IP operate at 250 MHz, or
- Two bytes per clock cycle per PCI Express lane. In this case, the interface and the digital IP operate at 125 MHz.
Selecting PCI Express IP for Your Application
To add a PCI Express port to your chip you must:
- select the lane width
- select mixed signal IP including PIPE interface frequency and optional features
- select digital IP type (RC, EP, etc.) and optional features
Selecting the Lane Width
An N-lane PCI Express port provides N x 2.5 GBits per second of raw throughput in each direction. Because of 8b/10b encoding, packet payload size, and link overhead, the actual throughput varies. The following table gives some examples.
TABLE 1-1: PCI Express Link Throughput Examples
Real throughput examples (GBits per second), by packet payload size in bytes
| Lane Width | 16 | 128 | 256 | 4096 |
|---|---|---|---|---|
| x1 | 0.5 | 1.7 | 1.7 | 2.0 |
| x4 | 2.0 | 6.8 | 7.0 | 7.8 |
| x8 | 4.0 | 13.7 | 14.0 | 15.7 |
| x16 | 7.9 | 27.4 | 28.0 | 31.4 |
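As a rough illustration of how 8b/10b encoding and per-packet overhead reduce the raw link rate, here is a minimal Python sketch. The function name `estimate_throughput_gbits` and the default `overhead_bytes=20` (framing, sequence number, a three-doubleword header, and link CRC) are assumptions made for illustration. The model ignores acknowledgement and flow-control traffic, so it is not expected to reproduce the values in Table 1-1.

```python
def estimate_throughput_gbits(lanes, payload_bytes, overhead_bytes=20,
                              raw_gbits_per_lane=2.5):
    """First-order estimate of one-direction payload throughput in Gbit/s.

    overhead_bytes is an assumed per-packet cost, not a value taken from
    this application note.
    """
    if lanes < 1 or payload_bytes < 1 or overhead_bytes < 0:
        raise ValueError("lanes and payload_bytes must be positive")
    raw = lanes * raw_gbits_per_lane           # line rate on the wire
    after_encoding = raw * 8 / 10              # 8b/10b: 8 data bits per 10 line bits
    efficiency = payload_bytes / (payload_bytes + overhead_bytes)
    return after_encoding * efficiency
```

Under these assumptions, `estimate_throughput_gbits(4, 128)` evaluates to roughly 6.9. Small payloads are penalized most, because the fixed overhead is paid on every packet.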
Remember that the values in Table 1-1 are only reasonable examples; another PCI Express application note will explain in detail how to calculate these numbers for your application.
Selecting Mixed Signal PCI Express IP (PHY)
Mixed signal IP is generally sold as a “hard macro”, which is tailored to your chip manufacturer’s process.
Both the PCI Express and PIPE interfaces are standard. You can usually choose:
- the number of PCI Express lanes
- either a one byte or two byte PIPE interface. As explained above, you will determine the PIPE interface and digital IP operating frequency when you make this selection.
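The relationship between PIPE width and operating frequency can be sketched as follows. This is an illustrative calculation only: the names `pipe_clock_mhz` and `pipe_data_width_bits` are hypothetical, and the sketch assumes the 2.5 GBit per second line rate with 8b/10b encoding, which yields 250 million 8-bit symbols per second per lane.

```python
SYMBOL_RATE_MHZ = 250.0  # 2.5 Gbit/s line rate / 10 line bits per 8b/10b symbol


def pipe_clock_mhz(bytes_per_clock):
    """Clock frequency of the PIPE interface and digital IP for a given width."""
    if bytes_per_clock not in (1, 2):
        raise ValueError("this sketch covers only the one and two byte options")
    return SYMBOL_RATE_MHZ / bytes_per_clock


def pipe_data_width_bits(bytes_per_clock, lanes):
    """Total data bits crossing the PIPE interface per clock, in one direction."""
    return 8 * bytes_per_clock * lanes
```

Both options move the same amount of data per lane; the two byte option simply trades a wider data path for a lower clock frequency.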
In addition to the standard lane count and PIPE width options, your PHY vendor may offer additional features:
- lower cost ICs via smaller die size (some PHYs are 50% smaller than others)
- better performance margin (some have twice the sensitivity and jitter margin)
- yield and reliability
- built-in diagnostics and system test
For example, the Synopsys PCI Express PHYs include unique built-in diagnostics, which provide on-chip visibility into the actual performance of your 2.5 GBit per second links. The diagram at the right shows actual scope data from the Synopsys PHY.
Selecting Digital PCI Express IP Types
Digital IP for Basic Applications
Based on your application, operating frequency and lane width, you may select PCI Express digital IP optimized for your application. Here are some examples of the range of Synopsys endpoint (EP), switch port (SW), root port (RC), and dual mode (DM) digital IP available to you.
TABLE 1-2: Synopsys Digital Implementation IP Examples
| IP Type Example (EP, SW, RC, DM) | Explanation | Comments |
|---|---|---|
| 32-bit optimized x1 | 32 bit data path, one lane (x1). One lane only operation allows extra multi-lane logic to be removed for lower gate count and power. | Supports a single lane (x1) with 8-bit PHY interface @250MHz, or 16-bit PHY interface @125MHz |
| 32-bit | 32 bit data path, one lane (x1) to four lanes (x4) | x1 to x4 with 8-bit PHY interface @250MHz, or x1 to x2 with 16-bit PHY interface @125MHz |
| 64-bit | 64 bit data path, one lane (x1) to eight lanes (x8) | x1 to x8 with 8-bit PHY interface @250MHz, or x1 to x2 with 16-bit PHY interface @125MHz |
| 128-8 bit | 128 bit data path, one lane (x1) to 16 lanes (x16) | Supports 8-bit PHY interface @250MHz |
| 128-16 bit | Pin selectable - root port or endpoint, one lane (x1) to eight lanes (x8) | Supports 16-bit PHY interface @125MHz |
You can trade off gate count, operating frequency, and power versus maximum lane width, i.e. throughput.
Digital IP for Advanced Applications - Overview
It is relatively simple to select among root, switch, endpoint, and dual-mode ports if your application fits one of the examples shown at the beginning of this application note.
However, if you have an advanced application, you may want to know more about the differences. The following sections are a summary; see the DesignWare IP product databooks for complete information.
Digital IP - Upstream vs. Downstream Differences
A PCI Express hierarchy contains one or more PCI Express links, and includes a root port, optional switch ports, and endpoint ports.
Each link in the hierarchy must include exactly one “downstream” facing port and one “upstream” facing port.
- Root ports are always downstream
- Endpoints are always upstream.
- Switch ports may be configured to be either - see the PC system example at the beginning of this application note. In a switch device, the switch port closest to the root is always an upstream port; all the other switch ports are downstream.
Why does this matter?
- When the digital IP initializes the PCI Express link, the upstream and downstream ports use slightly different protocols to automatically configure the link for width, and for lane and data reversal.
- During idle times, the digital IP can autonomously transition the link into a deep low power state. This transition is requested by the downstream port, and “approved” by the upstream port.
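The direction rules above can be expressed as a small consistency check. The following Python sketch is illustrative only; the enum and function names are hypothetical and are not part of any Synopsys deliverable.

```python
from enum import Enum


class Facing(Enum):
    UPSTREAM = "upstream"
    DOWNSTREAM = "downstream"


class PortType(Enum):
    ROOT = "RC"
    SWITCH = "SW"
    ENDPOINT = "EP"


def port_facing(port_type, closest_to_root=False):
    """Return the direction a port faces, following the rules listed above.

    closest_to_root only matters for switch ports.
    """
    if port_type is PortType.ROOT:
        return Facing.DOWNSTREAM
    if port_type is PortType.ENDPOINT:
        return Facing.UPSTREAM
    return Facing.UPSTREAM if closest_to_root else Facing.DOWNSTREAM


def link_is_valid(facing_a, facing_b):
    """A link needs exactly one upstream and one downstream facing port."""
    return {facing_a, facing_b} == {Facing.UPSTREAM, Facing.DOWNSTREAM}
```

For instance, a root port connected to a switch port that has been configured as downstream fails the check, which mirrors the requirement that the switch port closest to the root be the upstream port.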
Digital IP - Configuration Registers Differences
Each instance of PCI Express IP contains a set of configuration registers:
- Root and switch ports contain “Type 1” configuration registers
- Endpoint ports contain “Type 0” configuration registers
Type 0 configuration registers:
- indicate to system software that this device is the “end” of a PCI Express hierarchy
- contain a full set of so-called Base Address Registers (BARs) that help you filter and address map received packets
Type 1 configuration registers:
- indicate to system software that there are more devices to discover beyond this device
- contain limit registers to assist in packet routing to other devices
- contain only minimal BARs for packets mapped to this device
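System software tells the two layouts apart by reading the Header Type register in the standard PCI configuration header. The sketch below decodes that byte from a raw copy of configuration space. It assumes the standard PCI header layout (Header Type at offset 0x0E, six BARs in a Type 0 header, two in a Type 1 header); the function name and the returned dictionary are hypothetical.

```python
HEADER_TYPE_OFFSET = 0x0E  # standard PCI configuration header offset


def decode_header_type(config_space):
    """Classify a device from the first bytes of its configuration space."""
    if len(config_space) <= HEADER_TYPE_OFFSET:
        raise ValueError("configuration space image is too short")
    value = config_space[HEADER_TYPE_OFFSET]
    layout = value & 0x7F                 # bits 6:0 select the header layout
    multi_function = bool(value & 0x80)   # bit 7 flags a multi-function device
    if layout == 0x00:
        kind, bar_count = "Type 0", 6     # endpoint: full set of BARs
    elif layout == 0x01:
        kind, bar_count = "Type 1", 2     # root or switch port: minimal BARs
    else:
        raise ValueError("header layout not covered by this sketch")
    return {"kind": kind, "bar_count": bar_count,
            "multi_function": multi_function}
```

A Type 1 result tells enumeration software to keep searching beyond the device; a Type 0 result marks the end of that branch of the hierarchy.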
Some other configuration register differences:
- Endpoint devices may contain multiple copies of the configuration registers. This is used to implement “multi-function” devices.
- Root ports include extra registers to summarize error status for a PCI Express hierarchy.
- Root and switch ports contain registers to manage “hot-plug” events
Digital IP - Configuration Transaction Differences
Configuration transactions can only be initiated by root ports, and can only be responded to by endpoint ports and upstream switch ports.
Configuration transactions are used to:
- Determine the topology of a PCI Express hierarchy
- Initialize the configuration registers after a PCI Express link is initialized. Many values can be initialized in hardware, e.g., using Synopsys coreConsultant. However, other values, such as memory space enable and base addresses, must be initialized via configuration transactions.
- Change the power state of a device
- Read error report registers
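Topology discovery is conventionally done as a depth-first walk that assigns bus numbers to every Type 1 port. The following Python sketch models that walk on an in-memory tree instead of real configuration transactions. The dictionary layout, the `assign_buses` and `build_example` names, and the example topology are all hypothetical.

```python
def assign_buses(port, primary_bus, secondary_bus):
    """Number the buses below one Type 1 port, depth first.

    Returns the highest bus number used at or below this port.
    """
    port["primary"] = primary_bus
    port["secondary"] = secondary_bus
    highest = secondary_bus
    for child in port["children"]:
        if child["header"] == "type1":
            # More devices lie beyond this child: give it the next free bus.
            highest = assign_buses(child, secondary_bus, highest + 1)
        else:
            # Type 0 device: the hierarchy ends here.
            child["bus"] = secondary_bus
    port["subordinate"] = highest
    return highest


def build_example():
    """Root port -> switch (one upstream, two downstream ports) -> endpoints."""
    def endpoint(name):
        return {"name": name, "header": "type0"}

    def bridge(name, children):
        return {"name": name, "header": "type1", "children": children}

    return bridge("root port", [
        bridge("switch upstream port", [
            bridge("switch downstream port A", [endpoint("endpoint A")]),
            bridge("switch downstream port B", [endpoint("endpoint B")]),
        ]),
    ])


root = build_example()
assign_buses(root, primary_bus=0, secondary_bus=1)
```

In this example the root port ends up with secondary bus 1 and subordinate bus 4, so any packet addressed to buses 1 through 4 is routed through it.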
Digital IP - Interrupt and Error Message Differences
As described in detail in another application note, PCI Express devices emulate PCI interrupt wires (INTA, INTB, ...) by sending messages towards the root port:
- Endpoints and upstream facing switch ports may initiate these messages
- Downstream and upstream facing switch ports pass these messages through to switch core logic
- Root ports may receive these messages
Note that other types of interrupt messages (MSI, MSI-X) do not have these restrictions.
Error messages are sent by PCI Express devices in response to link errors. Endpoints initiate these messages, switch ports initiate them and pass them on, and root ports receive them.
Implementing PCI Express into Your Design - An Introduction
The following diagram shows the major features of a simple endpoint design. See the DesignWare PCI Express IP databooks for details.
The Replay Buffer and Rx Buffer are single and dual port RAMs, respectively. All logic to manage these buffers is included in the endpoint IP. For the receive (Rx) buffer, you may choose store and forward, cut-through, or bypass (no buffer) packet storage.
The Tx FIFO is optional - if your Tx DMA can continuously supply the data for an entire packet, the FIFO is not necessary.
The Internal Bus Adaptor is optional - it is only required if you wish to update so-called “read only” fields in the IP configuration registers before link communication begins. As an alternative, all of these fields can be configured at synthesis time with the Synopsys coreConsultant tool.
The Digital IP interfaces shown in the diagram may be configured to fit your application:
- The Tx Client builds packets for you from data, address, and other attributes presented by your logic. It also gates your packets according to the PCI Express rules for buffer space (“credits”) at the other end of the link. You can configure the IP to have up to three Tx Clients.
- The Rx Target interface disassembles validated packets and presents them to you as data, address, and other attributes.
- Use of the External Local Bus interface is optional - it provides a convenient way for the processor at the other end of the link to read and write your local application control registers. No additional application hardware is required, in this case. The endpoint IP can be configured to map these register read/writes to the Local Bus interface.
- The Data Bus Interface provides “back-door” local access to the endpoint configuration registers. Usage of this interface is also optional. It was discussed above with respect to the Internal Bus Adaptor.
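The credit gating performed by the Tx Client can be pictured with a simplified model. The sketch below is an assumption-laden illustration, not the IP's implementation: it uses a single credit pool and plain counters, whereas PCI Express tracks separate pools per traffic type. The class and method names are hypothetical. The one protocol fact it relies on is that one data credit covers 16 bytes of payload.

```python
class CreditGate:
    """Tracks receive buffer credits advertised by the other end of the link."""

    DATA_CREDIT_BYTES = 16  # one data credit covers 16 bytes of payload

    def __init__(self, header_credits, data_credits):
        self.header_credits = header_credits
        self.data_credits = data_credits

    @classmethod
    def data_credits_needed(cls, payload_bytes):
        # Round up to whole credits.
        return -(-payload_bytes // cls.DATA_CREDIT_BYTES)

    def try_send(self, payload_bytes):
        """Consume credits and return True if the packet may be sent now."""
        needed = self.data_credits_needed(payload_bytes)
        if self.header_credits < 1 or self.data_credits < needed:
            return False  # hold the packet until credits are returned
        self.header_credits -= 1
        self.data_credits -= needed
        return True

    def credits_returned(self, header_credits, data_credits):
        """Record buffer space freed by the receiver."""
        self.header_credits += header_credits
        self.data_credits += data_credits
```

A packet that does not fit in the remaining credits is held back rather than dropped, which is how the link avoids overrunning the receive buffer at the other end.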
PCI Express is a robust interface, and selecting the right IP can help solve the complexities of implementing the protocol in your designs and accelerate your development process. The DesignWare IP for PCI Express is silicon proven in customer designs and is the industry standard: it powers the PCI-SIG protocol test card and was the first to pass the compliance test.
The DesignWare IP has gone through extensive interoperability testing with third party PHYs, verification IP and hardware. By providing a complete solution for PCI Express including digital controllers, verification IP, and mixed signal PHY IP, Synopsys helps lower your integration risk and overall deployment costs, while saving you significant time and effort.
For more information on DesignWare IP, visit www.synopsys.com/designware