How to use register retiming to optimize your FPGA designs
By Darren Zacher, Mentor Graphics
pldesignline.com, Dec. 14, 2005
As frequencies continue to increase in complex FPGA designs, finding the optimal point at which to insert pipeline stages to manage routing delay is not always easy. Register retiming comes in very handy in these situations. This article outlines recommended practices that show you how to qualify an FPGA-based design as a good candidate for register retiming, along with specific examples for optimal performance results.
As an increasing number of high-performance designs are now being realized using programmable logic platforms, designers need to figure out how working with these platforms differs from traditional cell-based design processes. The effect of routing delay is a case in point.
As one of the discerning designers among the growing legions of engineers tackling FPGA designs today, you will inevitably find yourself increasingly restricted within the fixed programmable interconnect network. Not having complete freedom with respect to signal routing can become a rather tricky proposition.
In many cases, achieving performance requirements hinges on sequential elements being optimally placed so as to minimize and balance path delays. Historically, when the routing performance did not comfortably meet requirements, you simply inserted additional pipeline stages manually, at strategic points within large combinatorial logic paths, to reduce and balance path delays. Accommodating these extra pipeline stages was generally not an issue, as most programmable logic architectures offered ample sequential elements.
As design frequencies continue to increase, however, determining the optimal point for pipeline stage insertion may not be so easy. One way to overcome this is to make the most of the algorithms offered in today's EDA tools, such as register retiming, which is an optimization strategy that leverages positive slack on one side of a sequential element to address or balance negative slack on the other.
The register retiming algorithm works by literally "moving" registers across portions of combinatorial logic so that the worst-case combinatorial delays on the input and output sides of the register are more balanced. Of course, in order to move the register, the algorithm must take great care to preserve all reset, preset, and enable functionality associated with the register's original position within the circuit.
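The balancing idea can be illustrated with a small Python sketch. This is not how any particular synthesis tool implements retiming. It assumes a hypothetical linear path of combinatorial blocks with made-up delays in nanoseconds, a single movable register, and a hypothetical 5 ns clock period, and it ignores setup time, clock-to-output delay, and routing. All function names are illustrative.

```python
from typing import List, Tuple


def stage_delays(delays: List[float], reg_pos: int) -> Tuple[float, float]:
    """Combinatorial delay in front of and behind a register that sits
    after the first `reg_pos` blocks of the path."""
    return sum(delays[:reg_pos]), sum(delays[reg_pos:])


def slacks(delays: List[float], reg_pos: int, clock_period: float) -> Tuple[float, float]:
    """Slack on the input side and output side of the register
    (simplified: slack = clock period - combinatorial delay)."""
    before, after = stage_delays(delays, reg_pos)
    return clock_period - before, clock_period - after


def best_register_position(delays: List[float]) -> int:
    """Try every position and keep the one whose slower side is fastest."""
    return min(range(len(delays) + 1),
               key=lambda pos: max(stage_delays(delays, pos)))


# Hypothetical block delays in ns (assumed values, not from the article).
path = [1.0, 1.5, 2.0, 0.5, 1.0, 1.5]
clock_period = 5.0

# Register as originally described: 6.0 ns of logic in front, 1.5 ns behind.
original_pos = 5
print("original slacks:", slacks(path, original_pos, clock_period))

# Move the register backward across logic to balance the two sides.
retimed_pos = best_register_position(path)
print("retimed position:", retimed_pos)
print("retimed slacks:", slacks(path, retimed_pos, clock_period))
```

In this toy case the original placement has negative slack on the input side and generous positive slack on the output side. Moving the register backward across two blocks leaves both sides with non-negative slack, which is the trade the article describes.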
It's important to be aware that each synthesis tool has its own implementation of a register retiming algorithm. A good implementation, such as that used in the Precision Synthesis tool from Mentor Graphics, will allow you to move registers either forward or backward across combinatorial logic in order to reduce negative path slack. As seen in Fig 1, register retiming can lead to either an increase or a decrease in the number of flip-flops in the design. If an increase occurs, accommodating these extra flip-flops is generally not an issue, as most programmable logic architectures today still offer an ample supply of sequential elements. Moreover, while the number of flip-flops may change, the number of pipeline stages does not; in fact, register retiming is constrained to operate only in a way that preserves design functionality at the top-level design ports. Hence, the algorithm will only use the pipeline stages that are already described in the circuit.
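To make the flip-flop count versus pipeline stage distinction concrete, here is a minimal cycle-by-cycle Python model. It is an illustrative sketch, not taken from the article's figure: it assumes a hypothetical two-input AND gate with a flip-flop on each input, all flip-flops resetting to 0, and compares it with the forward-retimed version that has a single flip-flop on the gate output.

```python
from typing import Iterable, List, Tuple


def original_circuit(inputs: Iterable[Tuple[int, int]]) -> List[int]:
    """Two flip-flops, one on each AND input (both assumed to reset to 0)."""
    ra, rb = 0, 0
    outputs = []
    for a, b in inputs:
        outputs.append(ra & rb)  # value seen at the output port this cycle
        ra, rb = a, b            # clock edge: both flip-flops capture inputs
    return outputs


def retimed_circuit(inputs: Iterable[Tuple[int, int]]) -> List[int]:
    """One flip-flop on the AND output (registers moved forward).
    Its reset value is AND(0, 0) = 0, matching the original reset state."""
    r = 0
    outputs = []
    for a, b in inputs:
        outputs.append(r)        # value seen at the output port this cycle
        r = a & b                # clock edge: flip-flop captures the AND result
    return outputs


# Hypothetical input sequence, one (a, b) pair per clock cycle.
stimulus = [(1, 1), (1, 0), (0, 1), (1, 1), (0, 0), (1, 1)]

before = original_circuit(stimulus)
after = retimed_circuit(stimulus)
print(before)
print(after)
print("same behaviour at the ports:", before == after)
```

Both versions have exactly one pipeline stage between the inputs and the output, so the output sequence is identical cycle for cycle, while the flip-flop count drops from two to one. Running the transformation in the opposite direction (backward retiming) would raise the count from one to two with the same port-level behaviour. Note that the reset value of the moved register has to be recomputed through the gate, which is one reason the algorithm must handle reset and preset logic carefully.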