By Shridhar Laddha, Barun Kumar De
SoftJin Technologies Pvt. Ltd.
As mask costs rise and the performance gap between FPGAs and ASICs narrows, the FPGA is evolving into a strong platform not only for prototyping but also for real-time design. One major challenge remains, however: using FPGAs for large SoC designs. A key difference between an FPGA and an ASIC is the availability of resources. In an ASIC, designers need not worry as much about the number of interconnections between sub-modules, and they have more flexibility in the number and placement of gates. An FPGA, by contrast, has a fixed number of I/Os and CLBs within which the design must fit, which limits the size of the SoC that can be implemented on a single FPGA. This limitation can be overcome by using multiple FPGAs to implement the design. Using multiple FPGAs, however, implies a multi-chip design, which raises several issues that need to be addressed.
Role of EDA tools
Several of these issues can be handled effectively by the EDA tools used in multi-FPGA design. The EDA industry needs to consider these new challenges and provide the tools that make multi-FPGA design practical. Other issues can be solved by adopting a more intelligent design methodology. In this paper we try to identify how EDA tool vendors can contribute, both through tool development and by defining design methodology, to improving the efficiency of multi-FPGA design.
System Level Design
There are several areas where efficient ESL tools can affect the performance of a multi-FPGA design. For example:
More Parallel Processing: Implementing an SoC on multiple FPGAs gives the opportunity to use multiple processors (soft or hard). EDA tools such as compilers should be able to take advantage of this and execute more tasks in parallel to enhance system performance. The compiler should also implement proper load balancing between those processors so that idle time is minimized.
Run-time Reconfiguration: Although run-time reconfiguration is not widely used, because most FPGA devices need substantial time to reconfigure, the designer can consider it as a way to reduce the hardware resources needed to implement an SoC on a multi-FPGA platform. The underlying idea is that if the SoC has two processes that are mutually exclusive and are separated by a sufficient time gap, the same FPGA can implement both processes by being reconfigured at run time.
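As a minimal sketch of the load-balancing idea, the following assigns tasks to processors with a greedy longest-task-first heuristic. The task names, their costs and the function `balance_tasks` are illustrative assumptions, not part of any particular compiler; a real tool would also have to respect data dependencies between tasks.

```python
import heapq

def balance_tasks(task_costs, num_processors):
    """Greedy longest-task-first assignment of tasks to processors."""
    # Min-heap of (accumulated load, processor index): least-loaded on top.
    heap = [(0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_processors)}
    # Place the most expensive tasks first; ties are broken by task name.
    for name, cost in sorted(task_costs.items(), key=lambda kv: (-kv[1], kv[0])):
        load, p = heapq.heappop(heap)
        assignment[p].append(name)
        heapq.heappush(heap, (load + cost, p))
    loads = {p: sum(task_costs[t] for t in tasks) for p, tasks in assignment.items()}
    return assignment, loads

# Hypothetical task costs in arbitrary time units.
tasks = {"fft": 8, "filter": 7, "dma": 6, "crc": 5, "uart": 4}
assignment, loads = balance_tasks(tasks, 2)
print(loads)  # {0: 17, 1: 13}
```

The heuristic is fast but not optimal: for this input a 15/15 split exists, so a production scheduler may want a refinement pass after the greedy placement.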
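The condition described above can be checked with a short sketch. It assumes each process is described by the time windows in which it is active, and that a single figure, `reconfig_time`, covers a full device reconfiguration; both the function name and the numbers are hypothetical.

```python
def can_share_fpga(windows_a, windows_b, reconfig_time):
    """Return True if processes A and B can time-share one FPGA.

    windows_a, windows_b: lists of (start, end) activity windows.
    reconfig_time: time needed to reconfigure the device, same units.
    """
    for a_start, a_end in windows_a:
        for b_start, b_end in windows_b:
            # Idle gap between the two windows; negative if they overlap.
            gap = max(a_start, b_start) - min(a_end, b_end)
            if gap < reconfig_time:
                return False
    return True

# Process A runs in [0, 10] and [50, 60]; process B runs in [20, 30].
print(can_share_fpga([(0, 10), (50, 60)], [(20, 30)], reconfig_time=5))   # True
print(can_share_fpga([(0, 10), (50, 60)], [(20, 30)], reconfig_time=15))  # False
```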
Partitioning is perhaps one of the most important and most discussed steps in multi-FPGA design and prototyping. Well-thought-out partitioning not only reduces the total cost of the system but also improves its performance. Some of the widely discussed partitioning techniques are:
Identification of concurrent tasks: As the number of FPGAs used to implement the design increases, so does the complexity that the partitioning tool has to handle. The design needs to be partitioned so that each FPGA is used to the maximum possible extent, thus reducing the number of FPGAs needed. If the FPGAs are used for prototyping or verification rather than for the final implementation, the SoC can be analyzed further for concurrent tasks, and the same FPGA can be reused for modules that do not need to be verified concurrently.
Logic Swapping: I/O is one of the resource constraints in multi-FPGA partitioning. In ASIC design, I/O multiplexing is used extensively for the primary pins but seldom within sub-module design, because only the number of primary input/output pins has a direct impact on die size and cost. In multi-FPGA design, however, the number of interconnects between sub-modules also matters, because it determines the number of I/Os used on each FPGA device. One way to overcome this problem is to break up logic resources such as register arrays and integrate the pieces with the logic they control. A wide bus crossing FPGA boundaries is then no longer needed, which relaxes the constraint on the I/O count of the FPGA devices.
[Figure: Without Register Break-up / With Register Break-up]
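The effect of register break-up on I/O usage can be estimated with a small pin-count model. The sketch below is an illustration only: the module names, bus widths and the helper `io_pins_per_fpga` are assumptions, and each net that spans more than one FPGA is charged its full width on every FPGA it touches.

```python
def io_pins_per_fpga(nets, placement):
    """nets: list of (width, [modules]); placement: {module: fpga}.

    Returns {fpga: number of I/O pins consumed by inter-FPGA nets}.
    """
    pins = {f: 0 for f in set(placement.values())}
    for width, modules in nets:
        fpgas = {placement[m] for m in modules}
        if len(fpgas) > 1:          # the net crosses a chip boundary
            for f in fpgas:
                pins[f] += width
    return pins

# Without break-up: one central register array sits next to the host interface.
before = io_pins_per_fpga(
    nets=[(10, ["host_if", "reg_array"]),
          (64, ["reg_array", "dsp"]),
          (64, ["reg_array", "video"])],
    placement={"host_if": "A", "reg_array": "A", "dsp": "B", "video": "B"},
)

# With break-up: each register slice moves next to the logic it controls,
# so only the narrow write bus crosses the boundary.
after = io_pins_per_fpga(
    nets=[(10, ["host_if", "regs_dsp", "regs_video"]),
          (64, ["regs_dsp", "dsp"]),
          (64, ["regs_video", "video"])],
    placement={"host_if": "A", "regs_dsp": "B", "regs_video": "B",
               "dsp": "B", "video": "B"},
)
print(before["A"], after["A"])  # 128 10
```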
Logic Re-grouping: Logic re-grouping is another method of reducing the number of connections that cross FPGA boundaries. In this approach, the two FPGAs with the largest number of connections between them (the maximum global cut) are chosen, and modules are swapped between those two devices to reduce the number of global cuts. If the two FPGAs have limited resources, the modules can instead be re-grouped and implemented in a third FPGA device that has sufficient space left.
Logic Replication: Sometimes, to avoid a large number of interconnections between two FPGAs, the module that contributes the most interconnections is duplicated in both FPGA devices. This also helps to overcome timing problems arising from the high delay at inter-FPGA boundaries.
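One pass of this re-grouping step might look like the following sketch. It is a simplified illustration under stated assumptions: connections are given as pairwise signal counts, only one-for-one module swaps are tried, and FPGA capacity is ignored. The names `cut_between`, `total_cut` and `regroup_once` are hypothetical.

```python
from itertools import combinations, product

def cut_between(conn, placement, f1, f2):
    """Number of signals crossing between FPGAs f1 and f2."""
    return sum(w for (a, b), w in conn.items()
               if {placement[a], placement[b]} == {f1, f2})

def total_cut(conn, placement):
    """Number of signals crossing any FPGA boundary."""
    return sum(w for (a, b), w in conn.items() if placement[a] != placement[b])

def regroup_once(conn, placement):
    fpgas = sorted(set(placement.values()))
    # Choose the FPGA pair with the largest number of crossing signals.
    f1, f2 = max(combinations(fpgas, 2),
                 key=lambda pair: cut_between(conn, placement, *pair))
    best, best_cut = dict(placement), total_cut(conn, placement)
    mods1 = [m for m in placement if placement[m] == f1]
    mods2 = [m for m in placement if placement[m] == f2]
    # Try every one-for-one swap between the two FPGAs and keep the best.
    for m1, m2 in product(mods1, mods2):
        trial = dict(placement)
        trial[m1], trial[m2] = f2, f1
        cut = total_cut(conn, trial)
        if cut < best_cut:
            best, best_cut = trial, cut
    return best, best_cut

# Hypothetical design: five modules on three FPGAs.
conn = {("a", "b"): 2, ("a", "c"): 40, ("b", "d"): 40, ("c", "d"): 2, ("d", "e"): 1}
placement = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2}
new_placement, new_cut = regroup_once(conn, placement)
print(total_cut(conn, placement), new_cut)  # 81 5
```

A real partitioner would repeat such passes until no swap improves the cut and would reject swaps that overflow a device.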
[Figure: Without Logic Replication / With Logic Replication]
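A rough way to pick a replication candidate is to compare, for each module, the output signals that currently cross the boundary with the input signals that would have to cross instead once the module is copied. The sketch below assumes exactly this simplified cost model; the module names, widths and the function `best_replication_candidate` are illustrative, and the extra logic consumed by the copy is not accounted for.

```python
def best_replication_candidate(modules):
    """modules: {name: (input_width, crossing_output_width)}.

    Returns (name, pins_saved) for the module whose replication removes
    the most inter-FPGA signals, or (None, 0) if nothing is gained.
    """
    # After replication the outputs stay local, but the inputs must cross.
    savings = {name: out_w - in_w for name, (in_w, out_w) in modules.items()}
    name = max(savings, key=savings.get)
    return (name, savings[name]) if savings[name] > 0 else (None, 0)

# A 4-to-16 address decoder has narrow inputs and wide outputs,
# which makes it a natural candidate; the mux is the opposite case.
candidates = {"addr_decoder": (4, 16), "data_mux": (32, 8)}
print(best_replication_candidate(candidates))  # ('addr_decoder', 12)
```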
Timing closure is a big hurdle in multi-FPGA design because inter-FPGA delay is much larger than intra-FPGA delay. Some of the methods used by EDA tools to achieve timing closure in multi-FPGA designs are:
Use of Register Boundary as Partition Boundary: To avoid timing problems, partitioning between multiple FPGAs should be done at register boundaries, even if this sometimes means moving some logic from one module to another. Because the inter-FPGA delay, which is the dominant component, then appears on fewer paths together with combinational logic, the burden on the timing closure tool is reduced.
Duplicate Clock and Reset Logic: Spreading clock and reset paths across multiple FPGAs results in large skew, which makes the task of the timing analysis tool harder. To avoid this, the clock and reset needed by the logic inside an FPGA device can be generated internally, so that the timing tool only has to handle the skew arising from the FPGA's internal logic, which in all practical cases is much smaller than the delay between two FPGA chips.
Innovative Board Design: Multi-FPGA design, particularly in high-speed applications, demands innovative board design, because high-speed signals now cross chip boundaries and the challenges of high-speed interfaces must be considered during board design. Some generally accepted techniques are keeping each clock signal on a single layer, using differential rather than single-ended signaling, and minimizing parallel runs between high-speed signals.
Avoidance of Gated Clocks: Gated clocks make timing closure of a multi-FPGA system more complex because of the large skew introduced by inter-FPGA delay.
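The benefit of cutting at a register boundary can be seen from a simple delay budget for a path that crosses from one FPGA to another. The sketch below uses made-up delay values in nanoseconds and a hypothetical helper, `min_clock_period_ns`; it ignores clock skew and I/O buffer delays.

```python
def min_clock_period_ns(t_clk_to_q, t_logic_src, t_inter_fpga, t_logic_dst, t_setup):
    """Minimum clock period for a register-to-register path across two FPGAs."""
    return t_clk_to_q + t_logic_src + t_inter_fpga + t_logic_dst + t_setup

# Cut through combinational logic: logic on both sides adds to the board delay.
cut_in_logic = min_clock_period_ns(0.5, 3.0, 8.0, 2.5, 0.5)

# Cut at a register boundary: the signal leaves one register, crosses the
# board, and is captured directly by a register in the next FPGA.
cut_at_register = min_clock_period_ns(0.5, 0.0, 8.0, 0.0, 0.5)

print(cut_in_logic, cut_at_register)  # 14.5 9.0
print(round(1000 / cut_in_logic, 1), round(1000 / cut_at_register, 1))  # 69.0 111.1
```

With these assumed numbers, the achievable clock rises from about 69 MHz to about 111 MHz purely because no combinational logic shares the cycle with the inter-FPGA delay.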
[Figure: With Gated Clock / Without Gated Clock]
Role of Design Methodology
The design methodology for multi-FPGA design, particularly in high-speed applications, also needs to be different and must take the special requirements of multi-FPGA designs into account. For example:
Close Communication between System Designer and PCB Designer: A great deal of interaction is needed between the system designer and the PCB designer. Instead of the traditional methodology, in which PCB design starts only after the system has been developed and tested with various simulation techniques, PCB design and FPGA design should run in parallel, with continuous communication between the two teams. One example is FPGA I/O design and the placement of the FPGAs on the PCB. When both teams plan the I/O design and the FPGA placement together, important objectives can be achieved: meeting overall timing (both within the FPGAs and on the board), meeting PCB signal integrity constraints, and reducing the number of PCB layers and the PCB area.
Hardware Implementation of a Task: The RTL should be written so that each task is confined within a single module as far as possible. This increases the probability that the whole task executes inside one FPGA. Otherwise, extra handshaking signals are required between two FPGAs, and the handshaking logic also increases the latency of the operation.
Usage of High Level Verification Models: Adopting high-level verification techniques such as transaction-level modelling and cycle-accurate modelling is very important for multi-FPGA design. It allows the system to be simulated at a very early stage, when changes to the design can still be made easily.
I/O Multiplexing: Instead of confining I/O multiplexing to the primary pins, it should also be applied at the sub-module level wherever possible.
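The trade-off behind sub-module I/O multiplexing can be sketched as a time-division plan: several logical signals share one physical pin, at the cost of a lower effective update rate per signal. The function `tdm_plan` and the numbers below are assumptions for illustration, and the model ignores synchronization and framing overhead.

```python
import math

def tdm_plan(num_signals, available_pins, io_clock_mhz):
    """Return (mux_ratio, pins_used, effective_signal_rate_mhz)."""
    # Smallest number of signals per pin that fits the pin budget.
    ratio = math.ceil(num_signals / available_pins)
    pins_used = math.ceil(num_signals / ratio)
    # Each signal is updated once every `ratio` I/O clock cycles.
    return ratio, pins_used, io_clock_mhz / ratio

# 200 inter-FPGA signals, 64 free pins, 400 MHz I/O clock (all hypothetical).
print(tdm_plan(200, 64, 400.0))  # (4, 50, 100.0)
```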
About the Author
Shridhar Laddha is co-founder and Director (Engineering) at SoftJin Technologies. Shridhar has more than 13 years of engineering management experience in EDA tools, FPGAs and design IP. He has also worked for Silicon Automation Systems and Synopsys. His areas of expertise include FPGA-based system design, logic design tools, hardware description languages and FPGA synthesis tools. He has published several technical papers at international conferences. Shridhar holds a B.E. (Computer Science) from Pune University and an M.E. from the Department of Computer Science, Bombay University.
Barun Kumar De is Manager – Sales and Marketing at SoftJin Technologies. Barun has more than 6 years of experience in companies like Open-Silicon, Texas Instruments and Atrenta. His areas of expertise include: industry/ market/ customer analysis, development of collaterals, company website, driving branding/ PR strategy, management of sales reps, business network, managing sales closure, contract and other legal issues. Barun holds B.E. (Electronics) and MBA (IT Marketing).
SoftJin Technologies Pvt. Ltd. develops customized EDA tools for the specific requirements of semiconductor companies using a combination of EDA building blocks and R&D services. Programmable platforms such as FPGAs are a special focus area wherein SoftJin offers best-in-class FPGA Synthesis and Timing driven Routing engines that can be optimized for individual device architectures. If needed, SoftJin can provide the optimization service to develop end user tools. SoftJin’s Automated Test and Regression Environment and software lifecycle services enhance the quality of FPGA tools. SoftJin also offers IP design, customization and Reference design development services specially targeted for FPGAs and other Programmable Fabrics by reusing its IP building blocks in areas like Digital Signal Processing, Digital Image/ Video/ Audio Processing, Memory Controller and other peripherals etc. More information is available at:
The company's headquarters are located at Unit No: 102, Mobius Tower, SJR I - Park, EPIP, White Field, Bangalore – 560066, Tel: +91- 80- 41779999, E-mail: email@example.com The USA office is located at 2900 Gordon Ave, Suite 100-11, Santa Clara, CA 95051, Tel: (408) 773-1714, Email: firstname.lastname@example.org