Hybrid approach to early Validation and SW bring-up

The evolution of technology in line with Moore's law has made it possible to pack a billion transistors on a single chip. This evolution in HW manufacturing processes has also changed the basic premise on which the architectures of previous chip generations were conceived. A sizable share of the chips manufactured today are SoCs (systems on chip), which can be considered self-sufficient systems to an extent, by virtue of the intelligence running on the onboard processor in the form of native software.

SoC development in a way reflects the advancement of the complete design ecosystem. HW design, SW design, systems design and so on used to resemble disjoint sets; now they resemble intersecting sets, in which HW design, SW design and systems engineering are interdependent to a large extent. The increase in complexity and market dynamics has also created a need to reduce system cycle time: a new phone, TV or computer used to be released once every other year, whereas now it is a ritual almost every other month.

Figure 1: Product development flow

Fierce competition has fueled innovation as well as the demand for reliable products, which essentially implies much more testing and longer cycle times because of the sheer complexity of today's systems. Reliable products require reliable design, i.e. HW as well as software, at the system level. A complex system can be a superset of sub-systems based on multiple SoCs; today's automobiles, for example, have multiple sub-systems working in tandem. This brings the focus back to the importance of every component sub-system being reliable, having undergone a thorough standalone validation and qualification process.

Competition in the marketplace has resulted in strict timelines for complete product development, leaving very little room for time-consuming pre/post-Si validation of sub-system components. These are extraordinary times, which demand revolutionary approaches to achieve the silicon and system quality needed to succeed.

The Problem:

It has been observed that the conventional approach of "exhaustive pre-silicon verification and post-silicon validation" is inefficient and weak in dealing with the complexity and timelines of current-generation multi-million-gate SoCs. The situation will get tougher with every subsequent generation as complexity increases further.

The conventional methodologies of exhaustive simulation-based verification followed by exhaustive post-silicon validation are becoming impractical for the following reasons:

Simulation run times for multi-million-gate designs can run into days for a single test scenario.
The number of test patterns required to cover all the functionality increases exponentially with the complexity of the system.
Software dependence of the system requires HW/SW interoperability tests, which are too slow to run in simulation and too late if run on silicon.
The increased complexity of the system has made validation software as complex as system/application software, and its development cannot be deferred until silicon is available.
The advancement in process has also increased the NRE cost of silicon, so avoiding every avoidable silicon re-spin helps the bottom line of the organization.
A large number of sub-systems depend on third-party vendors to create patches for debuggers, flash programmers etc., and this work cannot be deferred until silicon is available.

Certain options available for targeting the issues above are listed below. However, each still has shortcomings and would not qualify as a holistic solution to these problems.

Accelerated Simulations: Simulations can be accelerated using acceleration hardware (GPU-based accelerators), but the speedup depends on the type of design and would still not be fast enough to be viable for software development. Emulation boxes can possibly be used to run accelerated simulations, but this requires the simulation infrastructure to be SCE-MI compliant to gain the advantage. In most cases this is impractical, because the legacy infrastructure (IPs, VIPs etc.) used in development cannot be replaced in an instant.
In-circuit Emulation: Emulation boxes can be used to emulate the hardware design and run native software on top of it along with debuggers. The emulation model can be brought up in a couple of weeks, which gives the advantage of early availability of HW behavior. Unfortunately, for real-life SoC designs the emulator speed is only in the kHz range, which makes heavy software debugging extremely painful.
FPGA-based emulation/prototyping: FPGA-based prototypes with speeds in the MHz range can be an effective tool for validation/application software development. But the large upfront bring-up time (1-2 months) is a hindrance: with shrinking SoC cycle times, the advantage of a fast-running platform is largely lost. The effort to bring up the platform is considerable, but there is not sufficient time left to use it for validation/application software development, resulting in a low ROI for the approach.

Table 1: Summary comparison of different approaches

S.No.	Parameter	Accelerated Simulations	In-circuit Emulation	FPGA-based Emulation/Prototyping
1	Bring-up time	Fast	1-2 weeks initially	1-2 months initially
2	Partitioning	Not required	Not required	Might be required (extra effort)
3	Debug visibility	Excellent	Good	Poor
4	Speed	Very slow (Hz)	Slow (kHz)	Fast (MHz)
5	Design changes	Minimal	Only synthesizable models acceptable	Only synthesizable models acceptable, along with restrictions on resources, e.g. clocking, memory etc.
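
To put the speed row of Table 1 in perspective, the sketch below estimates the wall-clock time needed to execute a fixed number of design clock cycles on each platform. The frequencies (100 Hz, 500 kHz, 20 MHz) and the one-billion-cycle workload are illustrative assumptions chosen only to match the orders of magnitude in the table; they are not measurements.

```python
# Illustrative platform clock rates in Hz. These are assumptions picked to
# match the orders of magnitude in Table 1 (Hz, kHz, MHz), not measured data.
ASSUMED_SPEED_HZ = {
    "accelerated simulation": 100.0,
    "in-circuit emulation": 500e3,
    "FPGA prototype": 20e6,
}


def wall_clock_seconds(cycles: int, speed_hz: float) -> float:
    """Return the time in seconds needed to run `cycles` design clock cycles."""
    if speed_hz <= 0:
        raise ValueError("speed_hz must be positive")
    return cycles / speed_hz


def format_duration(seconds: float) -> str:
    """Render a duration using the largest convenient unit."""
    for unit, size in (("days", 86400.0), ("hours", 3600.0), ("minutes", 60.0)):
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.1f} seconds"


if __name__ == "__main__":
    workload_cycles = 1_000_000_000  # hypothetical software test length
    for platform, speed in ASSUMED_SPEED_HZ.items():
        duration = wall_clock_seconds(workload_cycles, speed)
        print(f"{platform:24s} {format_duration(duration)}")
```

Under these assumed numbers, the same workload takes months in simulation, about half an hour on an emulator and under a minute on an FPGA prototype, which is why software development gravitates towards the faster platforms.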

Figure 2: Conventional SoC development Flow timeline

Figure 3: Accelerated Simulation based flow timeline

Proposed Approach:

Approaches based on emulation boxes and on FPGA platforms each have their pros and cons. Organizations might pursue either approach for a focused development effort, which can turn out to be a costly compromise because neither covers the complete SoC development cycle as expected.

Even when organizations deploy both approaches, they use them in a disjoint way, which deprives them of the true advantage that could be harvested from the two flows together.

We propose a hybrid approach that addresses the pain points of the emulator-based approach with the FPGA-based approach and vice versa. This harvests the true potential of both approaches, working in tandem and reinforcing each other, and serves the end goal of validation, verification, SW enablement and interoperability testing well before silicon is available.

Since the approach is based on using an emulator and FPGA prototypes in tandem, we assume the user has access to both simultaneously and has a basic understanding of emulation and prototyping flows.

The Approach:

To solve a problem we first need to define it. We refer to Table 1 to elaborate on the pain points and provide a possible way to circumvent each issue.

1. Bring up time

Issue:

Initial bring-up of an emulation platform is governed by two factors, what you know and what you have: what you know about the design, and the information and resources available to accomplish the porting effort.

Table 2 categorizes the effort required to emulate a design from an RTL porting perspective; this is where "what you know" becomes important.

Anyone who has ever been part of a prototyping effort would agree that this initial phase is very frustrating because of the way the tools behave. Sometimes it takes a couple of weeks just to get the RTL through the elaboration phase of the FPGA synthesis tools. Unfortunately, in real-world designs the information required (Table 2) is not available up front to an emulation engineer and has to be discovered partly through multiple iterations of design compilation.

Table 2: RTL Porting Effort

S.No.	Parameter	FPGA-based Prototyping	Emulator-based
1	Clock	Design the clock generation module using FPGA components. Understand the clocking structure for optimized clock routing. Constrain the design with realistic scaled-down frequency ratios.	Identify the clock drivers and define them as clocks within a script with the desired frequency
2	Standard cell library components	Model the functionality of the component with a behavioral description
3	Memory	Generate memory models using FPGA-based components, or generate a synthesizable behavioral description using simple scripts	A synthesizable behavioral description of the memories can be generated using scripts
4	ROM code	ROM code needs to be converted	Directly portable
5	Partitioning	Required if resource requirements exceed one FPGA device. This adds to the complexity and stabilization time of the prototype.	No partitioning required
6	Analog blocks	Stubbed out	Stubbed out
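
Row 3 of Table 2 notes that synthesizable behavioral memory models can be generated with simple scripts. A minimal sketch of such a script is shown below. The module name, the port names and the single-port, synchronous-read template are assumptions made for illustration; a real design would have to match the interface of the memory being replaced.

```python
# Verilog template for a single-port RAM with synchronous read.
# The port list (clk, we, addr, wdata, rdata) is a hypothetical interface.
RAM_TEMPLATE = """\
// Auto-generated behavioral memory model (illustrative sketch)
module {name} (
    input  wire clk,
    input  wire we,
    input  wire [{addr_msb}:0] addr,
    input  wire [{data_msb}:0] wdata,
    output reg  [{data_msb}:0] rdata
);
    reg [{data_msb}:0] mem [0:{last_index}];

    always @(posedge clk) begin
        if (we)
            mem[addr] <= wdata;
        rdata <= mem[addr];
    end
endmodule
"""


def generate_ram_model(name: str, addr_width: int, data_width: int) -> str:
    """Return Verilog source text for a RAM of 2**addr_width words."""
    if addr_width < 1 or data_width < 1:
        raise ValueError("addr_width and data_width must be at least 1")
    return RAM_TEMPLATE.format(
        name=name,
        addr_msb=addr_width - 1,
        data_msb=data_width - 1,
        last_index=2 ** addr_width - 1,
    )


if __name__ == "__main__":
    # Hypothetical 1K x 32 memory instance.
    print(generate_ram_model("sram_1kx32", addr_width=10, data_width=32))
```

Keeping the model purely behavioral, rather than instantiating a vendor primitive, is what lets the same generated file be compiled by both the emulator and the FPGA synthesis flow.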

Solution:

Run times of the RTL compiler for an emulator are very small compared to FPGA synthesis, because the emulator compiler does not try to optimize but only checks for syntax and elaboration issues. This helps immensely in identifying components that cannot be supported and would probably need to be modified or removed. For example, compilation on an emulator compiler takes minutes, while an FPGA compilation tool takes hours to compile and report errors, if there are any, and it reports them one at a time.

The approach has to be "emulation with the target of prototyping", i.e. all the effort put into making the design usable on an emulator should be made with the intent of being reusable on the FPGA platform. For example, do not use black boxes when you want a module to be stubbed out: the emulation compiler would tie the outputs to logic "0", but the FPGA synthesis tool would infer a black box.
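
One way to follow the advice on stubbing is to replace the module with an explicit stub whose outputs are driven to logic 0, so that the emulation compiler and the FPGA synthesis tool see the same behavior. The sketch below generates such a stub in Verilog; the module and port names in the usage example are hypothetical.

```python
def _bit_range(width: int) -> str:
    """Return a Verilog range such as '[7:0] ', or '' for a 1-bit port."""
    return "" if width == 1 else f"[{width - 1}:0] "


def generate_stub(name: str, inputs: dict, outputs: dict) -> str:
    """Return Verilog for a stub module with every output tied to logic 0.

    `inputs` and `outputs` map port names to bit widths.
    """
    ports = [f"    input  wire {_bit_range(w)}{p}" for p, w in inputs.items()]
    ports += [f"    output wire {_bit_range(w)}{p}" for p, w in outputs.items()]
    # Drive each output explicitly instead of leaving the module empty.
    assigns = [f"    assign {p} = {w}'b0;" for p, w in outputs.items()]
    lines = [f"module {name} (", ",\n".join(ports), ");"]
    lines += assigns
    lines.append("endmodule")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical analog block that is stubbed out for pre-silicon platforms.
    print(generate_stub(
        "analog_pll",
        inputs={"clk_in": 1, "cfg": 8},
        outputs={"clk_out": 1, "status": 4},
    ))
```

Because the zero values are written out in the source, neither tool has to fall back on its own default handling of an empty or missing module.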

All the changes that need to be made in the design should target FPGA implementation. This automatically covers the emulation requirements, since the emulator accepts a superset of what the FPGA flow accepts (Table 1).

Once the design is up and running on the emulator, it should, if done right, be synthesizable as is for FPGA, saving the immense pain, frustration and time of handling FPGA synthesis errors one at a time.

2. Partitioning

Issue:

Partitioning is transparent to the user in an emulation flow, but it can be the most challenging aspect of an FPGA-based flow. Multiple partitioning iterations are a harsh reality that a prototyping engineer has to face before reaching the right partitioning for the design.

Although automated partitioning tools are available from a few EDA vendors, it is still a challenge to verify that the design works correctly after partitioning. Most of the time the uncertainty is introduced by the pin-multiplexing logic added during the partitioning process.

Solution:

Creating a wrapper around the top module and using the partitioned files for implementation in the emulation flow provides a fast way to verify the partitioned logic (typically for POR behavior), and possibly connectivity with the debugger, for enhanced confidence in the post-partitioned design.

It also paves the way for faster debug if an issue is found in the post-partitioned design. In addition, it establishes a path for validating netlists generated at different stages of the flow if required, e.g. when post-partitioning behavior differs from post-PnR behavior because of some anomaly in the flow.

3. Debug Visibility

Issue:

Emulators are inherently slow but provide the great advantage of debug visibility. The user can potentially probe all the signals available in the design, but the low speed makes it painful to reach the right trigger condition to reproduce a scenario. For example, a failure scenario occurred 41 hours after starting a run on the emulator, while it took only 20 minutes to replicate the failure.

FPGAs, on the other hand, do not provide the same kind of debug visibility as emulators, owing to limited resources.

Solution:

The debug visibility of emulators can be used as an asset while debugging the FPGA platform. In the instance described above, the user can identify the trigger using the FPGA platform and then use that trigger on the emulator to create a waveform dump of the signal states around the trigger event for the entire design in one run. This enables rapid root-causing, because the complete design state is available to the design team for deriving a conclusion.
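
The run times quoted in the issue above give a feel for the speed gap between the two platforms. The short calculation below works out the implied ratio, under the assumption that the 20-minute replication was performed on an FPGA platform running the same scenario.

```python
def implied_speedup(emulator_hours: float, fpga_minutes: float) -> float:
    """Ratio of emulator run time to FPGA run time for the same scenario."""
    if fpga_minutes <= 0:
        raise ValueError("fpga_minutes must be positive")
    return (emulator_hours * 60.0) / fpga_minutes


if __name__ == "__main__":
    # 41 hours on the emulator versus an assumed 20 minutes on the FPGA.
    print(f"{implied_speedup(41, 20):.0f}x")
```

With these figures the ratio is 123, so every trigger-hunting iteration moved from the emulator to the FPGA saves almost two days of waiting.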

Using the emulation and FPGA flows in tandem also lets the team try out a possible design change that would circumvent the issue in emulation before moving it to the FPGA platform. As the number of trials needed to fix the issue increases, so does the advantage of this approach.

4. Speed

The true potential of pre-Si validation can be harvested only if a design platform is available well before tape-out, enabling the development of basic validation software. In the process, the infrastructure required by complex validation and application software (e.g. debugging tools, patches, peripherals etc.) is taken care of.

This streamlining of the flow ensures that sub-system bring-up on the comparatively faster FPGA platform is just a procedure, and so is running the complex part of HW-SW interoperability testing, which stresses the design for robustness. This is in line with the goal of making each piece of the system very reliable, reducing the validation time required at the full system level.

Conclusion:

Figure 4: Proposed Hybrid approach timeline

Emulators are slow and FPGA-based prototypes are fast. Not much can be done to increase the speed of emulators, but the head start in design bring-up that an emulator provides can be an immense advantage.

Using emulators and FPGA prototypes in isolation has been the trend, as their DNA was considered to be different. We have tried to highlight the similarity between the two approaches and how they can be amalgamated to compensate for each other's weaknesses and reinforce each other's advantages, ensuring reduced SoC validation cycle times, better silicon quality and early availability of a pre-Si platform for SW development, as depicted in the diagram above.

Legend:

Acronym	Explanation
SoC	System on Chip
FPGA	Field Programmable Gate Array
HW	Hardware
SW	Software
SCE-MI	Standard Co-Emulation Modeling Interface
GPU	Graphics Processing Unit
IP	Intellectual Property
ROI	Return on Investment
EDA	Electronic Design Automation
POR	Power-on Reset
PnR	Place and Route

Authors:

Devendra Singh is a Design Lead with Freescale Semiconductor. He holds a bachelor's degree in Electronics and Telecommunication from the Army Institute of Technology, Pune. He likes to work on emulation, FPGA prototyping and design validation. He can be reached at Devendra.singh at freescale.com.

Amit Garg is a design engineer at Freescale Semiconductor with two years of experience. He holds a bachelor's degree in Electronics and Telecommunication from the National Institute of Technology, Kurukshetra. He is responsible for emulation, FPGA prototyping and design validation. He can be reached at amit.garg at freescale.com.
