by Yoshihide Sugiura, Pacific Design Inc.
2-12-2 Shinyokohama, Yokohama-shi, Japan

Abstract:
VUPU is a configurable processor for ANSI-C entry. The PU is a basic RISC processor, and the VU is a variable-cycle data path unit that is designed by the user. The basic processor PU is provided as RTL source code. To design the VU, design tools (a compiler, an ISS, and a profiler) and an ANSI-C description style are also provided. Because VUPU offers not only the RTL source code and tools but also an ANSI-C description style, it can be understood as an ANSI-C entry design methodology. The application area of VUPU is real-time signal processing.

VUPU: Configurable Processor for C Level Design

1. Introduction
We would like to introduce our C-level hardware-software co-design methodology, which uses a small processor and a set of co-processors, together called VUPU.
VUPU converts an ANSI-C description into a software part running on the RISC core processing unit (PU) and a hardware part implemented as a variable data path unit (VU).
As shown in Figure 1, VUPU can be understood as a co-processor-type configurable processor, in which VU co-processors can implement not only arithmetic calculations but also quite elaborate computation functions such as a "looped sum of products" or an "FFT". One PU can control a maximum of 256 VUs, which the user can design to solve computational bottlenecks in the original C code.
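As an illustration of the kind of C code that might be moved into a VU, the following is a minimal sketch of a "looped sum of products" written as an ordinary C function. The function name, the 16-bit operand width, and the 32-bit accumulator are assumptions made for this example and are not taken from the VUPU tools.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Software reference for a "looped sum of products" (illustrative only).
 * On a plain RISC core each iteration costs loads, a multiply, an add,
 * and loop overhead, which is why such a loop is a natural candidate
 * for a user-designed VU. */
int32_t sum_of_products(const int16_t *a, const int16_t *b, size_t n)
{
    int32_t acc = 0;
    size_t i;

    assert(n == 0 || (a != NULL && b != NULL));
    for (i = 0; i < n; i++) {
        acc += (int32_t)a[i] * (int32_t)b[i];
    }
    return acc;
}
```

If profiling shows that such a loop dominates the cycle count, the same function can serve as the reference against which a hardware implementation is checked.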
Figure 1. Architecture of VUPU

2. VUPU Features
The user can assign a part of the arithmetic or a function that consumes a large number of clock cycles to a special-purpose hardware execution unit, i.e. a VU. A VU is linked with the PU, so it can be understood as a hardware function or a co-processing unit.
As shown in Figure 2, there are two kinds of VU, i.e. the arithmetic VU and the functional VU. The interface of an arithmetic VU is the general-purpose registers of the PU; in that case, it looks as if the RISC arithmetic instructions have been extended. In the case of a functional VU, which is invoked as a function call, the interface is the data RAM.
VUPU provides three kinds of data interface between the PU and a VU. For an arithmetic VU, the general-purpose register interface is used, and the PU compiler can process it automatically; we use the ACE/CoSy compiler, which supports this kind of external arithmetic instruction interface. For a functional VU, a data RAM interface is provided, and an inline-assembly call/return switches execution from the software function to the hardware function. The third interface between the PU and a VU is the "Move" instruction: a user can define 256 registers of 32 bits each in a VU, which can be read and written as the source and destination of the "Move" instruction.
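The "Move" interface described above can be pictured as a small register file owned by the VU. The following is only a software sketch of that idea: the type `vu_regfile_t` and the functions `vu_move_to` and `vu_move_from` are hypothetical names chosen for this example and are not part of the VUPU instruction set or tools. Only the size (256 registers of 32 bits) comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VU_NUM_REGS 256  /* the text states 32bit x 256 registers per VU */

/* Hypothetical software model of one VU's user-defined register file. */
typedef struct {
    uint32_t regs[VU_NUM_REGS];
} vu_regfile_t;

/* Models a "Move" with a VU register as destination (PU -> VU). */
void vu_move_to(vu_regfile_t *vu, uint8_t index, uint32_t value)
{
    assert(vu != NULL);
    vu->regs[index] = value;  /* an 8-bit index always stays within 0..255 */
}

/* Models a "Move" with a VU register as source (VU -> PU). */
uint32_t vu_move_from(const vu_regfile_t *vu, uint8_t index)
{
    assert(vu != NULL);
    return vu->regs[index];
}
```

Using an 8-bit index makes the 256-register limit part of the type, so no separate range check is needed in this sketch.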
Figure 2. VUPU Design Methodology

3. VUPU Tools and Description Style
Three kinds of tools are provided. The first is a C compiler; we use the ACE/CoSy compiler. The second is a cycle-accurate instruction set simulator (ISS). The third is a profiler that analyzes the cycle counts of the functions encountered during ISS simulation.
By using the profiler, a user can easily identify hot-spot functions in a top-down manner.
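The idea of locating a hot spot from per-function cycle counts can be sketched in a few lines of C. This is not the VUPU profiler or its output format; the record type `func_profile_t` and the function `find_hot_spot` are hypothetical and only illustrate the selection step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-function record, as a profiler might report it. */
typedef struct {
    const char *name;    /* function name */
    uint64_t    cycles;  /* cycles spent in this function */
} func_profile_t;

/* Returns the index of the entry with the most cycles
 * (the first such entry if several are tied). */
size_t find_hot_spot(const func_profile_t *p, size_t n)
{
    size_t i, hot = 0;

    assert(p != NULL && n > 0);
    for (i = 1; i < n; i++) {
        if (p[i].cycles > p[hot].cycles) {
            hot = i;
        }
    }
    return hot;
}
```

Applied level by level to a call tree, the same selection gives the top-down view mentioned in the text: pick the most expensive function, then repeat among its callees.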
The function model and the number of cycles of a VU can be defined in C/C++, and the model is built as a dynamic link library for the ISS. The function model of a VU can be described at the behavior level or on a cycle basis, and a cycle-based ANSI-C description style is also provided. Therefore, a user can design a VUPU starting from an ANSI-C algorithm without using RTL simulation.
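A behavior-level function model that also reports a cycle count might look like the sketch below. The actual interface between the ISS and a VU model is not described here, so everything in this example is an assumption: the names `vu_result_t` and `vu_sad_model`, the choice of a sum of absolute differences as the operation, and the cost of 2 setup cycles plus 1 cycle per element.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of one VU invocation: value plus consumed cycles. */
typedef struct {
    uint32_t value;
    uint32_t cycles;
} vu_result_t;

/* Behavior-level model of a sum-of-absolute-differences VU.
 * Assumed cost model: 2 setup cycles plus 1 cycle per element. */
vu_result_t vu_sad_model(const uint8_t *a, const uint8_t *b, size_t n)
{
    vu_result_t r;
    size_t i;

    assert(n == 0 || (a != NULL && b != NULL));
    r.value = 0;
    for (i = 0; i < n; i++) {
        r.value += (a[i] > b[i]) ? (uint32_t)(a[i] - b[i])
                                 : (uint32_t)(b[i] - a[i]);
    }
    r.cycles = 2u + (uint32_t)n;
    return r;
}
```

Because the cycle count depends on `n`, a model of this shape reflects the variable-cycle nature of a VU while remaining plain C that can be tested without RTL simulation.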
The VU could also be designed by introducing behavioral synthesis, but that approach still seems to be at a somewhat early stage.
The ISS is 10 to 1000 times faster than RTL simulation, depending on the function model of the VU. With this speed and accuracy, VUPU design time is almost 10 times shorter than with RTL design.

4. Size and Performance
Table 1. Size and Performance of VUPU

5. Position and Application of VUPU
Figure 3 shows the position of VUPU in the silicon implementation mapping.
Compared with hard-wired design, VUPU provides co-processor-oriented hardware-software co-design; compared with a DSP, it provides a user-definable co-processor method.
Actual applications of VUPU users include flat TVs, liquid crystal display panels, plasma display panels, digital video cameras, digital video disks, encryption/decryption, and WCDMA baseband.
All of these users start from a C algorithm and require an FPGA implementation before the ASIC.
As shown in Table 1, VUPU works on both FPGA and ASIC. This silicon-vendor-independent core and the ANSI-C design methodology without RTL simulation are well suited to real-time signal processing users.
Figure 3. Position of VUPU