We are witnessing an explosion of new applications for embedded microprocessors: everything from MP3 Internet-audio players to wireless phones with Web browsers. Although the general-purpose processors that have dominated the embedded market for two decades still have their place, off-the-shelf chips are often a compromise. The ideal processor for an industrial robot or antilock braking system is rarely the best solution for a talking toy or digital camera. No vendor can supply a range of standard parts broad enough to meet the growing diversity of new products.
That's why ASICs are becoming more popular. An application-specific chip that combines a microprocessor core, memory and peripherals on a single piece of silicon can dramatically improve performance, reduce power consumption, cut manufacturing costs and allow embedded-system developers to differentiate their products from those of their competitors. Configurable processor cores offer even more flexibility. ASIC developers can modify the CPU core itself, not just add memory and logic around a fixed CPU.
To fine-tune a configurable core for a specific application, a developer might alter the caches, add custom extensions to the base instruction set, create new registers, define new condition codes and make other changes that were formerly the exclusive privilege of CPU architects.
Malleable configurability
Indeed, configurable processors make embedded hardware as malleable as software. One challenge, however, is to make software-development tools that are equally malleable. If the software tools are not as configurable as the processors, developers will find that compilers, assemblers, linkers, debuggers and simulators don't recognize new opcodes and other custom features. To be effective, the modifications that ASIC developers make to the processor must extend all the way down the tool chain.
Ideally, software tools would automatically adapt to any modifications a developer makes to a configurable core, no matter how radical the changes. However, this ideal is almost impossible to achieve without true artificial intelligence. Consider the challenge of adapting a high-level compiler to custom CPU instructions. Despite decades of research and development, compiler writers still have trouble designing code generators that can match the efficiency of expertly written assembly language. That's true even when the compiler is targeting a conventional fixed instruction set. To automatically recognize a new custom instruction and know exactly when to apply it, a configurable compiler would have to be smarter than the world's best compiler writers. In effect, the compiler would have to generate its own optimizing code generator.
To meet this challenge, there are two general approaches to configurable tools for configurable processors. Both involve trade-offs, however.
One approach is to constrain the possible modifications to the configurable core so it's difficult or impossible to break the tool chain. Such constraints might take the form of more limited CPU configuration options, a proprietary modeling language that's more restrictive than industry-standard Verilog or VHDL, or an extra verification process in which the configurable-core vendor examines, tests and synthesizes the customer's design. This approach trades off some of the ASIC developer's flexibility and privacy for greater tool automation.
An alternative approach is to let ASIC developers make any modifications they wish to the CPU, even to the extent of giving them full access to the VHDL or Verilog model of the core.
As with all freedoms, however, this one comes with a bit of responsibility. In some cases the developer must tell the software tools what's going on by writing brief macros, compiler directives or DLLs. This is necessary because the core vendor cannot possibly anticipate all the modifications a creative developer might make to the core. The range of modifications is essentially infinite.
Although sometimes the core vendor can adapt the software tools to the developer's modifications, some developers want to protect their modifications as private intellectual property. Indeed, some ASIC developers have won patents for their core extensions, protecting them even from the scrutiny of the core vendor. Therefore, the second approach to making configurable tools trades off some automation for much greater design flexibility.
Deserving approaches
Each approach has merits. However, ARC Cores favors the second route. A major reason for choosing a configurable processor in the first place is to create a highly optimized ASIC without the restrictions of conventional CPU architectures. That's why ARC Cores leans toward greater flexibility for the ASIC developer, even if it means somewhat less tool-chain automation. As configurable technology improves, more automation can be built into the process without sacrificing flexibility.
No matter which approach a core vendor takes to designing configurable software tools, there are two primary goals. First, it should be as easy as possible for developers to configure the tools to work with the configurable CPU core. Second, the tools must provide feedback that helps programmers and ASIC developers fine-tune the core for the application. Remember that a key advantage of using a configurable core is that developers can optimize the hardware to fit the software, not just the other way around.
Without good feedback from simulators and related tools, ASIC developers won't know how effectively their custom instructions, cache alterations and other enhancements are removing critical bottlenecks.
To tackle the first challenge of making tool configuration as easy as possible, configurable-core vendors typically offer graphical design aids. ARC's version is a graphical tool called ARChitect that lets ASIC developers choose the most popular CPU extensions and configuration options by clicking on buttons and menus. Developers don't have to manually modify the underlying VHDL/Verilog model of the core.
The result of using ARChitect is register-transfer-level output and synthesis scripts for industry-standard EDA tools, such as Synopsys Design Compiler. After synthesis, developers can test the resulting gate-level netlist with standard cycle-accurate simulators, such as Cadence's Leapfrog or Model Technology's ModelSim, or with a PLD in the ARCangel prototyping system. As a result, it's possible to verify the design at every stage of development. Later, placement and routing tools generate a GDSII database for any digital process at any IC foundry.
Configurable help
ARChitect also helps configure the software tools. ARC's tools are from MetaWare (Santa Cruz, Calif.), a subsidiary that supplies tools for the ARC, ARM, StrongARM, XScale (StrongARM-2), MIPS, PowerPC and PicoJava architectures. All the standard CPU extensions available in ARChitect are reflected in a special macro file for MetaWare's High C/C++ compiler and assembler. These macros configure the tools for the standard extensions to the base-case ARC instruction set. MetaWare's High C/C++ compiler and assembler are smart enough to use the new instructions when generating executable code, allowing a smooth design flow.
For example, let's say an ASIC developer uses ARChitect to add a barrel shifter to the ARC microprocessor core. This design choice automatically adds five new instructions to the base-case ARC instruction set. The configuration file tells the MetaWare tools everything they need to know about the new instructions: the mnemonics; opcodes; number of operands for each instruction (in this case, three); whether the instructions are conditional (ARC instructions are conditional by default); and whether the instructions modify any condition codes (developers can define their own condition codes, and can make new and existing instructions conditional upon the new codes). All this information requires only one line for each instruction in the macro file. Here's an example for the arithmetic shift left (ASL) instruction:
.extInstruction asl,0x10,0x00,SUFFIX_COND|SUFFIX_FLAG,SYNTAX_3OP
When programmers write a C expression such as "x << 23," the High C/C++ compiler knows whether the processor has a barrel shifter and a new instruction called ASL. If it does, the compiler automatically uses the new instruction to shift the value in the 32-bit variable "x" to the left by 23 bits with a single machine instruction. Without the barrel shifter and ASL instruction, the same operation would require multiple instructions, possibly including some register operations to preserve the original value.
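The difference can be sketched in C. This is only an illustration, not ARC or MetaWare code: the function names are hypothetical, and the second function assumes a core that can shift by only one bit per instruction, which is one plausible way a multi-bit shift turns into multiple instructions.

```c
#include <assert.h>
#include <stdint.h>

/* With a barrel shifter: a single multi-bit shift, which a compiler
   can map to one instruction such as ASL. */
uint32_t shift_left_barrel(uint32_t x, unsigned n)
{
    assert(n < 32);   /* shifting a 32-bit value by 32 or more is undefined in C */
    return x << n;
}

/* Without a barrel shifter (assumption: only single-bit shifts are
   available): the same result takes n separate shift steps. */
uint32_t shift_left_single_bit(uint32_t x, unsigned n)
{
    assert(n < 32);
    while (n-- > 0) {
        x <<= 1;      /* one single-bit shift per iteration */
    }
    return x;
}
```

Both functions return the same value for any shift count below 32; only the amount of work differs. For "x << 23," the loop version performs 23 single-bit shifts.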
The tool-configuration process described above applies to the standard extensions available in ARChitect. Of course, no tool like ARChitect can anticipate the virtually infinite variations of custom instructions that talented developers might want. Limiting developers to the extensions available in ARChitect or constraining the kinds of extensions they can add would negate some advantages of using a configurable processor in the first place.
So, when developers create application-specific instructions using VHDL or Verilog, they also configure the software tools by writing brief assembler macros and compiler pragmas that declare intrinsic functions. The compiler translates each function call into a single ARC instruction. Developers can call the functions in their C/C++ programs, or use inline assembly macros.
For each new instruction, a pragma defines the mnemonic, the opcode, the number of operands, whether the instruction modifies any flags and whether the compiler can safely reschedule the instruction with respect to surrounding instructions. The following pragmas declare two intrinsic functions that will call custom instructions named EX2 and EX3:
extern int extract2(int,int);
extern int extract3(int,int);
Normally it would require 10 instructions including several messy bit-shifting operations to compress or decompress 12-bit Lempel-Ziv codes, but EX2 and EX3 can do the job with only four instructions. And they could execute in a single clock cycle, boosting performance by more than 100 percent. Custom instructions can execute in multiple cycles if necessary.
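To give a feel for the "messy bit-shifting" involved, here is a minimal sketch of handling 12-bit codes in plain C. The article does not define what EX2 and EX3 actually compute, so this is not their implementation. The function names pack12 and unpack12 and the byte layout (two codes in three bytes, most significant bits first) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Pack two 12-bit codes into three bytes, most significant bits first
   (the layout is an assumption for this sketch). */
void pack12(uint16_t a, uint16_t b, uint8_t out[3])
{
    assert(a < 4096 && b < 4096);                     /* codes must fit in 12 bits */
    out[0] = (uint8_t)(a >> 4);                       /* upper 8 bits of a */
    out[1] = (uint8_t)(((a & 0x0F) << 4) | (b >> 8)); /* low 4 bits of a, top 4 bits of b */
    out[2] = (uint8_t)(b & 0xFF);                     /* lower 8 bits of b */
}

/* Recover the two 12-bit codes from three packed bytes. */
void unpack12(const uint8_t in[3], uint16_t *a, uint16_t *b)
{
    *a = (uint16_t)((in[0] << 4) | (in[1] >> 4));
    *b = (uint16_t)(((in[1] & 0x0F) << 8) | in[2]);
}
```

Each line here expands into several shift, mask and OR operations on a processor without dedicated support. That kind of sequence is what a custom extract instruction is meant to collapse.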
Further down the tool chain, developers can set a few switches to tell MetaWare's SeeCode debugger and simulator about the standard extensions selected in ARChitect.
These are command-line switches if a developer launches the tools from that kind of environment, or the familiar Windows check boxes if a developer prefers to use a graphical front end such as StarBase's CodeWright. For custom extensions in Verilog or VHDL, DLLs tell the instruction-set simulator how to handle the new instructions. The DLLs aren't necessary when using the ARCangel prototyping system. In any case, the SeeCode debugger automatically adapts to the new instructions, condition codes and registers defined by the assembler directives.
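The general idea behind a simulator extension can be sketched as a table of handler functions indexed by opcode. This is a purely hypothetical interface, not the actual SeeCode DLL API, which the article does not describe. Every name here (ext_handler, sim_register_ext, sim_execute_ext, ext_asl) is invented for illustration, as is the choice to mask the shift count to five bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_EXT_OPCODES 64

/* Hypothetical handler type: computes the result of a three-operand
   extension instruction from its two source operands. */
typedef uint32_t (*ext_handler)(uint32_t src1, uint32_t src2);

static ext_handler ext_table[MAX_EXT_OPCODES];

/* An extension library would call this to register a handler.
   Returns 0 on success, -1 on a bad opcode or null handler. */
int sim_register_ext(unsigned opcode, ext_handler fn)
{
    if (opcode >= MAX_EXT_OPCODES || fn == NULL) {
        return -1;
    }
    ext_table[opcode] = fn;
    return 0;
}

/* The simulator core would call this on decoding an extension opcode.
   Returns 0 on success, -1 if no handler is registered. */
int sim_execute_ext(unsigned opcode, uint32_t src1, uint32_t src2,
                    uint32_t *dest)
{
    assert(dest != NULL);
    if (opcode >= MAX_EXT_OPCODES || ext_table[opcode] == NULL) {
        return -1;
    }
    *dest = ext_table[opcode](src1, src2);
    return 0;
}

/* Example handler: shift left by src2 bits (count masked to 0..31,
   an assumption of this sketch). */
uint32_t ext_asl(uint32_t src1, uint32_t src2)
{
    return src1 << (src2 & 31);
}
```

In this sketch, registering ext_asl under opcode 0x10 lets the simulator execute the instruction without knowing anything about it in advance. That is the role the article assigns to the DLLs.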
The Holy Grail of configurable tools is a complete tool chain (compiler, assembler, linker, debugger and simulator) that automatically morphs itself to accommodate any custom extensions the ASIC developer dreams up. Programmers could work exclusively in a high-level language and trust the compiler to automatically use any relevant extensions when generating executable code. The tool chain would offer this automation without limiting the developer's flexibility in any way. Further, the debugger and simulator would close the feedback loop and allow the ASIC developers and programmers to work toward a fully optimized hardware/software solution.
Nobody can offer such an ideal tool chain today. However, the few missing links haven't stopped dozens of ARC customers from taking advantage of the unique capabilities of configurable processors and configurable tools, capabilities that standard parts with fixed architectures cannot match.