UK intellectual property (IP) supplier DCT has found a way to speed up Java processing on ARC International's processor core without adding any specific Java instructions to the architecture.
The companies have cut a deal that will see ARC market DCT's Java accelerator as part of the ARCtangent-A4 core, with the two partners splitting royalties.
The software for the DCT-ARC Java processor is based on the open-source Kaffe software. The accelerator adds about 5000 logic gates to the core.
DCT's work takes advantage of the way that the ARCtangent handles registers to speed up Java virtual machine instructions without creating new instructions.
Matt Kubiczek, chief technology officer and founder of DCT, said: "It is a combinatorial bit of hardware between the instruction fetch and operand fetch parts of the pipeline. There is no extra pipeline stage. Other techniques, such as [ARM Holdings'] Jazelle, add a pipeline stage."
With the DCT hardware, the ARC core does not run Java byte codes directly. Instead, the hardware is tuned to speed up operations that manipulate data sitting on a stack.
The JVM is based on a stack architecture, but most RISC processors use a bank of directly addressable registers, not a stack. The DCT hardware allocates an extra set of registers in the ARC processor that are addressed as a stack. Any instruction that accesses those registers is assumed to work in stack mode, in contrast to its behaviour when used on the main register file.
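The difference between the two addressing modes can be pictured with a toy model. This is only a sketch: the class name, the register counts and the method names are assumptions made for illustration, and none of them comes from the ARCtangent-A4 or DCT designs.

```python
# Toy model only: register counts and method names are assumptions,
# not details of the real ARCtangent-A4 core or the DCT hardware.

class ToyCore:
    def __init__(self, n_main=8, n_stack=4):
        self.main = [0] * n_main         # ordinary, directly addressed registers
        self.stack_regs = [0] * n_stack  # extra bank that is addressed as a stack
        self.depth = 0                   # number of live entries in the stack bank

    def push(self, value):
        if self.depth == len(self.stack_regs):
            raise OverflowError("stack bank full (this toy does not spill)")
        self.stack_regs[self.depth] = value
        self.depth += 1

    def pop(self):
        if self.depth == 0:
            raise IndexError("stack bank empty")
        self.depth -= 1
        return self.stack_regs[self.depth]

    def add_main(self, dst, a, b):
        # Register mode: all three operands are named explicitly.
        self.main[dst] = self.main[a] + self.main[b]

    def add_stack(self):
        # Stack mode: the same add, but the operands are implicit.
        # Two values are popped and the result is pushed back.
        b = self.pop()
        a = self.pop()
        self.push(a + b)


core = ToyCore()

# Register-mode add: main[0] = main[1] + main[2]
core.main[1], core.main[2] = 2, 3
core.add_main(0, 1, 2)

# Stack-mode add, as a JVM "iadd" would expect it
core.push(2)
core.push(3)
core.add_stack()
```

In the stack-mode case the instruction carries no operand numbers, which is why a stack-oriented byte code such as `iadd` can map onto it so directly.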
Because the processor does not execute Java byte codes directly, they have to be converted into ARC instructions with modified behaviour.
"The translation is performed by the class loader into simple augmented ARC instructions or into threaded calls or subroutines," said Kubiczek.
The more complex byte codes, such as those that load an object, are translated into subroutines. Simpler arithmetic byte codes can be converted into a single ARC instruction or reduced to one part of an instruction.
The translation can happen at runtime, for downloaded classes, or the code can be translated in advance, stored in ROM and run directly.
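A one-off translation of this kind can be sketched as a table look-up performed when a class is loaded. The byte code names are real JVM mnemonics, but everything on the output side is hypothetical: the `add.s` style mnemonics and the `jvm_*` subroutine names are invented for illustration and are not real ARC encodings or DCT routines.

```python
# Hypothetical tables: the target mnemonics and subroutine names are
# invented for illustration and are not real ARC or DCT identifiers.
SIMPLE = {
    "iadd": "add.s",   # simple arithmetic: one stack-mode instruction
    "isub": "sub.s",
    "iand": "and.s",
}
COMPLEX = {
    "new": "jvm_new",                      # complex byte codes become
    "invokevirtual": "jvm_invokevirtual",  # calls to helper subroutines
}


def translate_method(bytecodes):
    """Translate a method once, at load time; the output is then run as-is."""
    out = []
    for op in bytecodes:
        if op in SIMPLE:
            out.append(SIMPLE[op])
        elif op in COMPLEX:
            out.append("call " + COMPLEX[op])
        else:
            raise ValueError(f"no translation for {op!r}")
    return out


translated = translate_method(["iadd", "new", "isub"])
print(translated)
```

Because the look-up runs once per method rather than each time a byte code executes, the same output could equally be produced offline and placed in ROM.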
"There is no dynamic translation as in the Nazomi approach. The overhead is the look-up needed to translate the JVM byte codes [when the class is loaded]," said Kubiczek.
As the register file is limited in size, the DCT accelerator deals with deep stack operations by flushing data out to main memory and reloading it when it is needed. The procedure is similar to that performed by the Sun SPARC processor, which makes extensive use of register 'windowing'.
But the DCT engine dynamically remaps registers so that the top of the stack can move to any point within the register file.
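The flush-and-reload scheme with a movable top of stack can be sketched as a circular register file. This is a minimal model under stated assumptions: the class, its method names and the four-register size are hypothetical, and the way `prologue` makes room is a guess at one plausible policy, not DCT's published design. It relies only on the fact that a Java class file records each method's maximum operand stack depth.

```python
# Sketch only: names, sizes and the spill policy are assumptions.

class RegisterStack:
    def __init__(self, n_regs=4):
        self.regs = [None] * n_regs
        self.base = 0      # register index of the oldest in-register entry
        self.live = 0      # number of entries currently held in registers
        self.memory = []   # entries flushed to main memory, oldest first

    def _spill_oldest(self):
        # Flush the oldest register entry to memory and free its slot.
        self.memory.append(self.regs[self.base])
        self.base = (self.base + 1) % len(self.regs)
        self.live -= 1

    def _fill_one(self):
        # Reload the most recently flushed entry just below the oldest one.
        self.base = (self.base - 1) % len(self.regs)
        self.regs[self.base] = self.memory.pop()
        self.live += 1

    def prologue(self, max_stack):
        # Make room for a method that declares it needs max_stack slots.
        if max_stack > len(self.regs):
            raise ValueError("method needs more slots than there are registers")
        while len(self.regs) - self.live < max_stack:
            self._spill_oldest()

    def epilogue(self):
        # Reload flushed entries while free registers remain.
        while self.memory and self.live < len(self.regs):
            self._fill_one()

    def push(self, value):
        if self.live == len(self.regs):
            self._spill_oldest()
        # The top of stack wraps around, so it can sit at any register.
        self.regs[(self.base + self.live) % len(self.regs)] = value
        self.live += 1

    def pop(self):
        if self.live == 0:
            if not self.memory:
                raise IndexError("stack empty")
            self._fill_one()
        self.live -= 1
        return self.regs[(self.base + self.live) % len(self.regs)]


rs = RegisterStack(n_regs=4)
for v in (1, 2, 3):                    # caller leaves three values on the stack
    rs.push(v)

rs.prologue(max_stack=2)               # callee declares a depth of two
spilled_during_call = list(rs.memory)  # the oldest caller value was flushed

rs.push(10)
rs.push(20)
rs.push(rs.pop() + rs.pop())           # callee computes 10 + 20
rs.epilogue()                          # flushed value is reloaded
```

In this model the wrap-around index plays the role of the register remapping: nothing is copied between registers when the stack grows or shrinks, only the position of the top changes.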
"The good thing about Java is that every method declares its maximum stack usage. The manipulation can be done by method prologue and epilogue code," said Kubiczek.