Neha Srivastava and Aashish Mittal, Freescale Semiconductor
Embedded.com (November 4, 2012)
The Dhrystone 2.1 integer benchmark is a widely used performance benchmark. However, there is no well-defined official reference specification for it that details the exact procedures needed to run the program and validate its correct operation.
Most existing references either discuss how to calculate the Dhrystone number from the core instruction cycles consumed, or analyze the *.c (code) and *.h (header) components of the program being executed [1,2]. They often discuss the relative advantages and drawbacks of the metric compared to other standards, or offer little more than basic high-level descriptions of the program.
It's easy to get confused by the huge amount of data available, which makes it difficult to develop clear and unambiguous guidelines for something as simple as getting the Dhrystone benchmark up and running on a tester, much less getting it into a self-checking, result-signalling format that executes on bare-metal SoC silicon. In this context, 'bare metal' refers to a system environment without any type of kernel or operating system; the benchmark program must be wholly self-contained so that it can execute directly on the hardware. This is necessary at the start of an SoC design to establish correlations of expected integer performance with register transfer level (RTL) simulations.
This process often involves a testbench component, primarily a core instruction bus snooper, along with code simulation to track the number of core instruction cycles consumed while executing the iterations of the Dhrystone loop. The performance numbers are then derived by calculation from the instruction cycle count or, in other cases, the execution time and other metrics are gathered from the log files, where they are printed by printf calls incorporated into the code.
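The calculation from cycle counts can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the only fixed constant is the conventional baseline of 1757 Dhrystones per second for the VAX 11/780, which is defined as 1 DMIPS.

```c
#include <stdint.h>

/* Dhrystones per second of the VAX 11/780, the conventional 1-DMIPS baseline. */
#define VAX_DHRYSTONES_PER_SEC 1757.0

/* Average core cycles consumed by one Dhrystone loop iteration.
 * (Hypothetical helper; the caller must pass iterations > 0.) */
double cycles_per_iteration(uint64_t total_cycles, uint32_t iterations)
{
    return (double)total_cycles / (double)iterations;
}

/* DMIPS/MHz: iterations completed per second at a 1 MHz clock,
 * normalized to the VAX baseline. Returns 0.0 for empty input. */
double dmips_per_mhz(uint64_t total_cycles, uint32_t iterations)
{
    if (total_cycles == 0 || iterations == 0) {
        return 0.0;
    }
    return 1.0e6 / (cycles_per_iteration(total_cycles, iterations)
                    * VAX_DHRYSTONES_PER_SEC);
}

/* Absolute DMIPS at a given core clock frequency in MHz. */
double dmips(uint64_t total_cycles, uint32_t iterations, double core_mhz)
{
    return dmips_per_mhz(total_cycles, iterations) * core_mhz;
}
```

For example, a run that consumes 1,000,000 core cycles over 2,000 iterations averages 500 cycles per iteration, which works out to roughly 1.14 DMIPS/MHz.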
Despite these limitations, the Dhrystone benchmark provides a simple, easy-to-control program with a relatively short execution time per loop iteration. This is attractive in the silicon tester environment because it allows execution time to be easily measured. In the silicon tester environment described in this article, assertion and negation of package port pins can be used to signal various events. This replaces the conventional testbench core bus snoop logic used in an RTL simulation environment to extract performance metrics.
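A rough sketch of such pin-based signalling is shown below. Everything in it is an assumption made for illustration: the pin assignments and function names are hypothetical, and the GPIO output register is simulated with an ordinary variable so the sketch compiles anywhere. On real silicon it would be a volatile pointer to a device-specific memory-mapped register.

```c
#include <stdint.h>

/* ASSUMPTION: stand-in for a memory-mapped GPIO output register.
 * Replace with the device-specific register on real hardware. */
static volatile uint32_t sim_gpio_out;
#define GPIO_OUT (sim_gpio_out)

/* Hypothetical pin assignments. */
#define PIN_RUNNING (1u << 0)  /* high while the timed loop executes */
#define PIN_PASS    (1u << 1)  /* high if the self-check succeeded   */
#define PIN_FAIL    (1u << 2)  /* high if the self-check failed      */

static void pin_assert(uint32_t mask) { GPIO_OUT |= mask; }
static void pin_negate(uint32_t mask) { GPIO_OUT &= ~mask; }

/* Current pin state, as the tester would observe it. */
uint32_t gpio_read(void) { return GPIO_OUT; }

/* Call immediately before the benchmark loop: clear all status
 * pins, then raise PIN_RUNNING to mark the start of the run. */
void signal_start(void)
{
    pin_negate(PIN_RUNNING | PIN_PASS | PIN_FAIL);
    pin_assert(PIN_RUNNING);
}

/* Call immediately after the loop: drop PIN_RUNNING to mark the
 * end of the run, then report the self-check result. */
void signal_end(int self_check_ok)
{
    pin_negate(PIN_RUNNING);
    pin_assert(self_check_ok ? PIN_PASS : PIN_FAIL);
}
```

With this arrangement the tester can time the width of the PIN_RUNNING pulse to obtain the execution time, and sample PIN_PASS and PIN_FAIL afterwards to read the self-check result.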