Metric Driven Validation, Verification and Test of Embedded Software

Markus Winterholer, Cadence Design Systems
Feldkirchen, Germany

Abstract:

The complexity of embedded systems is steadily increasing. The growing number of components in a system, together with the increased communication and synchronization between them, requires reliable verification, validation and testing of each component as well as of the system as a whole. Given today's cost sensitivity, it is important to find errors as early as possible and to increase the degree of test automation, so that rising cost pressure does not lead to quality losses.

Test methods like static code analysis, memory analysis or unit tests offer a high degree of test automation. These techniques are not sufficient, however, when it comes to functional defects: states in which the application does not behave as specified. Finding this type of error is still mainly limited to code review, with little automation. The following presents a solution showing how metrics extracted from the specification can be used to increase test automation for complex embedded systems.

INTRODUCTION

Figure 1: From the idea to the final product.

Figure 1 shows an abstract development process of a complex embedded system. Within different development stages, the system is refined until the target system is realized. Software is often first compiled on a host system before it is compiled for the target system. To enable reuse of the test harness and to find bugs earlier, the test environment must provide enough flexibility to support various target systems.

To illustrate the complexity of embedded software verification, consider a simple code sequence: within a loop, the software functions read and write are called depending on the value of a variable f.

If the value of f can be changed at the beginning of each iteration by a hardware component or a process running in parallel, there are 2^n possible test sequences for n loop iterations. If each function must also be tested with different address ranges and variable data for each call, the complexity grows even further. If, in addition, there is a set of interrupts and the system must be tested with each possible interrupt occurring at every function call, the test space grows exponentially.
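The growth described above can be made concrete with a small sketch. It is not the original listing of the code sequence; it only models the loop in Python. The names call_sequences and count_with_interrupts, and the simplification that at most one interrupt hits each function call, are assumptions made for illustration.

```python
from itertools import product


def call_sequences(n):
    """Enumerate every order in which read/write can be called in n iterations."""
    return list(product(("read", "write"), repeat=n))


def count_with_interrupts(n, interrupts):
    """Count sequences when, at each call, one of `interrupts` sources or none fires.

    Assumption: at most one interrupt per function call.
    """
    choices_per_call = 2 * (interrupts + 1)  # read/write times (no irq or irq k)
    return choices_per_call ** n


print(len(call_sequences(3)))        # 8 sequences for n = 3, i.e. 2^3
print(count_with_interrupts(10, 4))  # 10 iterations, 4 interrupt sources
```

Even this simplified model shows why writing one directed test per sequence does not scale beyond very small n.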

The presented solution therefore concentrates on so-called functional corner cases, which are hard to cover with existing test methods like unit testing or static code analysis.

Functional requirements are described as metrics. This makes test quality measurable, so that high functional coverage can be reached in the time available for testing the system. During a test run on the chosen target, functional coverage is collected and back annotated to the metrics that describe test quality. Measuring test completeness allows the test engineer to concentrate on areas not covered by the current tests. To increase test coverage, each test can be varied with random data and control flow elements. This even reaches test scenarios that are not described in the specification but can occur when the software interface is used as specified, within its legal ranges. It avoids holes in the test environment that arise when only explicitly specified features are tested.

In a real-world automotive example, the brake system interfered with the audio signals carried by an amplitude modulated (AM) radio wave. The problem was not found during the system integration test because those tests had only specified frequency modulated (FM) radio waves to be tested in combination with the brake system tests. Test automation can help to avoid this kind of failure by generating test cases from a test description of all legal input values as well as the control and data dependencies of each component under test. To debug a failing test, the automated tests have to be reproducible, and test progress has to be measurable using reliable metrics for test quality.

RELATED WORK

Software verification and validation can be done statically using coding guidelines like MISRA C [1] or formal verification tools like Prevent [2]. While coding guidelines can be checked by a compiler, the use of formal tools requires some experience in writing properties to make the verification scalable. Dynamic verification is performed during the execution of the software and is usually called testing [3]. Directed tests, usually written in the same language as the software itself, execute the software and collect information such as code coverage or branch and decision coverage [4].

The presented approach uses a standardized testing language and collects functional and nonfunctional coverage, based on a tool first described in [5] [6]. How to combine dynamic testing with formal verification has been discussed in [7].

TESTING EMBEDDED SOFTWARE

A test environment for embedded systems must provide the following features:

Test description language that can cope with the complexity of embedded systems
Adapter to connect to different target systems
Verification planning supporting test quality metrics
Ability to measure the metrics covered by a test run

Figure 2 shows the iterative testing process used for testing complex systems. The test process starts with the definition of metrics to measure test quality and the specification of legal input ranges, defined as constraints for each software component. A verification plan (vPlan) is used to refine the textual representation of the requirements and constraints into a technical representation that can be used during the testing process. The vPlan is also used to measure test progress by back annotating the results of each test run to the vPlan. A set of tests implementing the test harness is derived from the vPlan. Reproducible randomization multiplies the test scenarios to achieve maximal functional coverage during the test run in a given amount of time. Back annotating the coverage results of multiple runs to the defined metrics allows the test engineer to analyze the quality of the test harness. Unmatched metrics can be improved by defining new test scenarios, running more tests or improving the constraint declaration of the legal input.
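The back annotation step of this flow can be sketched as follows. This is a minimal illustration in Python, not the vPlan format or tool described in the paper: the Metric class, the back_annotate function and the representation of a run as a dictionary of observed values are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Metric:
    """One hypothetical vPlan entry: a named goal and the items that must be hit."""
    name: str
    goal: frozenset
    hit: set = field(default_factory=set)

    def annotate(self, observed):
        """Back annotate one run; observations outside the goal are ignored."""
        self.hit |= set(observed) & self.goal

    @property
    def percent(self):
        return 100.0 * len(self.hit) / len(self.goal)


def back_annotate(vplan, runs):
    """Merge the coverage of several runs into the plan; return unmatched metrics."""
    for run in runs:
        for metric in vplan:
            metric.annotate(run.get(metric.name, ()))
    return [metric.name for metric in vplan if metric.percent < 100.0]


vplan = [Metric("function", frozenset({"read", "write"})),
         Metric("address_bucket", frozenset({"low", "high"}))]
runs = [{"function": ["read"], "address_bucket": ["low"]},
        {"function": ["write", "read"]}]
print(back_annotate(vplan, runs))  # ['address_bucket']
```

The returned list corresponds to the unmatched metrics mentioned above: here the two runs together cover both functions, but no run touched the "high" address bucket, so that metric needs a new scenario, more runs or a revised constraint.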

Figure 2: Automated software testing flow

TEST DESCRIPTION LANGUAGE

We propose the use of the IEEE 1647 verification language called “e”. This language was originally developed to describe hardware test benches. It provides concepts like aspect orientation, reproducible randomization and parallelism, which make it suitable for testing complex embedded software.

Parallelism: Concurrent events within a system have to be described and tested. This requires synchronization constructs and the possibility to describe timing.
Complex Datatypes: Reuse of test components and increased code maintainability.
Randomization: Variations of test cases can be combined into a single randomized test description in order to reduce the number of tests that have to be maintained manually. For the example with the loop and the functions read and write, this means that instead of writing 2^n tests there is only one test, which calls either read or write during each loop iteration. The parameter ranges can be randomized as well. Reproducibility is guaranteed by so-called seeds: if a test is restarted with the same seed, the test data and control flow are regenerated identically.
Constraints: To avoid generating illegal values, the randomization uses constraints that describe legal value ranges. In our example we can constrain a function call to the subset [read, write] of all available software functions, or constrain the address range to [0x0..0xFF].
Checks: The test language provides statements to compare results with expected results and to issue meaningful error messages that help to reproduce and debug the failure.
Coverage: Functional coverage is used to collect quality metric results during the test run. Events are used to trigger coverage collection automatically during the test run. In the example, an event is issued each time a software function is called, and this event is used to collect coverage information about the name of the function.

Coverage is expressed as a percentage value. Coverage is 0% if neither function was called, 50% if only one of the two functions was called, and 100% if there was at least one read and one write call.
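The percentage calculation can be illustrated with a short Python sketch. It is not e code and not the paper's original coverage definition; the ItemCoverage class and the on_function_called handler are hypothetical names that only mimic an event-triggered coverage item with one bucket per function name.

```python
class ItemCoverage:
    """Functional coverage over a fixed set of expected values (buckets)."""

    def __init__(self, buckets):
        self.hits = {bucket: 0 for bucket in buckets}

    def sample(self, value):
        """Record one observation; values outside the buckets are ignored."""
        if value in self.hits:
            self.hits[value] += 1

    def percent(self):
        covered = sum(1 for count in self.hits.values() if count > 0)
        return 100.0 * covered / len(self.hits)


function_cov = ItemCoverage(["read", "write"])


def on_function_called(name):
    """Hypothetical event handler, fired each time the software calls a function."""
    function_cov.sample(name)
```

Calling on_function_called("read") any number of times leaves the coverage at 50%; the first on_function_called("write") raises it to 100%.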

Coverage can also be defined on transitions between two coverage events. This can be used to make sure that all possible transitions between read and write calls from one iteration to the next have occurred.

In this case 100% coverage is achieved when all possible transitions have been taken (r-r, r-w, w-r, w-w). Coverage results can also be combined using cross coverage, for example by crossing the called function with the interrupt that occurred during the call.

In this case 100% coverage means that every possible interrupt was observed during the execution of both software functions.
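Transition and cross coverage can be sketched in the same spirit. The Python classes below are assumptions made for illustration, as are the interrupt names irq0 and irq1; they are not the e coverage constructs themselves. The goal of a transition item is every ordered pair of values, and the goal of a cross item is every combination of two items sampled at the same event.

```python
from itertools import product


class TransitionCoverage:
    """Covers ordered pairs (previous value, current value) of one item."""

    def __init__(self, values):
        self.goal = set(product(values, repeat=2))
        self.hit = set()
        self.previous = None

    def sample(self, value):
        if self.previous is not None:
            self.hit.add((self.previous, value))
        self.previous = value

    def percent(self):
        return 100.0 * len(self.hit & self.goal) / len(self.goal)


class CrossCoverage:
    """Covers every combination of two items sampled at the same event."""

    def __init__(self, first_values, second_values):
        self.goal = set(product(first_values, second_values))
        self.hit = set()

    def sample(self, first, second):
        self.hit.add((first, second))

    def percent(self):
        return 100.0 * len(self.hit & self.goal) / len(self.goal)


transitions = TransitionCoverage(["read", "write"])
for name in ["read", "read", "write", "write", "read"]:
    transitions.sample(name)
print(transitions.percent())  # 100.0: r-r, r-w, w-w and w-r were all taken

cross = CrossCoverage(["read", "write"], ["irq0", "irq1"])
cross.sample("read", "irq0")
print(cross.percent())        # 25.0: one of four function/interrupt pairs
```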

Using a distinct testing language has the advantage of separating test harness and implementation and of offering special language statements for testing. One disadvantage is the need to convert data types between the two languages. In addition, the interface between test harness and software under test has to be well defined. Data type conversion, software activation and control, as well as coverage collection and checking are handled in an exchangeable adapter.

TEST HARNESS ADAPTER CONCEPT

Each processor of the system is connected to the test harness by a so-called software adapter. The adapter controls the communication between the system and the test harness. Since the adapter interface itself is standardized, the test harness can be reused without change when the software runs on another target. Figure 3 shows a distributed system that executes software on four different processor implementations. Each processor is modeled differently.

Figure 3: Test environment.

Each adapter is aware of the software under test and converts the data types to be exchanged between the test harness and the embedded software. The test adapter controls the software by calling functions and writing variables. It also monitors the system behavior during the test to perform checks and collect coverage information. The communication between the software and the adapter is loosely coupled: software and test adapter run independently and are synchronized through shared memory. All test adapters in the system run in parallel but can be synchronized to communicate with other adapters in the system. Since each adapter runs independently, the execution speed of each processor can vary. In the example in Figure 3, one CPU is modeled in SystemC, while three others are represented by the RTL implementation. Two processors are accelerated using different hardware emulation solutions to improve the execution speed of the software.

This generic adapter approach allows connecting the testbench to any system that provides an API to access the system memory area. The connection can be implemented locally on the host machine or remotely over Ethernet or via a JTAG interface.
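A minimal sketch of such an adapter, assuming only that the target offers byte-wise read and write access to its memory, might look as follows in Python. The MemoryAccess and SoftwareAdapter classes, the symbol table layout and the variable names are hypothetical; a bytearray stands in for the real memory connection (host, Ethernet or JTAG), and the standard struct module performs the data type conversion.

```python
import struct


class MemoryAccess:
    """Assumed minimal target API: byte-wise read/write of system memory."""

    def __init__(self, size):
        self._mem = bytearray(size)  # stand-in for host, Ethernet or JTAG access

    def read(self, addr, length):
        return bytes(self._mem[addr:addr + length])

    def write(self, addr, data):
        self._mem[addr:addr + len(data)] = data


class SoftwareAdapter:
    """Converts test-harness values to the target's data layout and back."""

    def __init__(self, memory, symbols):
        self.memory = memory
        self.symbols = symbols  # variable name -> (address, struct format)

    def write_var(self, name, value):
        addr, fmt = self.symbols[name]
        self.memory.write(addr, struct.pack(fmt, value))

    def read_var(self, name):
        addr, fmt = self.symbols[name]
        raw = self.memory.read(addr, struct.calcsize(fmt))
        return struct.unpack(fmt, raw)[0]

    def check(self, name, expected):
        """Compare a target variable with its expected value and report clearly."""
        actual = self.read_var(name)
        assert actual == expected, f"{name}: expected {expected}, got {actual}"


# Hypothetical symbol table: an 8-bit flag f and a 32-bit little-endian counter.
adapter = SoftwareAdapter(MemoryAccess(16), {"f": (0, "<B"), "counter": (4, "<I")})
adapter.write_var("counter", 0x12345678)
adapter.check("counter", 0x12345678)
```

Because the test harness only talks to SoftwareAdapter, replacing MemoryAccess with a different transport would leave the tests themselves unchanged, which is the reuse argument made above.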

DYNAMIC TEST EXECUTION

Dynamic randomized test execution is used to generate on-the-fly variations of the test sequences derived from the vPlan. The test library contains an easy-to-maintain set of abstract test descriptions. These sequences are used together with the set of constraints to drive the software from the test environment.

Figure 4: Dynamic randomized test execution.

Dynamic test variations are possible during the test run since test environment and software can be synchronized from both sides. The test environment can control the test run before and after a function call. On the other hand, the software can call the test environment whenever the software is active.

This allows each adapter to change the test depending on the state of every other software adapter or the state of the whole system. Corner cases of the system can therefore be tested faster than with traditional directed testing methods, where concurrent tasks are difficult to control. With a given seed, the random test sequence can be reproduced as often as required to debug a failing test.
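Seed-based reproducibility under constraints can be sketched in a few lines of Python. This is an illustration only: the e language solves constraints declaratively, whereas the sketch simply samples from the legal sets. The function generate_sequence and the constants FUNCTIONS and ADDRESS_RANGE are assumed names, with the legal values taken from the read/write example.

```python
import random

FUNCTIONS = ("read", "write")  # constraint: legal subset of software functions
ADDRESS_RANGE = (0x00, 0xFF)   # constraint: legal address range, inclusive


def generate_sequence(seed, iterations):
    """Generate one randomized test sequence; equal seeds give equal sequences."""
    rng = random.Random(seed)  # private generator, independent of global state
    sequence = []
    for _ in range(iterations):
        function = rng.choice(FUNCTIONS)
        address = rng.randint(*ADDRESS_RANGE)
        sequence.append((function, address))
    return sequence


first = generate_sequence(seed=42, iterations=10)
rerun = generate_sequence(seed=42, iterations=10)
print(first == rerun)  # True: the failing run can be replayed exactly
```

Using a private random.Random instance per test, rather than the module-level generator, keeps the sequence reproducible even if other parts of the test environment also draw random numbers.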

Monitoring the software makes it possible to implement checks and to collect coverage in order to measure test progress and quality.

SUMMARY

The proposed solution makes it possible to automate finding functional bugs in a system as early as possible. The process helps to reduce test complexity for distributed embedded systems. Metric-driven verification, validation and test can be used in addition to existing software test methods like unit tests, code coverage or static analysis. It is also possible to combine the results of these techniques to obtain an overall view of test progress and test quality.

REFERENCES

[1] Motor Industry Software Reliability Association http://www.misra.org.uk

[2] Coverity Prevent http://www.coverity.com

[3] Spillner et al: Software Testing Foundations. Rocky Nook, 2007

[4] Gcov - Code coverage test for GCC compiler http://gcc.gnu.org

[5] Winterholer M.: Metric-Driven Functional Verification of Software for Embedded Systems; Embedded World Conference, Nürnberg, 2009

[6] J. Andrews: Co-Verification of Hardware and Software for ARM SoC Design. Newnes, 2005

[7] D. Lettnin et al: Coverage Driven Verification Applied to Embedded Software; IEEE Symposium on VLSI, Porto Alegre, 2007
