Real-time OS takes 'time' seriously
By Steve Furr, Senior Software Engineer, QNX Software Systems Ltd., Kanata, Ontario, EE Times
April 6, 2001 (2:15 p.m. EST)
URL: http://www.eetimes.com/story/OEG20010406S0050

Real-time is an often misunderstood and misapplied property of operating systems. One definition of the term can be found in the FAQ for the comp.real-time newsgroup: "A real-time system is one in which the correctness of the computations not only depends upon the logical correctness of the computation but also upon the time at which the result is produced. If the timing constraints of the system are not met, system failure is said to have occurred."

Real-time, then, is a property of systems where time is literally of the essence. In a real-time system, the value of a computation depends on how timely the answer is. For example, a computation that is completed late has a diminishing value or no value whatsoever; a computation completed early is of no extra value. Real-time is always a matter of degree, since even batch computing systems have a real-time aspect to them: nobody wants to get his or her payroll deposit two weeks late!

Problems arise when there is competition for resources in the system and resources are shared among many activities, which is where we begin to apply the real-time property to operating systems. In implementing any real-time system, a critical step in the process will be the determination of a schedule of activities so that all activities will be completed on time.

Any real-time system will comprise different types of activities: those that can be scheduled, those that cannot be scheduled (such as operating-system facilities and interrupt handlers) and non-real-time activities. If activities that can't be scheduled can execute in preference to those that can be scheduled, they will affect the ability of the system to handle time constraints.

Hard real-time is a property of the timeliness of a computation in the system. A hard real-time constraint in the system is one for which there is no value to a computation if it is late and the effects of a late computation may be catastrophic to the system. Simply put, a hard real-time system is one where all the activities must be completed on time.

What is soft real-time?
Soft real-time is a property of the timeliness of a computation where the value diminishes according to its tardiness. A soft real-time system can tolerate some late answers to soft real-time computations, as long as the value hasn't diminished to zero. Such a system will often carry meta requirements such as a stochastic model of acceptable frequency of late computations. Note that this is very different from conventional applications of the term, which don't account for how late a computation is completed or how frequently this may occur.
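The difference between the two definitions can be sketched as a pair of value functions. This is only an illustration: the linear decay and the `grace` window are assumptions made for the example, and the function names are hypothetical. The article says only that the value diminishes with tardiness, not what shape the decay takes.

```c
#include <assert.h>

/* Value of a result delivered at `finish` against `deadline` (same time units).
 * Full value is 1.0; finishing early earns no bonus. */

/* Hard real-time: a late result is worth nothing. */
double hard_value(double finish, double deadline)
{
    return finish <= deadline ? 1.0 : 0.0;
}

/* Soft real-time: value falls off with tardiness. The linear shape and the
 * `grace` window (time after the deadline at which value reaches zero) are
 * assumptions for illustration only. */
double soft_value(double finish, double deadline, double grace)
{
    assert(grace > 0.0);
    if (finish <= deadline)
        return 1.0;
    double tardiness = finish - deadline;
    if (tardiness >= grace)
        return 0.0;
    return 1.0 - tardiness / grace;
}
```

With a deadline of 10 and a grace window of 4, a result delivered at time 12 keeps half its value under the soft model and none under the hard one.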

The sobriquet "soft real-time" is often improperly applied to operating systems that don't satisfy the necessary conditions for guaranteeing that computations can be completed on time. Such operating systems are best described as quasi-real-time or pseudo-real-time in that they execute real-time activities in preference to others whenever necessary, but don't adequately account for activities that can't be scheduled in the system.

Traditionally, real-time operating systems have been used in mission-critical environments requiring hard real-time capability, where failure to perform activities in a timely manner can result in harm to persons or property.

However, often overlooked are situations where there is a need to meet quality-of-service guarantees, particularly when failure to do so could result in financial penalty. This covers obvious service scenarios, such as "30 minutes or it's free," but it also includes intangible ones, such as lost opportunities or loss of market share.

More and more, real-time is being employed in consumer and Internet-centric computing devices, complex systems that demand the utmost in reliability. For example, a non-real-time device aimed at presenting live video, such as MPEG movies, that depends on software for any part of the delivery of the content may drop frames at a rate that is perceived by the customer as unacceptable.

In designing systems, developers need to assess whether the performance benefits warrant the use of real-time technology. A decision made early can have unforeseen consequences when overload of the deployed system leads to pathological behavior in which most or none of the activities complete on time, if at all.

Real-time technology can be applied to conventional systems in ways that have a positive impact on the user experience, either by improving the perceived response to certain events or ensuring that important activities execute preferentially with respect to others in the system.

To the best of my knowledge, an acceptable definition of what constitutes a hard real-time operating system has never been put forward. I propose a modest definition based on real-time scheduling theory that is consistent with industry practice: A hard real-time operating system must guarantee that a feasible schedule can be executed given sufficient computational capacity if external factors are discounted. External factors, in this case, are devices that may generate interrupts, including network interfaces that generate interrupts in response to network traffic.

In other words, if a system designer controls the environment of the system, the operating system itself will not be the cause of any tardy computations. We can apply this term to conventional operating systems, which typically execute tasks according to their priority, by referring to scheduling theory and deriving a minimum set of conditions that must be met. Without getting into too much detail, scheduling theory demonstrates that a schedule can be translated into static priority assignments in a way that guarantees timeliness. It does so by dividing the time available into periodic divisions and assuming a certain proportion of each division is reserved for particular real-time activities.

In order to do so, the following basic requirements must be met:

1. Higher-priority tasks always execute in preference to lower-priority ones.
2. Priority inversions, which may result when a higher-priority task needs a resource allocated to a lower-priority one, are bounded.
3. Activities that can't be scheduled, including both non-real-time activities and operating-system activities, don't exceed the remaining capacity in any particular division.

Because of the last condition, we must discount those activities outside the control of the operating system, yielding the external-factors provision above.
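The third basic requirement amounts to a simple capacity check on each periodic division. The sketch below assumes that the designer already knows the budget reserved for each real-time activity and a worst-case figure for unschedulable time per division. The structure and function names are hypothetical and are not taken from any particular scheduling tool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One periodic division of the schedule; all times in microseconds.
 * Field names are illustrative assumptions. */
struct division_plan {
    unsigned long length_us;          /* length of the division */
    const unsigned long *reserved_us; /* budget reserved per real-time activity */
    size_t n_reserved;                /* number of entries in reserved_us */
    unsigned long unschedulable_us;   /* worst-case interrupt, OS and
                                         non-real-time time per division */
};

/* Sum of the budgets reserved for real-time activities. */
unsigned long reserved_total_us(const struct division_plan *p)
{
    assert(p != NULL);
    unsigned long total = 0;
    for (size_t i = 0; i < p->n_reserved; i++)
        total += p->reserved_us[i];
    return total;
}

/* True if the reservations fit in the division and the unschedulable
 * activities do not exceed the capacity that remains. */
bool plan_is_feasible(const struct division_plan *p)
{
    unsigned long reserved = reserved_total_us(p);
    if (reserved > p->length_us)
        return false;
    return p->unschedulable_us <= p->length_us - reserved;
}
```

For a 10,000 microsecond division with 6,500 microseconds reserved, 3,000 microseconds of unschedulable work fits in the remaining 3,500, while 4,000 does not.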

We can then derive the following operating system requirements:

1. The OS must support fixed-priority pre-emptive scheduling for tasks (both threads and processes, as applicable).
2. The OS must provide priority inheritance or priority-ceiling emulation for synchronization primitives.
3. The OS kernel must be pre-emptible.
4. Interrupts must have a fixed upper bound on latency. By extension, nested interrupt support is required.
5. OS system services must execute at a priority determined by the client of the service. All services on which it is dependent must inherit that priority, and priority inversion avoidance must be applied to all shared resources used by the service.

The third and fourth items place a fixed upper bound on the latency before the onset of any particular real-time activity. The fifth ensures that operating system services themselves, which are internal factors, don't introduce unschedulable activities into the system that could violate basic requirement 3.
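On a system that implements the POSIX threads scheduling options, the first two operating system requirements are visible to the application as thread and mutex attributes. The following is a minimal sketch, assuming the platform supports `SCHED_FIFO` and `PTHREAD_PRIO_INHERIT`. The helper names `rt_thread_attr_init` and `rt_mutexattr_init` are hypothetical, and actually creating a `SCHED_FIFO` thread usually requires suitable privileges.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Prepare attributes for a thread scheduled at a fixed priority with
 * pre-emption (SCHED_FIFO). Returns 0 on success or an error number. */
int rt_thread_attr_init(pthread_attr_t *attr, int priority)
{
    assert(attr != NULL);
    struct sched_param param = { .sched_priority = priority };
    int rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;
    /* Use the policy set here rather than inheriting the creator's. */
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc == 0)
        rc = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
    if (rc == 0)
        rc = pthread_attr_setschedparam(attr, &param);
    if (rc != 0)
        pthread_attr_destroy(attr);
    return rc;
}

/* Prepare mutex attributes that request priority inheritance, so a
 * low-priority holder is boosted while a higher-priority thread waits. */
int rt_mutexattr_init(pthread_mutexattr_t *mattr)
{
    assert(mattr != NULL);
    int rc = pthread_mutexattr_init(mattr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_setprotocol(mattr, PTHREAD_PRIO_INHERIT);
    if (rc != 0)
        pthread_mutexattr_destroy(mattr);
    return rc;
}
```

The resulting attribute objects would be passed to `pthread_create` and `pthread_mutex_init` respectively. Requirements 3 to 5 concern the kernel and system services and cannot be configured from application code in this way.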

Inherently predictable
The key characteristic that separates an RTOS from a conventional OS is the predictability that is inherent in all five requirements. A conventional OS, such as Linux, attempts to use a fairness policy in scheduling threads and processes on the CPU. This gives all applications in the system a chance to make progress, but doesn't establish the supremacy of real-time threads in the system or preserve their relative priorities, as is required to guarantee that they will finish on time. Likewise, priority information is usually lost when a system service, typically implemented as a kernel call, runs on behalf of the client thread. The resulting unpredictable delays can prevent an activity from completing on time. By contrast, the microkernel architecture used in the QNX RTOS is designed to deal directly with all of these requirements.

The microkernel itself simply manages processes and threads within the system and allows them to communicate with one another. Scheduling is always performed at the thread level, and threads are always scheduled according to their fixed priority or, in the case of priority inversion, according to the priority as adjusted by the microkernel to compensate. A high-priority thread that becomes ready to run can pre-empt a lower-priority thread.
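The scheduling rule described here can be sketched as a selection over the ready threads. This is a toy model rather than the QNX implementation: the `thread` record, the convention that a larger number means a more urgent priority, and the tie-breaking rule are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified thread record; not a QNX data structure. */
struct thread {
    int priority; /* effective priority; larger means more urgent (assumed) */
    bool ready;   /* true if the thread can run now */
};

/* Return the index of the highest-priority ready thread, or -1 if none.
 * Ties go to the lowest index, standing in for FIFO order within a level. */
int pick_next(const struct thread *threads, size_t count)
{
    assert(threads != NULL || count == 0);
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (!threads[i].ready)
            continue;
        if (best < 0 || threads[i].priority > threads[best].priority)
            best = (int)i;
    }
    return best;
}

/* A thread that becomes ready pre-empts only a strictly lower-priority one. */
bool should_preempt(const struct thread *running, const struct thread *woken)
{
    return woken->ready && woken->priority > running->priority;
}
```

Unlike a fairness policy, nothing in this rule lets a lower-priority thread make progress while a higher-priority one is ready, which is what allows relative priorities to be preserved.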

Within this framework, all device drivers and operating system services apart from basic scheduling and interprocess communication exist as separate processes within the system.

All services are accessed through a synchronous message-passing interprocess communication mechanism that allows the receiver to inherit the priority of the client. This priority-inheritance scheme allows operating system requirement No. 5 to be met by carrying the priority of the original real-time activity into all service requests and subsequent device-driver requests.
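How a client's priority travels through a chain of services can be modeled in a few lines. The sketch below is a toy simulation and does not use the QNX message-passing API: the `service` structure and `send_request` function are hypothetical, and the synchronous send and reply are reduced to an ordinary function call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a service process, such as a filesystem or driver. */
struct service {
    int idle_priority;          /* priority when no request is in progress */
    int effective_priority;     /* priority while handling a request */
    int last_request_priority;  /* recorded so the effect can be inspected */
    struct service *downstream; /* next service in the chain, or NULL */
};

/* Model of a synchronous request: the receiver inherits the sender's
 * priority, forwards any dependent request at that same priority, and
 * drops back to its idle priority once it has replied. */
void send_request(struct service *s, int client_priority)
{
    assert(s != NULL);
    s->effective_priority = client_priority;
    s->last_request_priority = client_priority;
    if (s->downstream != NULL)
        send_request(s->downstream, s->effective_priority);
    s->effective_priority = s->idle_priority; /* reply sent */
}
```

If a client at priority 20 sends a request to a filesystem service that in turn calls a device driver, both run at priority 20 for the duration of the request, so neither becomes an unschedulable activity from the client's point of view.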

There is an attendant flexibility available as well. Because Nos. 1 and 5 require device-driver requests to be handled in priority order, at the priority of the client, throughput for normal operations can be substantially reduced. Under this model, an operating system service or device driver can be swapped for a real-time version that satisfies these requirements.

A soft real-time OS must be able to do effectively everything that a hard real-time OS must do. In addition, a soft real-time OS must be able to provide monitoring capabilities with accurate cost accounting of the tasks in the system. It must determine when activities have failed to complete on time or when they have exceeded their allocated CPU capacity and trigger the appropriate response.
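The kind of accounting this implies might look like the following sketch. It assumes that the OS can report the CPU time consumed and the response time for each completed job. The names and the misses-per-thousand threshold are hypothetical, and the threshold stands in for whatever stochastic model of acceptable lateness the system specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-task accounting record; times in microseconds. Illustrative only. */
struct task_account {
    unsigned long budget_us;       /* CPU time allotted per job */
    unsigned long deadline_us;     /* relative deadline per job */
    unsigned long jobs;            /* jobs completed so far */
    unsigned long deadline_misses; /* jobs that finished late */
    unsigned long budget_overruns; /* jobs that exceeded their CPU budget */
};

enum job_status { JOB_OK = 0, JOB_LATE = 1, JOB_OVER_BUDGET = 2 };

/* Record one completed job; returns a bitmask of job_status flags that a
 * caller could use to trigger the appropriate response. */
unsigned record_job(struct task_account *t, unsigned long cpu_used_us,
                    unsigned long response_us)
{
    assert(t != NULL);
    unsigned status = JOB_OK;
    t->jobs++;
    if (response_us > t->deadline_us) {
        t->deadline_misses++;
        status |= JOB_LATE;
    }
    if (cpu_used_us > t->budget_us) {
        t->budget_overruns++;
        status |= JOB_OVER_BUDGET;
    }
    return status;
}

/* True if late jobs exceed the tolerated rate, given per 1,000 jobs. */
bool miss_rate_exceeded(const struct task_account *t,
                        unsigned long max_misses_per_1000)
{
    return t->deadline_misses * 1000UL > max_misses_per_1000 * t->jobs;
}
```

A supervisor could call `record_job` at the end of each job and react to the returned flags, for example by logging the event, shedding load, or restarting the offending task.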
