'Protection domains' raise integrity
By John Fogelin, Vice President and General Manager, Maarten Koning, Principal Technologist, Gerry Kuhn, Manager, Platform Components Group, Wind River Systems Inc., Alameda, Calif., EE Times
April 6, 2001 (2:11 p.m. EST)
In the post-PC era, the role of smart devices is expanding every day. These devices depend on access to the Internet to function; as reliance on the Internet infrastructure increases, the standards of reliability and availability have never been more stringent.
For such devices, society simply will not tolerate the low standards of dependability set by the PC. Their success depends on a new design paradigm that considers reliability and availability in virtually every aspect of the devices' software and hardware content.
In moving toward that paradigm for VxWorks AE, Wind River's new flagship real-time operating system (RTOS), we took extreme care to leverage over 12 years of code integrity inherent to the original RTOS while adding capabilities to ensure its high availability.
The new requirements of the emerging market for embedded Internet-centric smart devices moved us to develop a hardware-enforced protection model, based on "protection domains," that is more powerful than legacy protection models such as trap-based system calls or intrasystem message passing. Compared with previous models, the protection domain model developed by Wind River is based on a function call invocation mechanism that is less intrusive, more easily understood and performs better on modern processor architectures in terms of both required memory and execution time.
It also supports automated resource reclamation to prevent resource leaks that could otherwise occur over time in a dynamic system. In addition, it has built-in distributed messaging that provides fault-tolerant hooks and includes local transparent service points on multinode applications, thereby providing the basis for service migration.
VxWorks AE provides a fault-detection, processing and recovery framework to allow systems to continue operating in the face of failure. This framework can be leveraged by the system, by applications and by device driver writers to handle events at any level. It will form the foundation for a future high-availability product from Wind River, which will be designed to provide support for CompactPCI Hot Swap for managing hot-swap insertion, extraction and other related events on platform-support packages for cPCI high-availability systems.
The technical innovations of the new RTOS revolve around the core design principle of protection domains. Most protected systems have a limited concept of application protection. Traditionally, a bit of code or a program could be protected and isolated from the RTOS and from other applications by putting it in an isolated container called a process, where the only access into or out of the application is through system calls. Also, many systems can share code among multiple processes by putting that code in a shared library.
In the design of the new system, the protection domain approach generalizes these traditional techniques and considerably increases their power and utility. This significant technical advance is a key simplification that unifies protection into a single superior model. The protection domain concept introduces a new resource container that defines an execution environment.
Developers can now separate applications, shared libraries, shared data and system software to varying degrees in order to attain the desired level of isolation and protection. Either at system startup or dynamically at run-time, applications can create protection domains to encapsulate resources within the system. These resources can include tasks, physical and virtual memory pages, kernel objects such as semaphores, message queues and file descriptors, pure code (shared libraries) and pure data (shared data).
Each protection domain is essentially its own address space. By default, each protection domain is isolated from all others in terms of access rights and so on. Beyond that, protection domains can offer various levels of access to other protection domains.
A protection domain can, for example, publish entry points and make these available for linkage by other protection domains. Best of all, the protection mechanisms of VxWorks AE are virtually transparent to applications that typically execute within their protection domain boundaries. Once such a domain is created with its essential parameters and access rights, an application written for VxWorks can execute within protection domains using the same application programming interfaces (APIs). The VxWorks AE loader establishes all of the permitted linkages to the OS and any shared libraries within the system.
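As a rough illustration of publishing entry points for linkage, the C sketch below models a provider domain that exports named routines and a lookup that resolves only what has been published. Every name here (`provider_domain`, `domain_publish`, `domain_link`, `scale_by_two`) is hypothetical and is not a VxWorks AE API; the real loader resolves these linkages itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8

typedef int (*entry_fn)(int);

/* Hypothetical model of one published entry point. */
typedef struct {
    const char *name;
    entry_fn    fn;
} entry_point;

/* Hypothetical model of a domain that exports routines to other domains. */
typedef struct {
    const char *name;
    entry_point published[MAX_ENTRIES];
    size_t      count;
} provider_domain;

/* Publish an entry point; returns 0 on success, -1 if the table is full. */
int domain_publish(provider_domain *d, const char *name, entry_fn fn)
{
    assert(d != NULL && name != NULL && fn != NULL);
    if (d->count == MAX_ENTRIES)
        return -1;
    d->published[d->count].name = name;
    d->published[d->count].fn = fn;
    d->count++;
    return 0;
}

/* Resolve a symbol against a provider: only published names can be linked. */
entry_fn domain_link(const provider_domain *provider, const char *name)
{
    assert(provider != NULL && name != NULL);
    for (size_t i = 0; i < provider->count; i++)
        if (strcmp(provider->published[i].name, name) == 0)
            return provider->published[i].fn;
    return NULL;    /* not published: isolated by default */
}

/* Example routine a shared-library domain might export. */
int scale_by_two(int x) { return 2 * x; }
```

A client that links `"scale"` gets a callable pointer, while a lookup for any unpublished name returns `NULL`, which mirrors the isolated-by-default behavior described above.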
Because of the fundamentally different characteristics of the new embedded computing model of smart devices operating in a connected environment, we chose to implement a very flexible programming environment based on a function-call-based invocation mechanism. This is fundamentally different from traditional mainstream operating systems such as Windows and Unix that, while offering extensive memory protection and application isolation, have very rigid programming models. Under those models, application code was designed from the outset to be either user (application) or supervisor (kernel) code. Once developed, it was very difficult to migrate user code to supervisor code, or vice versa.
It is generally not simple, even if possible, for a customer to alter the characteristics of a monolithic operating system such as Unix. To change and enhance the core API of Unix, or to replace the basic interface at the programming level of the Unix graphical user interface or extend the capabilities of the system level, minimally requires special programming techniques to alter the program for system execution instead of application execution. However, a number of factors weaken or eliminate the advantages of a fixed interface in the new Net-centric computing environment that is emerging.
First, it is extremely unlikely that the mass market will accept the same kind of failures that are commonplace on PCs: incompatibilities, viruses, version mismatches and just plain crashes. Second, market forces continue to fragment the smart-device market and very little comparable convergence is taking place to move to one all-encompassing embedded Internet platform. Third, the scope of most applications will be limited; in contrast, the nature of PC third-party applications is completely arbitrary.
Fourth, there are very specific, narrow and well-conditioned interfaces as opposed to the very broad and general-purpose interfaces of Win32. And finally, developers will be forced to put their software through very rigorous testing and qualification procedures before the platform vendor permits it to be loaded on an embedded platform.
The most effective and flexible way to provide the right partitioning environment is to use function call-based invocation, combined with task migration. The function call is the correct lowest common denominator for intraprocessor invocation between software components in a general protected RTOS. The software components may be configured to be on the other side of a protection boundary (such as in the kernel) or not. In order to maintain flexibility, the invocation mechanism should not make assumptions that message passing is required or assume that a system call exception need occur.
The innovation that Wind River has made in VxWorks AE is that the caller's memory context is used to trigger an exception while it is crossing an interprotection-domain boundary. But it does so only if the caller does not already have access to the callee's address space. The operating system uses linkage tables to manage this mechanism on behalf of the application. The system designer has control over where those boundaries go; thus, protection can be dialed in where and when it is needed.
This approach has four performance and software-management advantages when traversing protection boundaries in an RTOS.
- First, a task context switch is not required to enter the kernel protection domain, just a protection switch, which is much faster.
- Second, parameters do not have to be copied from one stack to another, since the caller's stack migrates with the task.
- Third, parameter setup and return-value copy-back code in the system-call trampoline code (stub routines) is not required.
- Fourth, slow message packing and unpacking code (as performed in message-passing OSes) in the system-call trampoline code is not required.
In regard to software management, the linkage table becomes a powerful mechanism for performing such software-management functions as software upgrades or system instrumentation, and for providing debugging features. In addition, the location of the caller and the called party does not enter into the software development process; this is a system partitioning and tuning decision that can be considered later. This is especially useful for integration platforms that must absorb software from third parties, which are free to write their software without making deployment assumptions regarding system partitioning.
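To make the software-management point concrete, here is a minimal C sketch of a single linkage slot being retargeted for an upgrade and then wrapped for instrumentation. The slot, the two `checksum` versions and the tracing wrapper are hypothetical names chosen for illustration; the article does not describe the actual table layout.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*slot_fn)(int);

/* Hypothetical linkage slot shared by every caller of "checksum". */
static slot_fn checksum_slot;

int checksum_v1(int x) { return x % 7; }
int checksum_v2(int x) { return x % 11; }    /* upgraded routine */

unsigned call_count;                          /* instrumentation data */
static slot_fn traced_target;

/* Instrumentation wrapper: counts the call, then forwards it. */
static int traced_call(int x)
{
    call_count++;
    return traced_target(x);
}

/* Install a new implementation; callers are not rebuilt or relinked. */
void slot_upgrade(slot_fn replacement)
{
    assert(replacement != NULL);
    checksum_slot = replacement;
}

/* Interpose tracing on whatever the slot currently points at. */
void slot_instrument(void)
{
    assert(checksum_slot != NULL && checksum_slot != traced_call);
    traced_target = checksum_slot;
    checksum_slot = traced_call;
}

/* What a caller does: always go through the slot. */
int call_checksum(int x)
{
    assert(checksum_slot != NULL);
    return checksum_slot(x);
}
```

The caller's code (`call_checksum`) never changes; only the slot contents do, which is what makes the table a convenient hook for upgrades and debugging.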
It is the application developer who decides how much and where to apply the partitioning and protection in the system. The kernel protection domain and any shared library or shared data domains are carved out of unique virtual addresses. Application protection domains, where multitasking application code runs, actually overlap in the virtual address space.
When one application is running in a given protection domain, other application protection domains cannot be "seen," which provides for powerful application isolation. Each application protection domain has its own symbol table and its own heap, so developers can load and run multiple independent copies of the same code simultaneously by using multiple protection domains.
Each protection domain supports any number of tasks. Thus, if the protection domain contains code that is reentrant, multiple instantiations of a program can be executed simultaneously within a protection domain by spawning multiple tasks in that protection domain, each with the same entry point.
Additionally, if code for multiple programs is loaded into the same protection domain, multiple tasks can be spawned in that protection domain, each with separate entry points. This flexibility gives the system designer significant latitude to partition the system into protection domains as he or she sees fit.
Providing stack overflow detection for each task and allowing protection domains to "auto-grow" their heap (within limits) in case of overflow offers a measure of overrun protection. Also, the system allows for limiting task priority ranges within protection domains. In that way, runaway applications are bridled and do not affect other applications. Interprocess communication mechanisms such as queues, semaphores and pipes are still global objects that allow coordination between applications.
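The two limits mentioned here, a bounded priority range and a heap that grows only up to a ceiling, can be sketched as follows. The `domain_limits` structure, its field names and the fixed growth step are assumptions made for the example; the sketch also assumes the usual VxWorks convention that numerically lower values mean higher priority.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-domain limits (0 is assumed to be the highest priority). */
typedef struct {
    int    prio_highest;     /* numerically smallest priority allowed */
    int    prio_lowest;      /* numerically largest priority allowed */
    size_t heap_size;        /* current heap size in bytes */
    size_t heap_max;         /* ceiling for automatic growth */
    size_t heap_grow_step;   /* growth increment in bytes */
} domain_limits;

/* Force a requested task priority into the domain's permitted range. */
int clamp_priority(const domain_limits *d, int requested)
{
    assert(d != NULL && d->prio_highest <= d->prio_lowest);
    if (requested < d->prio_highest) return d->prio_highest;
    if (requested > d->prio_lowest)  return d->prio_lowest;
    return requested;
}

/* Grow the heap in fixed steps until `needed` bytes fit.
   Returns 0 on success, or -1 (heap unchanged) if the ceiling is too low. */
int heap_ensure(domain_limits *d, size_t needed)
{
    assert(d != NULL && d->heap_grow_step > 0 && d->heap_size <= d->heap_max);
    size_t size = d->heap_size;
    while (size < needed) {
        if (d->heap_max - size < d->heap_grow_step)
            return -1;       /* ceiling reached: refuse the request */
        size += d->heap_grow_step;
    }
    d->heap_size = size;
    return 0;
}
```

A runaway application in this model can neither raise its tasks above the domain's best priority nor consume memory beyond `heap_max`.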
In VxWorks AE, the protection domain also defines ownership of system resources (such as tasks, message queues, semaphores, memory pages, etc.).
Home domain
The protection domain in which resources are allocated or created becomes the home domain for those resources. This association supports an important attribute of applications, the automated reclamation of resources.
Whenever an application is terminated in a protection domain, all resources allocated to that domain are reclaimed by the system for reuse. This is made possible by the ownership association of the protection domain boundary and is an important characteristic for system upgrades as well as fault-recovery mechanisms and general system stability.
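One way to picture the home-domain association is as a per-domain list of owned resources that is walked when the domain terminates. The C sketch below is a simplified model under that assumption; `home_domain`, `domain_create_resource` and `domain_terminate` are hypothetical names, and real kernel objects are stood in for by small heap records.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { RES_TASK, RES_SEMAPHORE, RES_MSG_QUEUE, RES_MEMORY } res_kind;

/* Hypothetical record for one owned resource. */
typedef struct resource {
    res_kind         kind;
    struct resource *next;
} resource;

/* Hypothetical home domain: tracks everything created inside it. */
typedef struct {
    resource *owned;
    unsigned  owned_count;
} home_domain;

/* Create a resource and record the creating domain as its owner. */
resource *domain_create_resource(home_domain *home, res_kind kind)
{
    assert(home != NULL);
    resource *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    r->kind = kind;
    r->next = home->owned;
    home->owned = r;
    home->owned_count++;
    return r;
}

/* Terminate the domain: reclaim every resource it owns in one pass.
   Returns the number of resources reclaimed. */
unsigned domain_terminate(home_domain *home)
{
    assert(home != NULL);
    unsigned reclaimed = 0;
    while (home->owned != NULL) {
        resource *r = home->owned;
        home->owned = r->next;
        free(r);
        reclaimed++;
    }
    home->owned_count = 0;
    return reclaimed;
}
```

The application never deletes objects one by one; terminating the domain is enough, because ownership was recorded at creation time.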
Previously, resource reclamation in VxWorks was something of a manual process, performed on an object-by-object basis (for example, create a task, delete a task, create a queue, delete a queue). In AE we provided a more powerful and intuitive level by allowing the ownership of all resources (memory, file descriptors and so on) to be tracked on individual user-defined lines, bounded by the protection domain container.
Distributed messaging
VxWorks AE incorporates a new scalable feature for extending local message passing (native message queues) beyond single processor boundaries. This distributed message-passing facility is completely transparent, since programmers use the same interfaces provided in VxWorks without regard to the location of the message object.
The new RTOS automatically determines whether the queue is local to the processor and forwards messages to remote queues using a distributed and redundant name service or registry. Those distributed objects are simply named service points that are referenced in the application. In keeping with Wind River's flexible approach to software design, distributed messaging can be enabled over any loosely coupled media.
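The local-versus-remote decision can be illustrated with a small lookup against a name registry. This C sketch assumes a flat in-memory table and a node number of 0 for the local processor; `registry_entry`, `route_message` and the queue names are hypothetical, and the real name service is distributed and redundant rather than a single array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOCAL_NODE 0     /* assumption: this processor is node 0 */

/* Hypothetical registry record: a named service point and its host node. */
typedef struct {
    const char *name;
    int         node;
} registry_entry;

typedef enum { ROUTE_LOCAL, ROUTE_REMOTE, ROUTE_UNKNOWN } route;

/* Decide how a message addressed to `queue_name` should be delivered.
   If the queue is found and node_out is non-NULL, the host node is stored. */
route route_message(const registry_entry *registry, size_t n,
                    const char *queue_name, int *node_out)
{
    assert(queue_name != NULL && (registry != NULL || n == 0));
    for (size_t i = 0; i < n; i++) {
        if (strcmp(registry[i].name, queue_name) == 0) {
            if (node_out != NULL)
                *node_out = registry[i].node;
            return registry[i].node == LOCAL_NODE ? ROUTE_LOCAL
                                                  : ROUTE_REMOTE;
        }
    }
    return ROUTE_UNKNOWN;    /* no such service point registered */
}
```

The sender uses the same call whether the queue is on its own processor or another one; only the routing result differs.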
A transport adapter layer is provided in source code, with ample documentation, to allow servicing over custom media.
By default, the out-of-box implementation provided utilizes User Datagram Protocol datagrams over Ethernet and shared memory transports. Reliable message passing is inherently provided because the protocol supports message acknowledge and packet sequence numbers to ensure that messages arrive intact. Distributed messaging in VxWorks AE provides developers with key APIs to enable fault-tolerant behavior in that it leverages a replicated name server database for routing messages remotely. There is no single point of failure in the system.
Copyright © 2003 CMP Media, LLC