Blueprint for an Embedded Systems Programming Language

Paul Soulier
Dept. of Information and Computer Sciences
University of Hawaii, Manoa
Honolulu, Hawaii 96822
Email: psoulier@hawaii.edu

Depeng Li
Dept. of Information and Computer Sciences
University of Hawaii, Manoa
Honolulu, Hawaii 96822
Email: depengli@hawaii.edu

Abstract

Embedded systems have become ubiquitous and are found in numerous application domains such as sensor networks, medical devices, and smart appliances. Software flaws in such systems can range from minor nuisances to critical security failures and malfunctions. Additionally, the computational power found in these devices has seen tremendous growth and will likely continue to advance. With increasingly powerful hardware, the ability to express complex ideas and concepts in code becomes more important. Given the importance of developing safe and secure software for these applications, it is interesting to observe that the vast majority of software for these devices is written in the C programming language, an inherently unsafe language compared to other modern languages. This paper examines the characteristics and requirements that uniquely differentiate embedded systems from other application domains. The result is a blueprint for a modern, high-level programming language specifically designed for embedded systems.

1. Introduction

Embedded systems exist in a multitude of applications, and advances in hardware technology will continue to make them capable of greater degrees of sophistication and intelligent functionality. While often unnoticed and unseen, these systems are responsible for properly controlling medical devices, automobile braking systems, industrial control systems, and numerous other cyber-physical systems that interact with the world in profound ways. The growing fields of sensor networks and the Internet of Things, combined with ubiquitous Internet connectivity, will further expand the use of embedded systems.

Given the significant role embedded systems have, the implications of faulty software are clearly evident. However, software logic errors are not the only manner in which a system can malfunction. The Stuxnet virus [14] is an example of a failure caused by a malicious software attack that ultimately caused an industrial control system to destroy itself. As Internet connectivity becomes increasingly common in embedded systems, they too will be susceptible to software-based security exploits.

Many areas of software development have benefitted from the improvements made to programming languages. Modern languages are more capable of detecting errors at compile time through their type systems, and many of the low-level and error-prone aspects of programming have been abstracted away. This enables the development of more reliable and complex applications. Embedded systems are an exception to this. The vast majority of embedded systems are still developed using the decades-old C programming language, an inherently unsafe language with only the basic features of the imperative paradigm.

Despite its flaws, C has stubbornly remained the de facto standard for embedded system development. The reasons for this are difficult to identify (although a massive existing code base and the lack of a compelling replacement could be factors). Various language-based approaches have been created to address many of the shortcomings of C, all with varying degrees of success as they apply to low-level programming. These solutions tend to focus only on a subset of the issues involved in low-level software development while not considering other critical aspects. What is missing is a cohesive and practical language that effectively incorporates all of these methods and techniques in a manner consistent with the needs of embedded systems.
On language design, Landin [9] remarks in the paper “The Next 700 Programming Languages” that

    “...we must systematize their design so that a new language is a point chosen from a well-mapped space, rather than a laboriously devised construction.”

To design a compelling language to replace C, we must first identify what the language needs to look like. With this in mind, the contribution of this paper is a description of the features and constructs necessary in a language designed to implement secure and reliable embedded systems software.

This paper is structured as follows. Section 2 describes the characteristics that differentiate embedded systems from other programming disciplines. Section 3 presents the blueprint of a programming language designed to produce secure and reliable embedded software. Section 4 highlights related research that has attempted to address the shortcomings of programming languages for embedded systems, and finally, Section 5 concludes.

2. Characteristics of Embedded Systems

Embedded systems have a number of characteristics that differentiate them from other application domains. These particularities make most programming languages ill-suited for this type of software development. High-level languages generally attempt to provide helpful abstractions for tasks that can be automated by the compiler or those that are error-prone. While these abstractions can improve development efficiency and reduce errors, they have the unfortunate side effect of obscuring the low-level details that embedded systems must deal with. The need to interact directly with hardware, specify the organization of data within a structure, operate with limited resources, and meet performance requirements all bring a unique set of challenges to developing this type of software. This section examines the various aspects of embedded systems that necessitate a domain-specific language.

2.1. Safety and Reliability

As mentioned in the introduction, embedded systems are often found in devices that can have a major impact on the physical world. It is frequently required that these systems operate without error and with no downtime. Furthermore, if a problem is found and a software fix is identified, upgrades can be difficult and sometimes impossible. Consider faulty software in an automobile component: the implications are massive, with potentially thousands of vehicles being recalled. With these strict requirements, the need to successfully develop and release error-free software is very important.

2.2. Data Layout and Representation

In most applications, developers are concerned only with what data needs to be stored in a structure; where the data goes and how much room it takes is generally unimportant. Embedded systems, on the other hand, care a lot about where data is located in a structure and how much space it consumes.

The ability to specify the size and location of data is necessary when defining language-based structures that must match hardware structures or standardized protocols. Data layout is also an important tool for tuning. Organizing a structure to improve memory locality based on knowledge of the CPU cache architecture or runtime access patterns can have a significant performance impact. The ability to represent and manipulate data in arbitrary ways is a fundamental aspect of writing embedded systems code.

2.3. Hardware Interaction

Embedded systems interact directly with hardware through memory-mapped IO or low-level CPU instructions. In the case of memory-mapped IO, hardware registers appear as normal memory, but may behave in ways that are not entirely consistent with regular memory. Consider, for example, a hardware device that accepts a 32-bit value from the host through a 16-bit memory-mapped register. Listing 1 shows pseudocode that could be used to send this value to the device. The host writes the most significant 16-bit value to the register followed by the least significant value.

    u16 *reg = device->input_reg;
    *reg = val >> 16;        // most significant 16 bits
    *reg = (val & 0xffff);   // least significant 16 bits

Listing 1. “Interfacing with Hardware”

The compiler, unaware that the memory location is different from others, may eliminate the first assignment upon seeing that the same memory address is immediately overwritten by another value. Idiosyncrasies such as this are common in embedded systems, where unique hardware properties do not always match the abstract machine of the programming language.

2.4. Transparent Expression

Transparency is a trait of language expression that is particularly important to embedded systems. One of the strengths of the C programming language is that it is easy for the programmer to conceptualize how source code will translate into machine instructions and data structures. This ability becomes very important when attempting to fit code or data into resource-limited hardware, gain additional performance, or interface with hardware.

2.5. Constrained Environment

Embedded systems are almost always constrained in some fashion. The most common limitations encountered are computational power (memory and processor), time, and energy. Advances in hardware technology have come a long way in easing some of these constraints, but they are still a concern for many systems.

Time is an interesting constraint when considered in the context of cyber-physical systems. An occasional half-second delay in a desktop application probably wouldn’t be noticed. An equivalent delay in a real-time system such as an electronic braking system or avionics fly-by-wire system could have serious consequences. Many embedded systems have real-time deadlines that must always be met.

Energy is another constraint that can have a significant impact on embedded systems. Sensor networks and other devices that rely on battery power have a finite lifetime before they stop working. Power consumption must be managed to maximize operational time. Highly energy-efficient devices also tend to be very limited in memory and processing power.

Constraints are driven by a number of factors. Some, such as time and physical dimensions, are governed by the laws of physics. Other limitations are driven by business factors that require the use of less powerful hardware to save on manufacturing costs. Regardless of why limitations are present, they introduce unique challenges to embedded system development.

2.6. Performance

Some devices perform computationally intensive operations or transmit data at high speeds. In such cases, performance is a critical design goal where a difference of a few percentage points can determine the success or failure of a product. Consequently, the ability to save a few bytes in a data structure or eliminate a few microseconds from a section of code can have a profound impact on a program’s ability to achieve its goals. These small performance gains often come from specific knowledge of a system and the programmer’s ability to generate appropriate code rather than from a compiler’s optimizer. Performance can be a major influence on the overall design of an embedded system.

3. Language Blueprint

Expressiveness can be described as the property of a language that allows a programmer to effectively translate concepts and ideas into code. The more expressive a language is, the easier it is for a programmer to realize a solution to a problem. This trait is domain specific; what makes a language expressive in one domain does not make it expressive in another. For example, assembly language is very expressive compared to JavaScript for executing a specific CPU instruction. Conversely, JavaScript is far more capable of describing a web application than assembly is.

This section describes the features and constructs that make a language expressive when considering the characteristics of embedded systems. Collectively, these features form a blueprint of a language ideally suited for embedded systems.

3.1. Paradigm

The functional language paradigm has many useful properties, particularly referential transparency (code has no side effects and there is no global state that can change). This property, among others, has many compelling benefits. However, the functional paradigm is somewhat at odds with embedded systems. Embedded systems are stateful by nature. They interact with hardware components that are themselves state machines. The advantages of the functional paradigm are unarguably valuable. However, applying it to systems that are defined by state would likely be ineffective. Conventional wisdom would suggest that the imperative programming paradigm, which is based on state change, is a natural choice for embedded system programs.

Object-oriented programming, while not strictly necessary, can be useful for embedded systems. Object orientation has proven to be a valuable method of reasoning about complex systems. Additionally, OO techniques can be effective at eliminating some of the unsafe idioms used in C. For example, C programmers sometimes use typeless “void” pointers, type casts, and unions to achieve unsafe versions of polymorphism. Inheritance and sub-typing provided by OOP are a type-safe alternative.

The object-oriented features supported by the language must include the basics of the paradigm: data encapsulation/abstraction, dynamic binding of function calls, and inheritance/derivation. Each of these features is relatively transparent to the programmer in terms of overhead and the underlying code that is generated. Dynamic binding can impact runtime performance and memory overhead, but these concerns are generally insignificant. Achieving equivalent functionality using standard imperative techniques will generally incur the same costs. The concern for embedded systems is that the constructs generated by the compiler are hidden from the developer and reduce transparency.

Multiple inheritance can pose significant challenges and is worth mentioning. It is a frequently debated feature, with the primary point of conflict being its utility compared to its drawbacks. When examining the implementation of multiple inheritance in C++ [17], [16], the effects on performance and memory overhead are not trivial. Multiple inheritance affects the organization of code and data structures, which can have a negative impact on both performance and data layout. Due to this, multiple inheritance is not a good choice for embedded systems. There are alternatives (such as “interfaces”) that achieve functionally similar results, but without the same overhead.

3.2. Syntax and Semantics

A somewhat interesting trend among languages designed to replace C is the goal of retaining the same syntax and programming idioms. This is interesting because C and its associated idioms are often unsafe. Consider the C code in Listing 2. Both if statements are syntactically correct but are semantically very different. The logical “AND” operator && and the bitwise “AND” operator &, while visually similar, have different runtime behavior. Accidentally adding or omitting a & character is an easy mistake to make, and the compiler has no way to know which is correct.

    int x=foo(), y=bar();

    if (x & y) {
        // do stuff...
    }

    // vs.

    if (x && y) {
        // do stuff...
    }

Listing 2. “Syntax and Semantics”

Pointer arithmetic is another example of syntax that is unsafe and also largely unnecessary. A common C idiom is to use pointer arithmetic as a way to iterate over a subset of an array. It’s also used as an optimization technique to eliminate array references. This idiom is also responsible for numerous bugs. There are plenty of examples of safe syntax in other languages for expressing ranges of arrays, and modern compilers can usually optimize array access more effectively than humans can.

Syntax and semantics must be clear and unambiguous. Although more convenient, preserving the error-prone syntax and idioms of C to avoid learning a new language isn’t justified. In terms of language design, there is no reason for the types of ambiguities shown in Listing 2. Language syntax and semantics must be designed to prevent these types of programming errors.

3.3. Type System

A type in a programming language is a form of specification that defines various characteristics of the constructs within a language. A type system is the mechanism used to enforce that all type specifications are correctly adhered to. The primary role of a type system is to help promote program correctness and reduce bugs. This section describes the basic properties of a type system and addresses some specific types that deserve special consideration.

Static Type System

Static type systems are essential for embedded systems. Dynamic type systems are undesirable as they can leave latent type errors undetected until runtime. These errors are often unrecoverable and result in program failure. Conversely, static type systems attempt to enforce the type rules at compile time. Static type systems allow type correctness to be verified earlier in the development process. While potentially requiring more effort on behalf of the programmer to properly define the type specifications, this results in systems with fewer bugs. Due to the nature of embedded systems, namely the difficulty of updating software and the implications of software failures, it is more important to identify errors early. Consequently, a language for embedded systems should be statically typed.

Type and Memory Safety

A type- and memory-safe language is critical to minimizing program flaws. Type and memory violations (e.g., out-of-bounds array