When considering the alternatives, it's important to weigh issues like multithreading and debugging support. Whichever option emerges as the preferred form, any embedded Java programming environment must still address two key issues: how to provide determinism and how to interface to hardware.
One of Java's strengths is its reasonably clean syntax, which is strongly reminiscent of C or C++. So although it's a new language, it's familiar. Getting up to speed with Java is easy.
More importantly, Java is both object-oriented and strongly typed. Apart from a handful of primitive types, everything in Java is an object, and there are no loopholes to circumvent Java's strong typing. Since the advent of C++, these features have been considered essential in a programming language because they contribute enormously to the correctness of programs.
Anecdotal evidence bandied about in Java newsgroups and mailing lists suggests that developers take less time to produce a working Java program than a program in C or C++. Debugging is also easier because Java removes a prolific source of hard-to-find bugs: the incorrect use of pointers.
Example bugs include memory leaks and memory access errors (wild pointers, referencing freed memory, returning a pointer to a local variable, etc.). Java doesn't allow its pointer equivalent (i.e., object references) to be manipulated in the same way as pointers are in C or C++, and it provides automatic garbage collection.
Another strength of Java is its large reusable code base. In the standard distribution, Java supports threads, TCP/IP networking, and remote invocation. It even has a full set of classes for building GUIs.
Additional APIs support a variety of needs, such as database access, communication, multimedia, a way to use GUI components, and security.
With Java's strengths as a language, a development environment, and a reusable code base, it's easy to see why developers (and not just embedded-system developers) are eager to put it to use.
Unfortunately, as I mentioned, desktop Java has some drawbacks when used in embedded systems. Although Java was originally intended for use in set-top boxes, it was first used in a web browser, which is a desktop application.
First, desktop Java is too big for embedded applications. Not only must the entire Java virtual machine (JVM) be present, but a Java interpreter or a just-in-time (JIT) compiler must be present as well.
On top of that, all the standard classes must be present. These take up to 8 MB on disk, more when loaded. Fonts take even more space.
The bottom line is that desktop Java needs on the order of 16 MB just to run, and the application's own needs come on top of that. Very few embedded systems have that kind of memory available.
Also, Java is too slow. Sun's first releases were usually more than 30× slower than equivalent C code. Subsequent releases, which use JIT compilers, are significantly faster but still perhaps 5× slower than equivalent C. If you're used to squeezing the last few cycles out of a processor, this is a heavy penalty to pay just to use Java.
But the most important drawback of desktop Java is that it doesn't meet the constraints of most embedded systems. One such constraint is the requirement for real-time behavior (i.e., execution that's both predictable and bounded in duration).
Many embedded systems have severe real-time requirements. For instance, the collision-detection system on a jetliner has seconds in which to respond. Computation must finish in a certain amount of time, so execution has to be predictable.
Another constraint of embedded systems is their limited resources. Consumer devices, which may be manufactured in the millions, are very sensitive to cost, so designers tend to use the smallest processor and the smallest amount of memory possible to do the job. A programming language that's slow and uses up a lot of memory just isn't competitive with existing alternatives.
More importantly, Java doesn't possess the notion of an address. Embedded systems, almost by definition, are required to access hardware. Most often, that hardware is accessed by referring to a specific address. Because addresses aren't part of Java, you have to go outside the language to overcome this constraint.
Finally, desktop Java has some attributes that get in the way of successful use in embedded systems. These attributes may be useful and even necessary in desktop systems, but not in embedded systems.
For instance, Java is interpreted (the source of much of its slowness) and it is dynamic because it supports the downloading of new classes on-the-fly. Java is portable across many different systems because its source code is compiled, not to native code, but to bytecodes, an architecture-neutral format. Also, Java supports a comprehensive security model designed to prevent many kinds of attacks.
However, for embedded systems, which frequently exist in completely closed environments, portability and security aren't issues. Unless an embedded system is connected to a network, the ability to load new classes dynamically is useless.
These attributes of desktop Java prevent its use in embedded systems. And, the issues of performance, memory consumption, and poor real-time behavior make it hard to retarget the desktop version to an embedded system.
What are the alternatives? How can an embedded-systems developer use the great features of Java without quadrupling the system's cost or writing piles of non-Java code?
Essentially, there are only three options: use a special-purpose JVM, use a JVM with a JIT compiler, or use compiled Java instead of some form of interpreted Java.
Many vendors have come up with specially tailored versions of Java that are a better fit for the needs of embedded developers. For instance, Sun offers PersonalJava for systems with 2 to 4 MB of memory and EmbeddedJava for smaller systems (Mentor Graphics' Microtec Division is a licensee of PersonalJava). Hewlett-Packard, NSI Com, Insignia Solutions, NewMonics, and others have similar offerings.
Another approach, even in versions tailored for embedded use, is to use a dynamic compilation technique, typically a JIT compiler, to increase performance. But, there are several tradeoffs involved.
First, JVMs with JITs have potentially longer start-up times because the JIT compiler has to compile Java bytecodes into native machine language before executing. Second, it's difficult to do a good job of optimizing native code while keeping memory consumption low. The more optimizations that are done, the larger and slower the JIT compiler becomes.
Several vendors provide knobs to tune the dynamic compilation process so you can choose on a case-by-case basis exactly what the performance, memory consumption, and start-up-time tradeoffs are going to be.
But, for some embedded applications, even a JVM with dynamic compilation is too slow and takes up too much memory. One option that is increasingly being considered is compiling Java directly to a native machine language, thereby eliminating both the JVM and either the interpreter or the JIT compiler.
Of course, the resulting application is no longer portable, but embedded developers typically don't care about portability. For a given design, their application needs to run on a single well-known hardware configuration.
The other attribute that compiled Java forces a developer to give up is the ability to load new classes on-the-fly. Because all the code is precompiled, there's no facility for dynamic loading of classes. Again, this issue probably isn't too serious for embedded-system developers, most of whom don't want random classes downloaded onto their system.
If you're willing to tolerate the lack of portability and the lack of dynamic class loading, you can still reap all the benefits of Java as a great language and keep the system small and fast, provided you can resolve the issues of determinism and low-level programming.
Any version of Java for embedded systems must first be deterministic and predictable. It also has to be able to access memory directly.
One bugaboo of embedded systems is ensuring real-time response. In the case of Java-based systems, the primary cause of nondeterminism is the garbage collector.
In desktop systems, it doesn't matter much that the JVM stops for several seconds to collect unused memory. But in an embedded system, several seconds can be the difference between correct operation and the loss of human life.
The biggest threat to an embedded operation is that most garbage collectors work in what's called stop-the-world mode. Usually, the collector is called only when an allocation fails because memory is exhausted. Therefore, allocation time is impossible to predict, and when the collector is running, no other processing is being done. This situation is unacceptable in a real-time system.
An obvious solution is to have the garbage collector run concurrently with the application so that the impact of garbage collection is spread around more evenly. This way, time-critical events are processed in a timely manner.
Ensuring real-time response still isn't enough to make Java useful for developing embedded systems. Because the added value of embedded systems is their specialized hardware, the embedded software must always be able to access or control that hardware. Doing so requires going beyond the Java language, either through a Java Native Interface (JNI), for which there are several possible options, or through a nonstandard extension of the language.
Sun's JNI permits portability across different JVMs on a particular processor architecture, but it suffers from poor efficiency. An earlier version of JNI was more efficient, but it required the Java code to know the layout of an object in memory. In any case, it makes sense for an embedded Java system to offer a package specifically for accessing physical memory.
The Java language requires garbage collection of unused objects, and there's no corresponding delete operator to go with the new operator. One advantage of garbage collection is that you can't introduce memory-allocation bugs when the responsibility for detecting unused memory and reallocating it is automatic. The application simply clears a reference to memory to make it available for future use.
However, the advantages of garbage collection come with a price. Finding unused memory and freeing it can take a long time, causing critical deadlines to be missed in a real-time environment. A garbage collector for a real-time system must be predictable and fast in addition to allowing high-priority threads to run. Unfortunately, most collectors fail on all three of these requirements.
Garbage collectors work the opposite way from what their name implies. They find all the memory blocks that are in use and free up what's left over. There are many different algorithms for garbage collection, but most of them share the following steps:
1. scan the local and statically allocated variables for pointers to the heap
2. mark each memory object that can be reached from these pointers
3. scan each marked memory object for pointers to the heap
4. repeat steps 2 and 3 until no new pointers are found
5. sweep the heap and free up any memory that is not marked
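The steps above can be sketched in a few lines of Java. This is only a toy model under stated assumptions: the HeapObject and MarkSweep classes are hypothetical names invented for illustration, the "heap" is an ordinary list, and "freeing" an object just means removing it from that list. A real collector works on raw memory, not on Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy heap object (hypothetical): a mark bit plus its outgoing pointers.
class HeapObject {
    final String name;
    final List<HeapObject> refs = new ArrayList<>();
    boolean marked;

    HeapObject(String name) { this.name = name; }
}

class MarkSweep {
    // Steps 1 to 4: start from the roots and mark everything reachable.
    static void mark(List<HeapObject> roots) {
        Deque<HeapObject> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            HeapObject obj = pending.pop();
            if (obj.marked) continue;      // already visited
            obj.marked = true;
            pending.addAll(obj.refs);      // scan the object for more pointers
        }
    }

    // Step 5: free whatever is unmarked; clear marks for the next cycle.
    static int sweep(List<HeapObject> heap) {
        int freed = 0;
        for (Iterator<HeapObject> it = heap.iterator(); it.hasNext();) {
            HeapObject obj = it.next();
            if (obj.marked) {
                obj.marked = false;
            } else {
                it.remove();               // "free" the unreachable object
                freed++;
            }
        }
        return freed;
    }

    // One full stop-the-world collection; returns the number of objects freed.
    static int collect(List<HeapObject> roots, List<HeapObject> heap) {
        mark(roots);
        return sweep(heap);
    }
}
```

Note that nothing in this sketch bounds how long collect() takes: the marking loop runs until the whole reachable graph has been visited, which is exactly the stop-the-world behavior a real-time system has to avoid.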
A major problem with concurrent garbage collection is that while the collector is scanning the heap for pointers, the application is changing those same pointers. In essence, the entire heap is a critical section.
Suppose an application is manipulating a linked list of three objects, as illustrated in Figure 1. Object A points to object B, which points to object C. The garbage collector scans object A for pointers but has yet to scan objects B or C. The application then deletes object B by copying the pointer to C to object A.
Figure 1: Object C is lost if the application changes pointers between garbage-collection scans.
The collector hasn't scanned object C, and it will never scan object B because nothing points to B anymore. Since it already scanned object A, it won't find the new pointer to object C. When the collector completes scanning, it frees object C even though there's a live pointer to it in object A.
Interference between the application and garbage collection is the only situation you have to worry about. Other accesses to the heap don't affect garbage collection. To make concurrent garbage collection work, the application must tell the collector every time it writes an object pointer to another memory object. This is called a write barrier.
Write barriers sound expensive, but there are several ways to speed things up. Every allocated object on the heap has a flag word containing the state of that memory object. There are three possible states: black, gray, or white.
The collector knows a black-state object is live and has scanned it for pointers. The collector knows a gray-state object is live but has not yet scanned it for pointers. In the white state, the collector has not yet found a pointer to the object.
There are several variations on how a write barrier is implemented. Listing 1 is one example of a write barrier. The if statement is generated automatically by the compiler, so it's not necessary to put in write barriers by hand.
    object_a->next = object_c;
    if (object_c != NULL)
        if (object_c->garbage_flags == WHITE) {
            object_c->garbage_flags = GRAY;
            gc_make_gray(object_c);
        }

Listing 1: Since a compiler knows about every write to memory, it inserts a write barrier automatically.
Every pointer assignment to the heap incurs two additional tests. Usually, programs manipulate the same pointers multiple times. The garbage_flags update and the gc_make_gray call occur only the first time an object is seen.
Subsequent stores find the object already marked GRAY and don't have to call the collector. Making a WHITE object GRAY isn't an expensive process, often taking less than ten instructions.
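To see how the three states and the write barrier fit together, here is a small Java model of the scenario in Figure 1. It is a sketch under assumptions, not the code a compiler actually emits: the Color, Node, and TriColorCollector names are hypothetical, each object has a single pointer field, and the collector is stepped by hand with scanOne() instead of running in its own thread.

```java
import java.util.ArrayDeque;
import java.util.Deque;

enum Color { WHITE, GRAY, BLACK }

// Toy heap object (hypothetical): a collector state and one pointer field.
class Node {
    Color color = Color.WHITE;
    Node next;
}

class TriColorCollector {
    // Gray objects: known to be live but not yet scanned for pointers.
    private final Deque<Node> grayList = new ArrayDeque<>();
    int slowPathCalls = 0;   // how often the barrier had to notify the collector

    void makeGray(Node n) {
        n.color = Color.GRAY;
        grayList.push(n);
    }

    // Pointer store with the write barrier from Listing 1 attached.
    void writeNext(Node holder, Node target) {
        holder.next = target;
        if (target != null && target.color == Color.WHITE) {
            slowPathCalls++;
            makeGray(target);
        }
    }

    // Collector side: scan one gray object, then turn it black.
    boolean scanOne() {
        Node n = grayList.poll();
        if (n == null) return false;       // nothing left to scan
        if (n.next != null && n.next.color == Color.WHITE) makeGray(n.next);
        n.color = Color.BLACK;
        return true;
    }
}
```

If the collector has already scanned A (black) when the application calls writeNext(a, c), the barrier turns C gray, so the collector still scans it and never frees it. A second writeNext(a, c) finds C already gray and skips the call into the collector, which is why the barrier is cheap in practice.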
Once the garbage collector and the application are cooperating, it's possible to run the garbage collector as a separate thread of execution. The garbage collector's priority may have to be set differently depending on the characteristics of the application. The priority can be set low if the application spends a lot of time waiting for external events.
In fact, applications that are I/O bound or event driven may perform better with garbage collection than with explicit freeing of objects because the garbage collector can run while the processor is not otherwise busy and still keep up with the demands for new memory. However, that's not always the case.
If an application has a mix of event, I/O, and compute-bound processing threads, the priority of the garbage collector can be set lower than the event and I/O threads and at the same priority as the compute-bound threads. And, when all free memory is exhausted, the garbage collector's priority can change dynamically to take the priority of the thread that was trying to allocate memory.
The garbage collector can run until completion, and then the allocating thread can continue. It's also reasonable to have just two priorities for the garbage collector: one that is low for when free memory is plentiful and one that is higher for when free memory is exhausted or nearly exhausted.
Garbage collection isn't the only memory-management consideration in a real-time system. The allocation of memory must be fast and predictable. The system must allocate an object in nearly the same amount of time for every allocation.
Therefore, the memory manager can't maintain long linked lists of objects that must be searched every time memory is requested. The memory heap structure must be organized in a way that ensures predictability.
The Java language doesn't have pointers to physical memory or any other built-in method for accessing specific memory addresses. This limitation makes it difficult to write device drivers or any other code that needs to talk to physical devices entirely in Java. Additionally, there may be existing modules written in C++ that you want to keep.
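The article doesn't say how such a heap should be organized, so the following is only one common possibility, offered as an assumption: a pool of fixed-size blocks threaded onto a free list. Allocation and release each touch only the head of the list, so they take the same handful of operations no matter how full the pool is. The BlockPool class and its methods are hypothetical names, and blocks are represented by their index rather than by real memory.

```java
// Fixed-size block pool (hypothetical): constant-time allocate and free.
class BlockPool {
    private final int[] nextFree;  // nextFree[i] = next free block after i, or -1
    private int head;              // first free block, or -1 when the pool is empty

    BlockPool(int blockCount) {
        nextFree = new int[blockCount];
        // Initially every block is free and linked to its neighbor.
        for (int i = 0; i < blockCount; i++) {
            nextFree[i] = (i + 1 < blockCount) ? i + 1 : -1;
        }
        head = (blockCount > 0) ? 0 : -1;
    }

    // Pop the head of the free list; no searching, so the cost is constant.
    int allocate() {
        if (head < 0) return -1;   // pool exhausted
        int block = head;
        head = nextFree[block];
        return block;
    }

    // Push the block back onto the free list; also constant cost.
    void free(int block) {
        nextFree[block] = head;
        head = block;
    }
}
```

A system built this way would typically keep one such pool per block size, trading some wasted space inside each block for allocation times that don't depend on the heap's history.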
For this reason, the Java language specifies that certain methods may be declared native. Originally, native methods enabled high-performance functions to be performed in the native instruction set of the host machine without incurring the performance penalty of the virtual machine.
Native methods can interface between Java and C++ in a compiled environment as well. The native declaration tells the compiler that the method is externally defined, so you can write this external method in C or C++.
Besides allowing native methods to be written in C or C++, the Java compiler has a switch (-xj for the Microtec Java compiler) that tells the compiler to emit two files for every class that has native methods. The first file is a C++ header file called Class.h that contains a C++ definition of the Java class. The second file is a Class.cpp file that contains a stub C++ program for every native method.
To implement the native methods, just edit the .cpp file and add code to the method stubs (see Photo 1).
Photo 1: This screenshot demonstrates implementing the native method by editing the .cpp file.
Because accessing memory directly is such a common request, Microtec included a package of classes called COM.mentorg.microtec.Phys, which enables you to create objects that access memory directly. There are three classes, one for each size of memory.
These classes are PhysicalByte, PhysicalShort, and PhysicalInt. Each class contains the following methods, where size stands for Byte, Short, or Int in the constructor name and for the matching data type elsewhere:
Physicalsize(int address)
set(size value)
size get()
int getAddress()
setAddress(int address)
size and(size value)
size or(size value)
The constructor for each of these classes takes a single int argument: the memory address at which you want the data to reside. For example, PhysicalInt myInt = new PhysicalInt(0x0100000C); creates a PhysicalInt object associated with memory location 0x0100000C.
To set the data to something useful, use the set() method. For example, myInt.set(256); gives myInt a value of 256.
Later, when you need to retrieve the data in myInt, use the get() method. int newInt = myInt.get(); will make newInt contain the value (e.g., 256) of the data in myInt. If you need to find out the address of a variable in memory, try something like int myAddress = myInt.getAddress();.
You can change the address of myInt in memory with the setAddress() method. For instance, myInt.setAddress(0x0100000A); changes the address from whatever it was before to 0x0100000A.
If you want to perform a bitwise AND or OR with the data, the and() and or() methods work the same way:
newInt = myInt.or(31);
newInt = myInt.and(31);
Each method takes one argument, the number to AND or OR with the data already in place, and returns the result.
These classes can be subclassed, and any of the methods may be overridden to add functionality. For example, the and and or methods may need to be noninterruptible, in which case they can be overridden with methods that disable interrupts during their execution. Listing 2 shows how to use COM.mentorg.microtec.Phys.
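If you don't have the Microtec package at hand, you can still try out the calling pattern with a stand-in. The sketch below is not the real COM.mentorg.microtec.Phys code: SimulatedPhysicalInt and GuardedPhysicalInt are hypothetical classes that mimic the method list above but read and write a simulated memory map instead of hardware. Two further assumptions are worth flagging. First, and() and or() here return the combined value without writing it back, since the text only says they return the result. Second, a synchronized block stands in for disabling interrupts, which plain Java cannot do.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for PhysicalInt (hypothetical): backed by a fake memory map.
class SimulatedPhysicalInt {
    // Simulated address space shared by all instances; not real hardware.
    static final Map<Integer, Integer> MEMORY = new HashMap<>();

    private int address;

    SimulatedPhysicalInt(int address) { this.address = address; }

    void set(int value) { MEMORY.put(address, value); }
    int get() { return MEMORY.getOrDefault(address, 0); }
    int getAddress() { return address; }
    void setAddress(int address) { this.address = address; }

    // Assumption: combine with the stored data and return it, without storing.
    int and(int value) { return get() & value; }
    int or(int value) { return get() | value; }
}

// Subclass that overrides and/or so each runs as one uninterrupted unit.
class GuardedPhysicalInt extends SimulatedPhysicalInt {
    GuardedPhysicalInt(int address) { super(address); }

    @Override
    int and(int value) {
        synchronized (MEMORY) {   // placeholder for "disable interrupts"
            return super.and(value);
        }
    }

    @Override
    int or(int value) {
        synchronized (MEMORY) {   // placeholder for "disable interrupts"
            return super.or(value);
        }
    }
}
```

With this stand-in, the calls from the text behave as described: after new GuardedPhysicalInt(0x0100000C) and set(256), get() returns 256, or(31) returns 287, and and(31) returns 0.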
You've seen some of Java's advantages as a language for developing embedded systems, but you've also seen some of the nonobvious pitfalls of using a version of desktop Java in an embedded system.
Of the three options (special-purpose JVM, JVM with a JIT, or compiled Java), compiled Java probably has the best set of tradeoffs. It's clearly the winner in raw CPU speed, and it has memory usage similar to a conventional language like C or C++. Although compiled Java isn't portable and doesn't immediately offer the ability to load classes dynamically, for many embedded systems, these drawbacks aren't issues.
If portability or dynamic class loading is needed, the next-best alternative is probably a special-purpose JVM.
Although a JIT compiler seems to offer the most straightforward way to add performance to Java, it runs the risk of taking up too much memory and causing unacceptable pauses while compiling a just-loaded class.
These alternatives resolve the problems of using Java in embedded systems. So, all the benefits of Java as a language and as a source for reusable code are available to both embedded and desktop application developers.
Vladimir Ivanovic is the senior marketing engineer at Mentor Graphics' Microtec Division and specializes in Microtec's VRTX real-time operating and development systems and Java products. He has taught computer courses for more than 12 years at Northeastern University and at the University of California, Berkeley, extension. You may reach him at vladimir_ivanovic@mentorg.com.
Mike Mahar has worked in the software development industry for more than 20 years and has been involved in developing compilers, assemblers, debuggers, and integrated development environments. He has also worked on software development tools and contributed to the design of several RISC processors. You may reach him at mike_mahar@mentorg.com.