If you’re set on putting some desktop-Java functionality into an embedded system, chances are that you’ve had some sleepless nights. No more! That’s what Vladimir and Mike promise if you’ll consider the available alternatives.


Although Java has properties that would be useful for embedded-system design, the versions of Java used in desktop systems just aren’t suitable for embedded systems. There are some alternatives, but they do have drawbacks.

When weighing the alternatives, it’s important to consider issues like multithreading and debugging support. Whichever option emerges as the preferred form, any embedded Java programming environment must still address two key issues: how to provide determinism and how to interface to hardware.

JAVA IS GOOD

One of Java’s strengths is its reasonably clean syntax that is strongly reminiscent of C or C++. So although it’s a new language, it’s familiar. Getting up to speed with Java is easy.

More importantly, Java is both object-oriented and strongly typed. Apart from primitive types such as int, everything in Java is an object, and there are no loopholes to circumvent Java’s strong typing. Since the advent of C++, these features are considered essential in a programming language because they contribute enormously to the correctness of programs.

Anecdotal evidence bandied about in Java newsgroups and mailing lists suggests that developers take less time to produce a working Java program than a program in C or C++. Debugging is also easier because Java has removed prolific sources of hard-to-find bugs, including those related to the incorrect use of pointers.

Example bugs include memory leaks and memory access errors (wild pointers, referencing freed memory, returning a pointer to a local variable, etc.). Java doesn’t allow its pointer equivalent (i.e., object references) to be manipulated in the same way as pointers are in C or C++, and it provides automatic garbage collection.

Another strength of Java is its large reusable code base. In the standard distribution, Java supports threads, TCP/IP networking, and remote invocation. It even has a full set of classes for building GUIs.

Additional APIs support a variety of needs, such as database access, communication, multimedia, a way to use GUI components, and security.

With Java’s strengths as a language, a development environment, and a reusable code base, it’s easy to see why developers—and not just embedded-system developers—are eager to put it to use.

DESKTOP JAVA DRAWBACKS

Unfortunately, as mentioned, desktop Java has some drawbacks when used in embedded systems. Although Java was originally intended for use in set-top boxes, it was first used in a web browser, which is a desktop application.

First, desktop Java is too big for embedded applications. Not only must the entire Java virtual machine (JVM) be present, but a Java interpreter or a just-in-time (JIT) compiler must be present as well.

On top of that, all the standard classes must be present. These take up to 8 MB on disk, more when loaded. Fonts take even more space.

The bottom line is that desktop Java needs on the order of 16 MB just to run, and the application needs are additional. Very few embedded systems have that kind of memory available.

Also, Java is too slow. Sun’s first releases were usually more than 30× slower than equivalent C code. Subsequent releases, which use JIT compilers, are significantly faster but still perhaps 5× slower than equivalent C. If you’re used to squeezing the last few cycles out of a processor, this is a heavy penalty to pay just to use Java.

But, the most important drawback of desktop Java is that it doesn’t meet the constraints of most embedded systems. One such constraint is the requirement for real-time behavior (i.e., execution that’s both predictable and bounded in duration).

Many embedded systems have severe real-time requirements. For instance, the collision-detection system on a jetliner has seconds in which to respond. Computation must finish in a certain amount of time, so execution has to be predictable.

Another constraint of embedded systems is their limited resources. Consumer devices, which may be manufactured in the millions, are very sensitive to cost, so designers tend to use the smallest processor and the smallest amount of memory possible to do the job. A programming language that’s slow and uses up a lot of memory just isn’t competitive with existing alternatives.

More importantly, Java doesn’t possess the notion of an address. Embedded systems, almost by definition, are required to access hardware. Most often, that hardware is accessed by referring to a specific address. Because addresses aren’t part of Java, you have to go outside the language to overcome this constraint.

Finally, desktop Java has some attributes that get in the way of successful use in embedded systems. These attributes may be useful and even necessary in desktop systems, but not in embedded systems.

For instance, Java is interpreted (the source of much of its slowness) and it is dynamic because it supports the downloading of new classes on-the-fly. Java is portable across many different systems because its source code is compiled, not to native code, but to bytecodes, an architecture-neutral format. Also, Java supports a comprehensive security model designed to prevent many kinds of attacks.

However, for embedded systems, which frequently exist in completely closed environments, portability and security aren’t issues. Unless an embedded system is connected to a network, the ability to load new classes dynamically is useless.

These attributes of desktop Java prevent its use in embedded systems. And, the issues of performance, memory consumption, and poor real-time behavior make it hard to retarget the desktop version to an embedded system.

EMBEDDED ALTERNATIVES

What are the alternatives? How can an embedded-systems developer use the great features of Java without quadrupling the system’s cost or writing piles of non-Java code?

Essentially, there are only three options: use a special-purpose JVM, use a JVM with a JIT compiler, or use compiled Java instead of some form of interpreted Java.

Many vendors have come up with specially tailored versions of Java that are a better fit for the needs of embedded developers. For instance, Sun offers PersonalJava for systems with 2–4 MB of memory and EmbeddedJava for smaller systems (Mentor Graphics’ Microtec Division is a licensee of PersonalJava). Hewlett-Packard, NSI Com, Insignia Solutions, NewMonics, and others have similar offerings.

Another approach, even in versions tailored for embedded use, is to use a dynamic compilation technique, typically a JIT compiler, to increase performance. But, there are several tradeoffs involved.

First, JVMs with JITs have potentially longer start-up times because the JIT compiler has to compile Java bytecodes into native machine language before executing. Secondly, it’s difficult to do a good job of optimizing native code while keeping memory consumption low. The more optimizations that are done, the larger and slower the JIT compiler becomes.

Several vendors provide knobs to tune the dynamic compilation process so you can choose on a case-by-case basis exactly what the performance, memory consumption, and start-up-time tradeoffs are going to be.

But, for some embedded applications, even a JVM with dynamic compilation is too slow and takes up too much memory. One option that is increasingly being considered is compiling Java directly to a native machine language, thereby eliminating both the JVM and either the interpreter or the JIT compiler.

Of course, the resulting application is no longer portable, but embedded developers typically don’t care about portability. For a given design, their application needs to run on a single well-known hardware configuration.

The other attribute that compiled Java forces a developer to give up is the ability to load new classes on-the-fly. Because all the code is precompiled, there’s no facility for dynamic loading of classes. Again, this issue probably isn’t too serious for embedded-system developers, most of whom don’t want random classes downloaded onto their system.

If you’re willing to tolerate the lack of portability and the lack of dynamic class loading, you can still reap all the benefits of Java as a great language and keep the system small and fast—that is, if you can resolve the issues of determinism and low-level programming.

RESOLVING THE ISSUES

Any version of Java for embedded systems must first be deterministic and predictable. It also has to be able to access memory directly.

One bugaboo of embedded systems is ensuring real-time response. In the case of Java-based systems, the primary cause of nondeterminism is the garbage collector.

In desktop systems, it doesn’t matter much that the JVM stops for several seconds to collect unused memory. But in an embedded system, several seconds can be the difference between correct operation and the loss of human life.

The biggest threat to an embedded operation is that most garbage collectors work in what’s called stop-the-world mode. Usually, the collector is called only when an allocation fails because memory is exhausted. Therefore, allocation time is impossible to predict, and when the collector is running, no other processing is being done. This situation is unacceptable in a real-time system.

An obvious solution is to have the garbage collector run concurrently with the application so that the impact of garbage collection is spread around more evenly. This way, time-critical events are processed in a timely manner.

Ensuring real-time response still isn’t enough to make Java useful for developing embedded systems. Because the added value of embedded systems is their specialized hardware, the embedded software must always be able to access or control the hardware. Doing so requires an extension to Java, either through a Java Native Interface (JNI), for which there are several possible options, or through a nonstandard extension of the Java language.

Sun’s JNI permits portability across different JVMs on a particular processor architecture, but it suffers from poor efficiency. An earlier version of JNI was more efficient, but it required the Java code to know the layout of an object in memory. In any case, it makes sense for an embedded Java system to offer a package specifically for accessing physical memory.

REAL-TIME GARBAGE COLLECTION

The Java language requires garbage collection of unused objects and there’s no corresponding delete operator to go with the new operator. One advantage of garbage collection is that you can’t have bugs in your memory allocation if the responsibility for detecting unused memory and reallocating it is automatic. The application simply clears a reference to memory to make it available for future use.

However, the advantages of garbage collection come with a price. Finding unused memory and freeing it can take a long time, causing critical deadlines to be missed in a real-time environment. A garbage collector for a real-time system must be predictable and fast in addition to allowing high-priority threads to run. Unfortunately, most collectors fail on all three of these requirements.

Garbage collectors work in the opposite way to what their name implies. They find all the memory blocks that are in use and free up what’s left over. There are many different algorithms for garbage collection, but most of them share the following steps:

• scan the local and statically allocated variables for pointers to the heap

• mark each memory object that can be reached from these pointers

• scan each marked memory object for pointers to the heap

• repeat the previous two steps (marking and scanning) until no new pointers are found

• sweep the heap and free up any memory that is not marked
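The steps above can be sketched on a toy heap. This is only an illustration of the generic mark-and-sweep idea, not the collector of any particular product; the names HeapNode, MarkSweepDemo, and collect are made up for the example, and a plain Java list stands in for the heap.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy heap object (hypothetical): a mark bit plus outgoing references. */
class HeapNode {
    final String name;
    final List<HeapNode> refs = new ArrayList<>();
    boolean marked;
    HeapNode(String name) { this.name = name; }
}

class MarkSweepDemo {
    /** Marks everything reachable from the roots, frees the rest, returns the freed names. */
    static List<String> collect(List<HeapNode> roots, List<HeapNode> heap) {
        Deque<HeapNode> pending = new ArrayDeque<>();
        // Scan the roots (standing in for locals and statics) for pointers to the heap.
        for (HeapNode root : roots) {
            if (!root.marked) { root.marked = true; pending.push(root); }
        }
        // Mark each reachable object, scan it for more pointers, repeat until none are new.
        while (!pending.isEmpty()) {
            HeapNode node = pending.pop();
            for (HeapNode child : node.refs) {
                if (!child.marked) { child.marked = true; pending.push(child); }
            }
        }
        // Sweep: anything left unmarked is free; clear marks on survivors for the next cycle.
        List<String> freed = new ArrayList<>();
        heap.removeIf(node -> {
            if (node.marked) { node.marked = false; return false; }
            freed.add(node.name);
            return true;
        });
        return freed;
    }

    /** A -> B -> C is reachable from root A; D is not referenced by anything. */
    static List<String> demo() {
        HeapNode a = new HeapNode("A"), b = new HeapNode("B");
        HeapNode c = new HeapNode("C"), d = new HeapNode("D");
        a.refs.add(b);
        b.refs.add(c);
        List<HeapNode> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        return collect(Arrays.asList(a), heap);
    }

    public static void main(String[] args) {
        System.out.println("freed: " + demo()); // prints "freed: [D]"
    }
}
```

Note that the whole routine runs to completion in one call, which is exactly the stop-the-world behavior a real-time system cannot tolerate.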

MOVING POINTER PROBLEM

A major problem with concurrent garbage collection is that while the collector is scanning the heap for pointers, the application is changing those same pointers. In essence, the entire heap is a critical section.

Suppose an application is manipulating a linked list of three objects, as illustrated in Figure 1. Object A points to object B, which points to object C. The garbage collector scans object A for pointers but has yet to scan objects B or C. The application then deletes object B by copying the pointer to C to object A.

Figure 1—Object C is lost if the application changes pointers between garbage-collection scans.

The collector has already scanned object A, so it never sees the new pointer to object C stored there, and it hasn’t scanned object C itself. If it finds no pointer to object C when it scans object B, it completes scanning with object C unmarked and frees it even though there’s a live pointer to it in object A.
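The interleaving in Figure 1 can be reproduced in a small simulation. The sketch below is hypothetical (ListCell, LostObjectDemo, and scanOne are illustrative names) and makes one assumption that the text leaves open: when the application unlinks object B, it also clears B's own link, so the collector cannot reach C through B.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy list cell (hypothetical) with a single outgoing pointer and a mark bit. */
class ListCell {
    final String name;
    ListCell next;
    boolean marked;
    ListCell(String name) { this.name = name; }
}

class LostObjectDemo {
    /** One collector step: scan a single queued cell and mark what it points to. */
    static void scanOne(Deque<ListCell> toScan) {
        ListCell cell = toScan.pop();
        if (cell.next != null && !cell.next.marked) {
            cell.next.marked = true;
            toScan.push(cell.next);
        }
    }

    /** Returns true if C ends up unmarked although A still points to it. */
    static boolean cIsLost() {
        ListCell a = new ListCell("A"), b = new ListCell("B"), c = new ListCell("C");
        a.next = b;
        b.next = c;

        Deque<ListCell> toScan = new ArrayDeque<>();
        a.marked = true;              // A is reachable from a root
        toScan.push(a);
        scanOne(toScan);              // collector scans A, marks B, then is preempted

        a.next = c;                   // application unlinks B, with no write barrier
        b.next = null;                // ASSUMPTION: B's link is cleared as well

        while (!toScan.isEmpty()) {   // collector resumes and finishes
            scanOne(toScan);
        }
        return !c.marked && a.next == c;
    }

    public static void main(String[] args) {
        System.out.println("C lost: " + cIsLost()); // prints "C lost: true"
    }
}
```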

APPLICATION AND INTERFERENCE

Interference between the application and garbage collection is the only situation you have to worry about. Other accesses to the heap don’t affect garbage collection. To make concurrent garbage collection work, the application must tell the collector every time it writes an object pointer to another memory object. This is called a write barrier.

Write barriers sound expensive, but there are several ways to speed things up. Every allocated object on the heap has a flag word containing the state of that memory object. There are three possible states: black, gray, or white.

The collector knows a black-state object is live and has scanned it for pointers. The collector knows a gray-state object is live but has not yet scanned it for pointers. In the white state, the collector has not yet found a pointer to the object.

There are several variations on how a write barrier is implemented. Listing 1 is one example of a write barrier. The if statement is generated automatically by the compiler, so it’s not necessary to put in write barriers by hand.

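    object_a->next = object_c;              /* the pointer store itself */
    if (object_c != NULL) {                 /* barrier emitted by the compiler */
        if (object_c->garbage_flags == WHITE) {
            object_c->garbage_flags = GRAY;
            gc_make_gray(object_c);         /* hand the object to the collector */
        }
    }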

Listing 1—Since a compiler knows about every write to memory, it inserts a write barrier automatically.

Every pointer assignment to the heap has two additional tests. Usually, programs manipulate the same pointers multiple times. The garbage_flags update and the gc_make_gray function call occur only the first time an object is seen.

Subsequent stores find the object already marked GRAY and don’t have to call the collector. Making a WHITE object GRAY isn’t an expensive process, often taking less than ten instructions.
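To see the barrier at work, the same three-object scenario can be replayed with every pointer store routed through a helper that grays white targets. This is a simulation under assumptions, not a compiler-generated barrier: GcColor, GcObject, writeNext, and makeGray are hypothetical names, and the application is again assumed to clear B's link when it unlinks B.

```java
import java.util.ArrayDeque;
import java.util.Deque;

enum GcColor { WHITE, GRAY, BLACK }

/** Toy heap object (hypothetical) carrying the three-state flag described in the text. */
class GcObject {
    final String name;
    GcObject next;
    GcColor color = GcColor.WHITE;
    GcObject(String name) { this.name = name; }
}

class WriteBarrierDemo {
    static final Deque<GcObject> grayList = new ArrayDeque<>();

    /** WHITE -> GRAY transition; does nothing for null, GRAY, or BLACK objects. */
    static void makeGray(GcObject o) {
        if (o != null && o.color == GcColor.WHITE) {
            o.color = GcColor.GRAY;
            grayList.push(o);
        }
    }

    /** Pointer store followed by the barrier check, mirroring the shape of Listing 1. */
    static void writeNext(GcObject holder, GcObject target) {
        holder.next = target;
        makeGray(target);
    }

    /** One collector step: scan a gray object, gray its target, then blacken it. */
    static void scanOne() {
        GcObject o = grayList.pop();
        makeGray(o.next);
        o.color = GcColor.BLACK;
    }

    static GcColor colorOfCAfterCollection() {
        grayList.clear();
        GcObject a = new GcObject("A"), b = new GcObject("B"), c = new GcObject("C");
        a.next = b;                 // list built before the collection cycle starts
        b.next = c;

        makeGray(a);                // A is reachable from a root
        scanOne();                  // collector scans A and grays B, then is preempted

        writeNext(a, c);            // application unlinks B; the barrier grays C
        b.next = null;              // ASSUMPTION: B's link is cleared as well

        while (!grayList.isEmpty()) {
            scanOne();              // collector finishes; C is scanned and kept
        }
        return c.color;
    }

    public static void main(String[] args) {
        System.out.println("C is " + colorOfCAfterCollection()); // prints "C is BLACK"
    }
}
```

In this run, object B is also kept even though it has become garbage, because it was marked before it was unlinked. It is simply reclaimed on a later cycle.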

Once the garbage collector and the application are cooperating, it’s possible to run the garbage collector as a separate thread of execution. The garbage collector’s priority may have to be set differently depending on the characteristics of the application. The priority can be set low if the application spends a lot of time waiting for external events.

In fact, applications that are I/O bound or event driven may perform better with garbage collection than with explicit freeing of objects, because the garbage collector can run while the processor is not otherwise busy and still keep up with the demands for new memory. However, that’s not always the case.

If an application has a mix of event, I/O, and compute-bound processing threads, the priority of the garbage collector can be set lower than the event and I/O threads and at the same priority as the compute-bound threads. And, when all free memory is exhausted, the garbage collector’s priority can change dynamically to take the priority of the thread that was trying to allocate memory.

The garbage collector can run until completion, and then the allocating thread can continue. It’s also reasonable to have just two priorities for the garbage collector: one that is low for when free memory is plentiful and one that is higher for when free memory is exhausted or nearly exhausted.
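One way to picture these priority schemes is as a small policy function. The sketch below combines the two ideas from the text (a low and a high level, plus taking over the allocating thread's priority on exhaustion). The class name, the 10% threshold, and the specific priority levels are all assumptions chosen for illustration, using the standard java.lang.Thread priority constants.

```java
class CollectorPriorityPolicy {
    /** ASSUMPTION: below this fraction of free heap, memory counts as nearly exhausted. */
    static final double LOW_WATER = 0.10;

    /**
     * Picks a priority for the collector thread.
     * freeBytes and totalBytes describe the heap; allocatorPriority is the priority
     * of the thread whose allocation just failed (only used when memory is exhausted).
     */
    static int choosePriority(long freeBytes, long totalBytes, int allocatorPriority) {
        if (freeBytes <= 0 || totalBytes <= 0) {
            // Memory exhausted: run at the priority of the thread that needs the memory.
            return allocatorPriority;
        }
        if ((double) freeBytes / totalBytes < LOW_WATER) {
            // Nearly exhausted: run above ordinary compute-bound work.
            return Thread.NORM_PRIORITY + 1;
        }
        // Plenty of memory: collect only when nothing else wants the processor.
        return Thread.MIN_PRIORITY;
    }

    public static void main(String[] args) {
        Thread collector = new Thread(() -> { /* collector loop would go here */ });
        collector.setPriority(choosePriority(50, 1000, 7));
        System.out.println("collector priority: " + collector.getPriority());
    }
}
```

On a real system, the threshold and levels would be tuned to the application's mix of event, I/O, and compute-bound threads.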

MEMORY ALLOCATION

Garbage collection isn’t the only memory-management consideration in a real-time system. The allocation of memory must be fast and predictable. The system must allocate an object in nearly the same amount of time for every allocation.

Therefore, the memory manager can’t maintain long linked lists of objects that must be searched every time memory is requested. The memory heap structure must be organized in a way that ensures predictability.
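The text doesn't say which heap organization to use. One common way to get constant-time allocation is a pool of fixed-size blocks threaded onto a free list, so that both allocating and freeing touch only the head of the list. The sketch below shows that idea with hypothetical names (FixedBlockPool, allocate, free) and block indices in place of real addresses. It is an illustration of the principle, not the layout of any particular product's heap.

```java
/** Pool of equal-sized blocks; allocate and free each take a fixed number of steps. */
class FixedBlockPool {
    private final int[] nextFree; // nextFree[i] = index of the next free block, or -1
    private int head;             // first free block, or -1 when the pool is exhausted

    FixedBlockPool(int blockCount) {
        if (blockCount <= 0) {
            throw new IllegalArgumentException("blockCount must be positive");
        }
        nextFree = new int[blockCount];
        for (int i = 0; i < blockCount; i++) {
            nextFree[i] = i + 1;          // chain every block to its neighbor
        }
        nextFree[blockCount - 1] = -1;    // last block ends the chain
        head = 0;
    }

    /** Returns the index of a free block, or -1 if none is left. No searching. */
    int allocate() {
        if (head == -1) {
            return -1;
        }
        int block = head;
        head = nextFree[block];
        return block;
    }

    /** Returns a block to the pool by pushing it onto the front of the free list. */
    void free(int block) {
        nextFree[block] = head;
        head = block;
    }

    public static void main(String[] args) {
        FixedBlockPool pool = new FixedBlockPool(2);
        System.out.println(pool.allocate()); // prints 0
        System.out.println(pool.allocate()); // prints 1
        System.out.println(pool.allocate()); // prints -1 (exhausted)
    }
}
```

A practical allocator would typically keep several such pools, one per block size, at the cost of some wasted space inside each block.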

INTERFACING HARDWARE

The Java language doesn’t have pointers to physical memory or any other built-in method for accessing specific memory addresses. This limitation makes it difficult to write device drivers or any other code that needs to talk to physical devices entirely in Java. Additionally, there may be existing modules written in C++ that you want to keep.

For this reason, the Java language specifies that certain methods may be declared native. Originally, native methods enabled high-performance functions to be performed in the native instruction set of the host machine without incurring the performance penalty of the virtual machine.

Native methods can interface between Java and C++ in a compiled environment as well. The native declaration tells the compiler that the method is externally defined, so you can write this external method in C or C++.
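On the Java side, a native method is just a declaration with the native modifier and no body. The class and method names below (StatusPort, readStatus) are hypothetical; only the native keyword and the reflection calls are standard Java.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

/** Hypothetical device wrapper: the method body lives in C or C++, not in Java. */
class StatusPort {
    /** Declared here, defined externally. */
    native int readStatus();
}

class NativeDeclDemo {
    /** Uses reflection to confirm that readStatus carries the native modifier. */
    static boolean isDeclaredNative() {
        try {
            Method m = StatusPort.class.getDeclaredMethod("readStatus");
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("readStatus is native: " + isDeclaredNative());
    }
}
```

The class compiles and loads without the external code. On a standard JVM, actually calling readStatus() with no native implementation linked in fails with an UnsatisfiedLinkError.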

Besides allowing native methods to be written in C or C++, the Java compiler has a switch (-xj for the Microtec Java compiler) that tells the compiler to emit two files for every class that has native methods. The first file is a C++ header file called Class.h that contains a C++ definition of the Java class. The second file is a Class.cpp file that contains a stub C++ program for every native method.

To implement the native methods, just edit the .cpp file and add code to the method stubs (see Photo 1).

Photo 1—This screenshot demonstrates implementing the native method by editing the .cpp file.

PHYSICAL MEMORY PACKAGE

Because accessing memory directly is such a common request, Microtec included a package of classes called COM.mentorg.microtec.Phys, which enables you to create objects that access memory directly. There are three classes, one for each size of memory.

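These classes are PhysicalByte, PhysicalShort, and PhysicalInt, and each class contains the following methods (size stands for Byte, Short, or Int in the constructor name and for the corresponding data type elsewhere):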

• Physicalsize(int address)

• set(size value)

• size get()

• int getAddress()

• setAddress(int address)

• size and (size value)

• size or (size value)

The constructor for each of these classes takes one int argument: the memory address at which you want the data to reside. For example, PhysicalInt myInt = new PhysicalInt(0x0100000C); creates a PhysicalInt object whose data resides at memory location 0x0100000C.

To set the data to something useful, use the set() method. For example, myInt.set(256); gives myInt a value of 256.

Later, when you need to retrieve the data in myInt, use the get() method. int newInt = myInt.get(); will make newInt contain the value (e.g., 256) of the data in myInt. If you need to find out the address of a variable in memory, try something like int myAddress = myInt.getAddress();.

You can change the address of myInt in memory with the setAddress() method. For instance, myInt.setAddress(0x0100000A); changes the address from whatever it was before to 0x0100000A.

To perform a bitwise AND or OR on the data, use the and() and or() methods, which work the same way:

newInt = myInt.or(31);

newInt = myInt.and(31);

Each method takes one argument, the number to AND or OR with the data already in place, and returns the result.

These classes can be subclassed and any of the methods may be overridden to add additional functionality. For example, the and and or methods may need to be noninterruptible so they can be overridden with a method that disables interrupts during their execution. Listing 2 shows how to use COM.mentorg.microtec.Phys.
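For readers without the Microtec package, the behavior described in this section can be approximated with a stand-in class backed by a simulated memory map. Everything in the sketch is an assumption made for illustration: SimPhysicalInt and AtomicSimPhysicalInt are hypothetical names, a HashMap plays the role of physical memory, and() and or() are assumed to write the result back as well as return it, and a shared lock merely stands in for disabling interrupts. It is not the implementation of COM.mentorg.microtec.Phys.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-in for a PhysicalInt-style class; "memory" is a map from address to value. */
class SimPhysicalInt {
    static final Map<Integer, Integer> MEMORY = new HashMap<>();
    private int address;

    SimPhysicalInt(int address) { this.address = address; }

    void set(int value) { MEMORY.put(address, value); }
    int get() { return MEMORY.getOrDefault(address, 0); }
    int getAddress() { return address; }
    void setAddress(int address) { this.address = address; }

    // ASSUMPTION: and/or are read-modify-write operations that also return the result.
    int and(int value) { int result = get() & value; set(result); return result; }
    int or(int value)  { int result = get() | value; set(result); return result; }
}

/** Subclass that overrides and/or so the read-modify-write cannot be interleaved. */
class AtomicSimPhysicalInt extends SimPhysicalInt {
    // ASSUMPTION: a lock stands in for disabling interrupts on real hardware.
    private static final Object LOCK = new Object();

    AtomicSimPhysicalInt(int address) { super(address); }

    @Override int and(int value) { synchronized (LOCK) { return super.and(value); } }
    @Override int or(int value)  { synchronized (LOCK) { return super.or(value); } }
}

class PhysDemo {
    static int demo() {
        SimPhysicalInt reg = new AtomicSimPhysicalInt(0x0100000C);
        reg.set(256);
        reg.or(31);              // 256 | 31 = 287
        return reg.and(0xFF);    // 287 & 255 = 31
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 31
    }
}
```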

RECOMMENDATIONS

You’ve seen some of Java’s advantages as a language for developing embedded systems, but you’ve also seen some of the nonobvious pitfalls of using a version of desktop Java in an embedded system.

Of the three options (special-purpose JVM, JVM with a JIT, or compiled Java), compiled Java probably has the best set of tradeoffs. It’s clearly the winner in raw CPU speed, and it has memory usage similar to a conventional language like C or C++. Although compiled Java isn’t portable and doesn’t immediately offer the ability to load classes dynamically, for many embedded systems, these drawbacks aren’t issues.

If portability or dynamic class loading is needed, the next-best alternative is probably a special-purpose JVM.

Although a JIT compiler seems to offer the most straightforward way to add performance to Java, it runs the risk of taking up too much memory and causing unacceptable pauses while compiling a just-loaded class.

These alternatives resolve the problems of using Java in embedded systems. So, all the benefits of Java as a language and as a source for reusable code are available to both embedded and desktop application developers.

Vladimir Ivanovic is the senior marketing engineer at Mentor Graphics’ Microtec Division and specializes in Microtec’s VRTX real-time operating and development systems and Java products. He has taught computer courses for more than 12 years at Northeastern University and at the University of California, Berkeley, extension. You may reach him at vladimir_ivanovic@mentorg.com.

Mike Mahar has worked in the software development industry for more than 20 years and has been involved in developing compilers, assemblers, debuggers, and integrated development environments. He has also worked on software development tools and contributed to the design of several RISC processors. You may reach him at mike_mahar@mentorg.com.


SOURCES

Mentor Graphics Microtec Division
(800) 950-5554
(408) 487-7000
Fax: (408) 487-7001
www.mentorg.com/microtec/java

Hewlett-Packard
(800) 452-4844
(650) 857-1501
www.hpconnect.com/embeddedvm

Insignia Solutions
(800) 848-7677
(510) 360-3700
Fax: (510) 360-3701
www.insignia.com/embedded

NewMonics
(515) 296-0897
Fax: (515) 296-4595
www.newmonics.com

NSI Com
(212) 717-9615
Fax: (212) 734-4079
www.nsicom.com

EmbeddedJava, Java Native Interface, PersonalJava
Sun Microsystems
(650) 960-1300
www.java.sun.com/products/embeddedjava
www.java.sun.com/products/jdk/1.1/docs/guide/jni
www.java.sun.com/products/personaljava