July 2004, Issue 168
Test Your EQ: Answer 5

Short of actually simulating the full execution of the code,
an intermediate approach works well. The disassembler
keeps a table of “entry points” to executable code. Disassembly
starts out by seeding this table with the known hardware
entry points of the processor in question (reset and interrupt
vectors). As the disassembly process proceeds, it adds
additional entries every time it encounters any sort of
branch or call instruction. Whenever it gets to an unconditional
jump or return, it abandons the current sequence of instructions
and starts on another unexplored entry point from the
table. Eventually, it finds all of the reachable code
in the system.
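
The entry-point table described above can be sketched as a simple work list. This is only an illustration under stated assumptions, not the contributor's actual tool: the toy instruction set (NOP, BR, JMP, CALL, RET), the one-instruction-per-address memory layout, and the name find_reachable are all hypothetical.

```python
from collections import deque

# Hypothetical toy machine: every address holds one (mnemonic, operand) pair.
# BR is a conditional branch, JMP an unconditional jump.
memory = {
    0: ("CALL", 4),
    1: ("BR", 6),
    2: ("JMP", 0),
    3: ("NOP", None),   # never reached: could be data
    4: ("NOP", None),
    5: ("RET", None),
    6: ("RET", None),
    7: ("NOP", None),   # never reached
}

def find_reachable(memory, seeds):
    """Return (visited addresses, entry points) starting from the seed vectors."""
    entry_points = set(seeds)   # the table of known entry points
    pending = deque(seeds)      # entry points not yet explored
    visited = set()
    while pending:
        pc = pending.popleft()
        # Follow one straight-line sequence until it ends or rejoins known code.
        while pc in memory and pc not in visited:
            visited.add(pc)
            op, target = memory[pc]
            # Any branch, jump, or call contributes a new entry point.
            if op in ("BR", "JMP", "CALL") and target not in entry_points:
                entry_points.add(target)
                pending.append(target)
            # An unconditional jump or return ends the current sequence.
            if op in ("JMP", "RET"):
                break
            pc += 1
    return visited, entry_points

# Seed the table with the (assumed) reset vector at address 0.
visited, entry_points = find_reachable(memory, [0])
```

In this toy program, addresses 3 and 7 are never visited, so a disassembler built this way would treat them as data rather than decode them as instructions.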

One nice side effect of this is that the output listing can
contain generated labels for all of the entry points.
Granted, these will be generic labels of the form “L0001:,”
but it’s easy to replace these with meaningful names after
you figure out what the code actually does.
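
As a rough sketch of the label generation, assuming the same kind of hypothetical toy instruction format (one (mnemonic, operand) pair per address), the entry-point table can be turned into generic labels that are later renamed by hand. The names make_labels and format_listing, and the replacement name "delay", are made up for illustration.

```python
def make_labels(entry_points):
    """Assign generic labels L0001, L0002, ... in address order."""
    return {addr: f"L{i:04d}"
            for i, addr in enumerate(sorted(entry_points), start=1)}

def format_listing(memory, labels):
    """Emit one line per instruction, with a label line before each entry point."""
    lines = []
    for addr in sorted(memory):
        op, target = memory[addr]
        if addr in labels:
            lines.append(f"{labels[addr]}:")
        # Show the label for a known target, otherwise the raw address.
        operand = labels.get(target, f"{target:#06x}") if target is not None else ""
        lines.append(f"    {op} {operand}".rstrip())
    return lines

# Hypothetical three-instruction program with entry points at 0 and 4.
memory = {0: ("CALL", 4), 1: ("JMP", 0), 4: ("RET", None)}
labels = make_labels({0, 4})
generic = dict(labels)      # keep a copy of the generated names
labels[4] = "delay"         # rename once the routine's purpose is understood
listing = format_listing(memory, labels)
```

Because every reference goes through the label table, renaming one entry updates both the label line and every instruction that targets it.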

The only things this approach can't handle are jumps or calls
in which the destination address is computed
or looked up in a table. These tend to be rare enough
that it’s easy to identify them and add the necessary
entries to the entry point table manually.
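
One way to make the manual step easier, sketched here under assumptions, is to have the tool report every computed jump or call it reached, so that the missing targets can be looked up and added to the seed list for the next run. The mnemonics JMPI and CALLI and the function find_indirect_sites are hypothetical.

```python
# Hypothetical mnemonics for a computed (indirect) jump and call.
INDIRECT_OPS = {"JMPI", "CALLI"}

def find_indirect_sites(memory, visited):
    """List reached addresses whose destination cannot be read from the instruction."""
    return sorted(addr for addr in visited if memory[addr][0] in INDIRECT_OPS)

# Toy program: address 0 jumps through a table, so 8 and 9 are not discovered.
memory = {
    0: ("JMPI", None),
    8: ("RET", None),
    9: ("RET", None),
}
visited = {0}                    # what the automatic pass reached (assumed)
sites = find_indirect_sites(memory, visited)

# After inspecting the jump table by hand, add its targets as extra seeds.
hardware_vectors = [0]
manual_entries = [8, 9]          # assumed targets, found by reading the table
seeds = hardware_vectors + manual_entries
```

Rerunning the disassembly with the extended seed list then picks up the code behind each table entry.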

Contributor: David Tweed