
I was looking at the gdb disassembly of a program

...
0x8048085: cmp    eax,ecx
0x8048087: je     0x804809f
0x8048089: mov    DWORD PTR [esp+0x4],0x21
0x8048091: mov    DWORD PTR [esp],0x8048160
0x8048098: jmp    0x8048157
0x804809d: mov    esi,0x115e8ba
0x80480a2: add    BYTE PTR [eax],al
...

The second instruction jumps to address 0x804809f, which falls in the middle of a later instruction (the mov at 0x804809d). I understand that anything is possible in theory, since the processor does not care as long as it is given an address to execute from, but still... Could someone explain? Thanks.

Update: OK, it seems there is a problem with the addresses. But this snippet is part of a larger listing (about 200 lines). I looked at every "call 0x..." and replaced the targets with labels, and the code looks like this:

func1:
   ...
   asm
   ...
   call func2
   ...
   ret
func2:
   ...
   asm
   ...
   ret
...

So as much as I would like to accept that the disassembler got tripped up on the addresses at some point, that does not fit with the fact that for every "call 0x...", the instruction just before the target address 0x... is a "ret". If the addresses were offset somewhere, this would not be true.

Thomas
  • The instruction at address A2 is junk as well. Your disassembler is just being tripped up by the two extra junk bytes after the JMP. Why they are there is unguessable; padding bytes are definitely not unusual, but not for an unaligned branch target address. – Hans Passant Oct 27 '13 at 15:20

3 Answers


My first guess is that some padding data was inserted at 0x804809d, meaning that the disassembly of that section should have started at 0x804809f.

Based on the addresses listed, it is also possible that the disassembly got some offset wrong earlier in the process.
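To make the first guess concrete, here is a minimal sketch in Python that decodes the same bytes from both starting points. The raw bytes are not shown in the question; they are inferred here from the listing using the standard IA-32 encodings (BE imm32 for mov esi,imm32 and 00 00 for add BYTE PTR [eax],al), so treat them as an assumption. The decoder is hypothetical and handles only the three opcodes needed for this snippet.

```python
import struct

BASE = 0x804809d  # address of the first byte after the 5-byte jmp
# Assumed bytes, inferred from the listing (not shown in the question):
#   mov esi,0x115e8ba      -> BE BA E8 15 01
#   add BYTE PTR [eax],al  -> 00 00
CODE = bytes([0xBE, 0xBA, 0xE8, 0x15, 0x01, 0x00, 0x00])


def decode_one(code, offset, base):
    """Decode one instruction; returns (length, text). Sketch only."""
    addr = base + offset
    op = code[offset]
    if op == 0xBE and offset + 5 <= len(code):  # mov esi, imm32
        (imm,) = struct.unpack_from("<I", code, offset + 1)
        return 5, f"{addr:#x}: mov esi,{imm:#x}"
    if op == 0xE8 and offset + 5 <= len(code):  # call rel32
        (rel,) = struct.unpack_from("<i", code, offset + 1)
        # The target is relative to the address of the next instruction.
        return 5, f"{addr:#x}: call {addr + 5 + rel:#x}"
    if code[offset:offset + 2] == b"\x00\x00":  # add BYTE PTR [eax],al
        return 2, f"{addr:#x}: add BYTE PTR [eax],al"
    return 1, f"{addr:#x}: (byte {op:#04x} not handled by this sketch)"


def sweep(code, start, base=BASE):
    """Linear sweep: decode instructions back to back from `start`."""
    lines = []
    offset = start
    while offset < len(code):
        size, text = decode_one(code, offset, base)
        lines.append(text)
        offset += size
    return lines


from_9d = sweep(CODE, 0)  # where gdb started decoding
from_9f = sweep(CODE, 2)  # where the je actually lands
print("\n".join(from_9d))
print("\n".join(from_9f))
```

Under that assumption, starting at 0x804809d reproduces the two lines from the listing, while starting at 0x804809f reads the same bytes as a single call instruction. That would be consistent with the two bytes at 0x804809d and 0x804809e being data that the disassembler swallowed as the start of an instruction.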

TAS
  • What do you mean by padding data? – Thomas Oct 28 '13 at 12:39
  • To make jumps as fast as possible, the compiler may insert padding so that jump targets fall on aligned addresses, for example multiples of 8. – TAS Oct 28 '13 at 19:41
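As a quick check of the alignment idea against the addresses in the question, here is a small Python sketch. The helper name padding_to_align is hypothetical, and the alignment of 8 is simply the value mentioned in the comment above, not something known about the compiler that produced this binary.

```python
def padding_to_align(addr, alignment=8):
    """Number of bytes needed to reach the next multiple of `alignment`."""
    return -addr % alignment


after_jmp = 0x804809d  # first byte after the 5-byte jmp at 0x8048098
target = 0x804809f     # where the je at 0x8048087 lands

print(padding_to_align(after_jmp))  # bytes needed to reach 0x80480a0
print(target % 8)                   # non-zero means the target is not 8-aligned
```

Padding up to an 8-byte boundary would take 3 bytes and end at 0x80480a0, whereas the actual target sits only 2 bytes after the jmp and is not a multiple of 8. So alignment padding alone does not seem to account for these two bytes, which matches the remark in the comment on the question.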

Yes, this is possible, though the alignment issue is worth mentioning here: Why should code be aligned to even-address boundaries on x86?

I see several possible reasons why this could have been done:

  • self-modifying code (the actual meaningful instructions are written to that address after the program starts)
  • interpreting one instruction's data as an opcode (something like this)
  • simply a wrong jump address (might even be a compiler error, but not necessarily)
vines
  • I don't know if it answers my question (and I don't know if it has an answer) but some hints and links here are very good, just wanted to +1 those – Thomas Oct 28 '13 at 19:36

Disassembling variable-length instruction sets is tricky. My guess is that either you are not looking at real code or the disassembler is having trouble.

old_timer
  • Are there better tools than gdb's disas command? I used objdump on my program and I think I got the same result. Why is disassembly so complicated anyway? There has to be just one way to process the code, no? – Thomas Oct 28 '13 at 19:26
  • There is a difference between executing code and disassembling it. To properly disassemble a variable-length instruction set, you would follow the code in execution order, but that can still fail (long story). Also, how gdb/objdump work depends on which tools you used to create the file, and you can trip those up as well. It is an extremely difficult problem to solve, especially with x86, which is why the best disassemblers are far from perfect. – old_timer Oct 28 '13 at 20:16