I am trying to understand the assembly code generated from a C program. Here is the program in C:

int* a = &argc;
int b = 8;
a = &b;

Here is the assembly code generated with explanations. There is one part that I do not understand:

Prologue of main:

leal    4(%esp), %ecx
andl    $-16, %esp
pushl   -4(%ecx)
pushl   %ebp
movl    %esp, %ebp
pushl   %ecx
subl    $36, %esp

Load the address of argc in %eax:

movl    %ecx, %eax

The part I do not get:

movl    4(%eax), %edx
movl    %edx, -28(%ebp)

Stack-Smashing Protector code (setup):

movl    %gs:20, %ecx
movl    %ecx, -12(%ebp)
xorl    %ecx, %ecx

Store the initial values of a and b (see main.c):

movl    %eax, -16(%ebp)
movl    $8, -20(%ebp)

Modify the value of a (a = &b):

leal    -20(%ebp), %eax
movl    %eax, -16(%ebp)

Stack-Smashing Protector code (verify the stack is ok):

movl    $0, %eax
movl    -12(%ebp), %edx
xorl    %gs:20, %edx
je  .L7
call    __stack_chk_fail

If the stack is OK:

.L7:
    addl    $36, %esp
    popl    %ecx
    popl    %ebp
    leal    -4(%ecx), %esp
    ret

So the part I do not understand is the store to -28(%ebp), an address that is never used afterwards. Does anyone know why this part is generated?

DJ_Joe
  • If you compile without optimizations (the default) then the compiler output will contain all the loads/stores to memory even if they aren't needed. Take your code and put it in a function that isn't `main`, and compile with optimizations on (`-O3`) – Michael Petch Nov 21 '17 at 19:55
  • Don't use main(); give the code under test some other function name. The main() question has been asked and answered a zillion times... – old_timer Nov 21 '17 at 19:57
  • You told the compiler to turn off its brain by not specifying an optimization flag and now you wonder that it generates strange code? – fuz Nov 21 '17 at 20:00
  • If I optimize with -O1, the only thing that remains in the main is mov $0, %eax and then ret – DJ_Joe Nov 21 '17 at 20:01
  • @DJ_Joe And damn right so. Use `volatile` if you want to force the compiler to emit useless code. Or write test cases that actually do some meaningful work the compiler cannot simply remove. – fuz Nov 21 '17 at 20:03
  • @old_timer I would normally agree, but since OP specifically uses the variable `argc` it makes sense. – klutt Nov 21 '17 at 20:20
  • I guess I am reading the question a different way, peace... – old_timer Nov 21 '17 at 20:38
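
A minimal sketch of what the comments above suggest: the same pointer assignments moved out of `main` into a function whose input comes from the caller and whose result is returned, so the body is not dead code. The name `pick_local` and its parameter are hypothetical, chosen only for illustration. Compile it to an object file with optimizations on and disassemble it the same way as any other file.

```c
/* Hypothetical test function, not part of the original question.
   It mirrors the three statements from main, but reads through both
   pointers and returns the result, so an optimizing compiler cannot
   simply discard the whole body. */
int pick_local(int *arg)
{
    int *a = arg;      /* mirrors: int* a = &argc; */
    int b = 8;         /* mirrors: int b = 8;      */
    a = &b;            /* mirrors: a = &b;         */
    return *a + *arg;  /* uses b (through a) and the caller's value */
}
```

Because the function has no `main`-specific prologue, the stack realignment seen in the question should not appear in its output, which makes the remaining instructions easier to map back to the source.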

1 Answer

A good way to see what the compiler does is to compare its output with and without optimizations. I assume you have a file called main.c:

int main(int argc, char **argv) 
{
    int* a = &argc;
    int b = 8;
    a = &b;
}

Compile with debug info to an object file:

$ gcc -c -g main.c

View the assembly:

$ objdump -S main.o

main.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <main>:
int main(int argc, char **argv)
{
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   89 7d ec                mov    %edi,-0x14(%rbp)
   7:   48 89 75 e0             mov    %rsi,-0x20(%rbp)
    int* a = &argc;
   b:   48 8d 45 ec             lea    -0x14(%rbp),%rax
   f:   48 89 45 f8             mov    %rax,-0x8(%rbp)
    int b = 8;
  13:   c7 45 f4 08 00 00 00    movl   $0x8,-0xc(%rbp)
    a = &b;
  1a:   48 8d 45 f4             lea    -0xc(%rbp),%rax
  1e:   48 89 45 f8             mov    %rax,-0x8(%rbp)
  22:   b8 00 00 00 00          mov    $0x0,%eax
}
  27:   5d                      pop    %rbp
  28:   c3                      retq   

Then do the same with full optimization:

$ gcc -c -g -O3 main.c 

And view the assembly again:

$ objdump -S main.o

main.o:     file format elf64-x86-64


Disassembly of section .text.startup:

0000000000000000 <main>:
int main(int argc, char **argv)
{
    int* a = &argc;
    int b = 8;
    a = &b;
}
   0:   31 c0                   xor    %eax,%eax
   2:   c3                      retq   

So the answer is yes: the compiler can emit instructions that are not needed, which is why you turn on optimizations. With optimizations turned off, the compiler does its job in a very generic way, translating each statement without checking whether its result is ever used. For example, it reserves stack space for variables that are never read.
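
If you want the stores to survive even at `-O3`, one option is the `volatile` qualifier mentioned in the comments on the question. The sketch below is an assumption about how one might write that, with a hypothetical function name `keep_stores`; every access to a volatile object counts as observable behavior, so the compiler has to keep the accesses rather than delete them.

```c
/* Hypothetical variant of the original snippet, for illustration only.
   All three objects are volatile, so the compiler must perform each
   store and load even when optimizing. */
int keep_stores(int argc)
{
    volatile int arg = argc;          /* volatile copy of the argument  */
    volatile int b = 8;               /* mirrors: int b = 8;            */
    volatile int *volatile a = &arg;  /* mirrors: int* a = &argc;       */
    a = &b;                           /* mirrors: a = &b;               */
    return *a;                        /* reads b through the pointer    */
}
```

The pointee type is kept `volatile` on purpose: reading a volatile object through a non-volatile pointer would be undefined behavior.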

klutt
  • Ok thank you. What I found quite surprising is that the assembly part did not correspond to any part of the C source code. But it's true that since the program does nothing, every line of the assembly should be considered as useless – DJ_Joe Nov 21 '17 at 20:23
  • There is an art to crafting C functions to compile and disassemble to see what the compiler is doing. At low optimization levels the compiler saves everything to stack/memory and retrieves it from there, and if you have dead code it may write something that it never reads. That is expected, but it is in part because your program might have done that. Writing test functions that are not dead code, or that are not optimized into something uninteresting, takes some practice. – old_timer Nov 21 '17 at 20:41
  • @DJ_Joe: see https://stackoverflow.com/questions/38552116/how-to-remove-noise-from-gcc-clang-assembly-output, especially the link to [Matt Godbolt's CppCon2017 talk: “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid”](https://youtu.be/bSkpMdDe4g4), it's exactly what @old_timer was talking about. – Peter Cordes Nov 22 '17 at 00:40
  • Sadly, the answer to the title question of OP is "yes" even in optimized code, with complex-enough source the compiler will eventually have to stop with optimization calculations, and emit what is reasonably pruned correct code doing what it should. To create **optimal** code would take way too much time, with any large source base of modern SW projects it would take probably either straight beyond lifetime of universe, or at least thousands of years. To get near-optimal solution is something completely different (NP vs P problems), and current compilers are very good at this. – Ped7g Nov 22 '17 at 04:35
  • @Ped7g It is even impossible to create optimal code, mainly because we don't have a good definition of "optimal". Is it the smallest code? The fastest code? The smallest memory consumption? And what does "fastest" mean? Fastest response time or throughput? – klutt Nov 22 '17 at 07:51