
I am trying to slightly modify the LLVM backend for the X86 target to produce some desired behaviour.

More specifically, I would like to emulate a flag like gcc's -fcall-used-reg option, which instructs the compiler to treat a callee-saved register as a call-clobbered register (meaning that it may be altered during a function call).

Let's focus on r14. I manually clobber the register, like in this answer:

#include <inttypes.h>

uint64_t inc(uint64_t i) {
    __asm__ __volatile__(
        ""
        : "+m" (i)
        :
        : "r14"
    );
    return i + 1;
}

int main(int argc, char **argv) {
    (void)argv;
    return inc(argc);
}

Compile and disassemble:

 gcc -std=gnu99 -O3 -ggdb3 -Wall -Wextra -pedantic -o main.out main.c
 objdump -d main.out

Disassembly contains:

0000000000001150 <inc>:
    1150:       41 56                   push   %r14
    1152:       48 89 7c 24 f8          mov    %rdi,-0x8(%rsp)
    1157:       48 8b 44 24 f8          mov    -0x8(%rsp),%rax
    115c:       41 5e                   pop    %r14
    115e:       48 83 c0 01             add    $0x1,%rax
    1162:       c3                      retq
    1163:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
    116a:       00 00 00
    116d:       0f 1f 00                nopl   (%rax)

Here we can see that r14, because it is declared as clobbered, is pushed to the stack on entry and popped before returning to restore its original value. Now, repeat with the -fcall-used-r14 flag:

 gcc -std=gnu99 -O3 -ggdb3 -fcall-used-r14 -Wall -Wextra -pedantic -o main.out main.c
 objdump -d main.out

Disassembly contains:

0000000000001150 <inc>:
    1150:       48 89 7c 24 f8          mov    %rdi,-0x8(%rsp)
    1155:       48 8b 44 24 f8          mov    -0x8(%rsp),%rax
    115a:       48 83 c0 01             add    $0x1,%rax
    115e:       c3                      retq
    115f:       90                      nop

where no push/pop happens.
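
The same push/pop comparison can be checked mechanically instead of by eye. Below is a minimal sketch, assuming a POSIX shell and grep; the helper name saves_reg is hypothetical and not part of any toolchain. It only looks for a push of the given register, in either the objdump spelling (push) or the compiler's assembly spelling (pushq).

```shell
#!/bin/sh
# saves_reg REG FILE (hypothetical helper)
# Succeeds if FILE (objdump output or a .s file) contains a push of %REG.
# The trailing group keeps "r1" from matching "%r14" or "%r1x".
saves_reg() {
    grep -Eq "pushq?[[:space:]]+%$1([^0-9a-z]|\$)" "$2"
}
```

For example, after generating main.s, `saves_reg r14 main.s && echo "r14 is still saved"` reports whether the prologue still preserves r14.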

Now, I have modified some LLVM Target files, compiled the source, and added(?) this functionality to the llc tool:

clang-11 -emit-llvm -S -c main.c -o main.ll
llc-11 main.ll -o main.s

Now, main.s contains:

# %bb.0:
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register %rbp
        pushq   %r14
        .cfi_offset %r14, -24
        movq    %rdi, -16(%rbp)
        #APP
        #NO_APP
        movq    -16(%rbp), %rax
        addq    $1, %rax
        popq    %r14
        popq    %rbp
        .cfi_def_cfa %rsp, 8
        retq

Apparently, r14 is still callee-saved.

Inside llvm/lib/Target/X86/X86CallingConv.td I have modified the following lines (removing R14), because they seemed to be the only ones relevant to the System V ABI for Linux and the C calling conventions that I was interested in:

def CSR_64 : CalleeSavedRegs<(add RBX, R12, R13, R15, RBP)>;
...
def CSR_64_MostRegs : CalleeSavedRegs<(add RBX, RCX, RDX, RSI, RDI, R8, R9, R10,
                                           R11, R12, R13, R15, RBP,
...
def CSR_64_AllRegs_NoSSE : CalleeSavedRegs<(add RAX, RBX, RCX, RDX, RSI, RDI, R8, R9,
                                                R10, R11, R12, R13, R15, RBP)>;

My questions are:

  • Is X86CallingConv.td the only file I should modify? I think yes, but maybe I'm wrong.
  • Am I focusing on the correct lines? Maybe this is more difficult to answer, but at least a direction could be helpful.

I am running LLVM 11 on Debian 10.5.

EDIT:

I also removed R14 from the "hidden" definition that Margaret correctly pointed out:

def CSR_SysV64_RegCall_NoSSE : CalleeSavedRegs<(add RBX, RBP, RSP,
                                               (sequence "R%u", 12, 13), R15)>;

This did not help either.
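
"Hidden" definitions of this kind are easy to miss because a sequence range never spells out R14. A rough way to list the candidates is sketched below, assuming a POSIX shell and grep; the helper name list_r14_candidates is hypothetical. It prints every line that names R14 directly or uses a `sequence "R%u"` range, and the ranges still have to be inspected by hand to see whether they cover 14.

```shell
#!/bin/sh
# list_r14_candidates FILE (hypothetical helper)
# Prints line-numbered lines of a .td file that either name R14 explicitly
# or use a (sequence "R%u", ...) range, which may include R14 implicitly.
list_r14_candidates() {
    grep -nE '(^|[^A-Za-z0-9_])R14([^0-9]|$)|sequence "R%u"' "$1"
}
```

Running `list_r14_candidates llvm/lib/Target/X86/X86CallingConv.td` from the source root gives the list of lines to review.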

  • I notice you didn't enable optimization for clang. Does that make any difference? Probably not. Also, why force `i` as a *memory* operand? `"+r"(i)` would work fine too. (You can see what it picked by doing `"nop # operand = %0"` for the template if you want. The nop is only there so tools that filter asm comments won't remove the line; otherwise a pure asm comment is fine.) – Peter Cordes Oct 30 '20 at 01:20
  • What about `CSR_SysV64_RegCall_NoSSE` (which has a "hidden" `r14` in it), does it make any difference? As a smoke test you could try removing R14 from *all* the calling conventions (beware of "hidden" definitions) and then pinpoint the right one. – Margaret Bloom Oct 30 '20 at 08:26
  • @PeterCordes No, the optimization flag did not help. Thanks for your input on inline assembly. @MargaretBloom The "hidden" definition was a clever observation indeed, and I did miss that! `CSR_SysV64_RegCall_NoSSE` didn't make a difference, but I'll keep looking. Also, I can remove all R14 occurrences in a "brute-force" way, although I was seeking a more intuitive approach. This will be my last resort. – Nikos Mavrogeorgis Oct 30 '20 at 13:47

1 Answer


It turns out that the minimum modification needed was removing R14 from this one line:

def CSR_64 : CalleeSavedRegs<(add RBX, R12, R13, R15, RBP)>;

The real problem was with how I built the source.

Running cmake --build . again after the original installation only rebuilt the binaries inside the build directory; it did not replace the globally installed llc tool (I thought it would, because I was building the default architecture, X86, but that was irrelevant). So I had been calling the unmodified llc-11 all along. When I instead ran the rebuilt binary from the build tree:

/path/to/llvm-project/build/bin/llc main.ll -o main.s

main.s contained:

# %bb.0:
        movq    %rdi, -8(%rsp)
        #APP
        #NO_APP
        movq    -8(%rsp), %rax
        addq    $1, %rax
        retq

which is what I wanted in the first place.
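
To avoid repeating this mistake, it helps to confirm which binary a command name actually resolves to before testing a change. Below is a minimal sketch, assuming a POSIX shell; the helper name tool_in_tree is hypothetical, and it compares paths textually without resolving symlinks.

```shell
#!/bin/sh
# tool_in_tree TOOL DIR (hypothetical helper)
# Returns 0 if the TOOL found on PATH lives under DIR,
# 1 if it resolves somewhere else, and 2 if it is not found at all.
tool_in_tree() {
    resolved=$(command -v "$1") || return 2
    case "$resolved" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}
```

For example, `tool_in_tree llc-11 /path/to/llvm-project/build || echo "llc-11 is not the rebuilt binary"` would have flagged that the system-wide tool was being used instead of the one in the build directory.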

  • Oh, yeah, `make install` or whatever is always separate from just rebuilding the source. Replacing `/usr/bin/whatever` would require root, and affect all users, unlike just writing new files in the build directory. Even if you have the only user account on your system, Unix-like systems always work this way. You wouldn't want that to be the default anyway; if you're working on clang/LLVM and introduce a bug, you'd break your system if that replaced the system-wide compiler and couldn't rebuild after fixing the bug. – Peter Cordes Oct 30 '20 at 21:37
  • @PeterCordes Yes, that was an unforced error, I guess. – Nikos Mavrogeorgis Oct 31 '20 at 14:12