To save and restore a register around a function call, which of these is faster?

Case 1:

    mov  r9, rdi
    mov  rdi, 10
    call someFunction
    mov  rdi, r9

Case 2:

    push rdi
    mov  rdi, 10
    call someFunction
    pop  rdi
avighnac
  • Faster but you need a spare register. – Margaret Bloom Oct 08 '22 at 11:54
  • Register-register moves are definitely cheaper than register-memory moves. See also https://stackoverflow.com/questions/44169342/can-x86s-mov-really-be-free-why-cant-i-reproduce-this-at-all – Homer512 Oct 08 '22 at 12:01
  • The call itself will probably take more time. You should consider inlining. – Sebastian Oct 08 '22 at 12:40
  • Inlining as in inline assembly? – avighnac Oct 08 '22 at 12:43
  • Inlining in this context means that a function's body of code, or a relevant part of it, is inserted at the function call site instead of a `call` instruction and all the setup needed for it. – ecm Oct 08 '22 at 14:42
  • Inlining like what an optimizing compiler will do with a C function like `inline int foo(int x){ return x+1; }` - not actually `call` anything, just move that logic into the caller and optimize the whole thing. If `someFunction` is short, it's good to do that by hand in asm. – Peter Cordes Oct 08 '22 at 15:03
  • Or change the function to an assembler macro. – Sebastian Oct 08 '22 at 15:06
  • If you have a helper function that you know doesn't touch `r9`, even though it's call-clobbered in the standard calling convention, that's perfect, 2x `mov reg,reg` is cheaper than `push`+`pop`. Normally you'd need to pick a call-preserved register (like R12-R15, or RBX or RBP), which you'd push/pop at the start/end of your function. That's definitely good if the call is inside a loop, or you have multiple calls in a function. But if you have a tiny function that makes only one call, see [Why do compilers insist on using a callee-saved register here?](https://stackoverflow.com/q/61375336) – Peter Cordes Oct 08 '22 at 15:11
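
The call-preserved-register approach from the last comment might look roughly like the sketch below. It assumes the x86-64 System V calling convention and NASM syntax. The label `caller` is hypothetical, and `someFunction` is assumed to follow the standard convention, so it may clobber `r9` but must preserve `rbx`.

```nasm
; Sketch only: `caller` is a hypothetical name, not from the question.
extern someFunction
global caller

section .text
caller:
    push rbx            ; save call-preserved RBX once, at function entry
                        ; (this also makes RSP 16-byte aligned for the call)
    mov  rbx, rdi       ; keep the incoming argument in a register
                        ; that someFunction must preserve
    mov  rdi, 10        ; first argument for someFunction
    call someFunction
    mov  rdi, rbx       ; restore the original RDI with a reg-reg move
    ; ... further calls could reuse RBX without another push/pop ...
    pop  rbx            ; restore the caller's RBX before returning
    ret
```

With a single call this still costs one `push`/`pop` pair plus two `mov`s, so it only pays off when the saved value has to survive several calls or a loop, as the comment notes.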

0 Answers