
I have a simple C program:

int main() {
  int var;
  for(int i=0;i<3;i++) {
    var=1;
  }
  return 0;
}

In a kernel module, I hook (using ftrace) the [do_user_addr_fault][1] function, which receives a `struct pt_regs *regs` as one of its parameters.

Using `regs->cs` and `regs->ip`, I can uniquely identify an instruction, but in a loop both values are the same for every execution of `var=1`. I thought `regs->cx` would help, since I read that it stores the loop counter, but its values seem arbitrary. I print the values with `pr_info("cs %lu ip %lu cx %lu\n", regs->cs, regs->ip, regs->cx);`.

So how can I differentiate between executions of the same instruction in a loop?

This is the assembly code I got from `g++ -S prog.cpp`:

    .file   "prog.cpp"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    endbr64
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    movl    $0, -8(%rbp)
.L3:
    cmpl    $2, -8(%rbp)
    jg  .L2
    movl    $1, -4(%rbp)
    addl    $1, -8(%rbp)
    jmp .L3
.L2:
    movl    $0, %eax
    popq    %rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0"
    .section    .note.GNU-stack,"",@progbits
    .section    .note.gnu.property,"a"
    .align 8
    .long    1f - 0f
    .long    4f - 1f
    .long    5
0:
    .string  "GNU"
1:
    .align 8
    .long    0xc0000002
    .long    3f - 2f
2:
    .long    0x3
3:
    .align 8
4:

[1]: https://elixir.bootlin.com/linux/latest/source/arch/x86/mm/fault.c#L1220

ruke
  • "I thought regs->cx would help as I read that it stores the loop counter": Did you read it in a book from the early 1980s? Because that's about the most recent time that people treated `cx` as special in that regard. Since the 386 the general purpose registers are interchangeable for 95% of instructions, and compilers use them arbitrarily. – Nate Eldredge May 04 '22 at 06:23
  • Yes, the `loop` instruction is hardcoded to use `cx/ecx/rcx`, and it still exists, but it is [slow](https://stackoverflow.com/questions/35742570/why-is-the-loop-instruction-slow-couldnt-intel-have-implemented-it-efficiently) and nobody uses it, certainly not compilers. – Nate Eldredge May 04 '22 at 06:24
  • Unless the compiler unrolls the loop, it *is* the same machine instruction at the same address, executing multiple times. That's what a loop is. I guess you're trying to trace execution through multiple iterations, so you can enumerate different executions of the same instruction? – Peter Cordes May 04 '22 at 06:24
  • There's no reason to expect a compiler to pick RCX as the loop counter, though; as your [anti-optimized debug build](https://stackoverflow.com/questions/53366394/why-does-clang-produce-inefficient-asm-with-o0-for-this-simple-floating-point) shows, it kept `int i` in memory at `-8(%rbp)`, not a register at all. (It would have picked one if you'd used `register int i` in a debug build, and/or used optimization and used `volatile var` to stop the loop optimizing away.) Use `gcc -fverbose-asm` to have GCC comment the asm, especially debug builds. – Peter Cordes May 04 '22 at 06:26
  • If you insist on writing the user program in C, then there isn't really going to be any better way than compiling your code, reading the assembly, figuring out where the loop counter lives, and setting your handler to look at that register or memory location. And then redo it every time you upgrade your compiler, change compilation options, etc. Assuming the compiler doesn't notice that the loop is completely useless and can be deleted altogether, which if you enable optimizations it surely will. – Nate Eldredge May 04 '22 at 06:26
  • If you want to be sure that an exception handler can deduce the exact state of your program by inspecting its registers, then to do it reliably, I don't see much alternative to actually writing that section of the program in assembly. That's the only way you can really be sure what instructions will be executed and what data is in which registers. – Nate Eldredge May 04 '22 at 06:28
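
To make the "figure out where the loop counter lives" suggestion concrete: in the listing from the question, `i` is stored at `-8(%rbp)`, so for that exact build the counter can be read from memory relative to the saved frame pointer. The following is only a user-space sketch of that address arithmetic. `struct fake_regs`, `LOOP_COUNTER_OFFSET`, and `read_loop_counter` are hypothetical names introduced for illustration, and the plain `memcpy` stands in for whatever user-memory accessor a real kernel handler would need (for example `copy_from_user_nofault()` on recent kernels).

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the one field of struct pt_regs used here. */
struct fake_regs {
    uintptr_t bp;   /* frame pointer, playing the role of regs->bp */
};

/* Distance of `i` below %rbp, taken from the question's listing: -8(%rbp).
 * This value is an assumption tied to that exact compiler and flags. */
#define LOOP_COUNTER_OFFSET 8

/* Read the 4-byte loop counter stored at bp - LOOP_COUNTER_OFFSET. */
int read_loop_counter(const struct fake_regs *regs)
{
    int i;
    memcpy(&i, (const void *)(regs->bp - LOOP_COUNTER_OFFSET), sizeof i);
    return i;
}
```

Combining this value with `regs->ip` would distinguish iterations of the same instruction, but, as the comments point out, the offset has to be re-derived from the assembly whenever the compiler or its options change.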

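The variant of the test program that Peter Cordes describes in the comments might look like the sketch below. It assumes the loop is moved into a function named `run_loop` (a hypothetical name) so the result can be checked. `volatile` keeps the stores to `var` from being optimized away, and `register` is only a hint that GCC tends to honor in debug builds. Which register actually holds `i` is still the compiler's choice, so it has to be confirmed by reading the output of `gcc -S -fverbose-asm`.

```c
/* Variant of the question's loop; returns the final value of var. */
int run_loop(void)
{
    /* volatile: every store to var must actually be emitted. */
    volatile int var = 0;

    /* register: a hint to keep i in a register instead of on the stack;
     * the compiler decides which register, if any. */
    for (register int i = 0; i < 3; i++) {
        var = 1;
    }
    return var;
}
```

Even with this change there is no guarantee that the counter ends up in `rcx`; the handler would have to look at whatever register the generated assembly shows.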