In real-life code, where the constructor is in another compilation unit, the compiler will not be able to see the body of the constructor (hence the noinline attribute). It is also not final, to mimic some real-world requirements.
To de-virtualize a call, the compiler generally needs to be able to prove the dynamic type of the object, for example by proving that the class hierarchy is sealed. If the calls to the constructor are in separate translation units, the compiler can't prove it. However, link-time optimization gives the optimizer information across translation units, which can make it easier to prove facts about class hierarchies and references.
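To make "sealed" concrete, here is a minimal single-file sketch. The names `Base`, `Sealed`, `call_through_base` and `call_through_sealed` are hypothetical and not part of the question's code, and whether a given compiler actually devirtualizes depends on its version and optimization level. It only shows the situation in which the compiler is allowed to, which is exactly the one the question rules out by not using `final`.

```cpp
// Hypothetical types for illustration; not part of the original example.
struct Base {
    virtual ~Base() = default;
    virtual int value() { return 3; }
};

// `final` tells the compiler that no class can derive from Sealed,
// so the object behind a Sealed& always has dynamic type Sealed.
struct Sealed final : Base {
    int value() override { return 7; }
};

// The dynamic type behind `b` is unknown here, so without further
// information this remains an indirect call through the vtable.
int call_through_base(Base& b) { return b.value(); }

// The dynamic type behind `s` must be Sealed, so the compiler is
// permitted to emit a direct (or inlined) call to Sealed::value.
int call_through_sealed(Sealed& s) { return s.value(); }
```

Comparing the generated assembly for the two functions at `-O2` is a quick way to check what a particular compiler does with this.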
Here's an example using clang.
b.h
#ifndef B_H
#define B_H
struct B {
virtual int foo();
};
#endif
b.cpp
#include "b.h"
int B::foo() { return 3; }
c.h
#ifndef C_H
#define C_H
#include "b.h"
struct C {
B& b;
C(B& b);
int foo();
};
#endif
c.cpp
#include "c.h"
C::C(B& b) : b(b) {}
int C::foo() {
return b.foo();
}
main.cpp
#include <iostream>
#include "b.h"
#include "c.h"
int main() {
B b;
C c(b);
std::cout << c.foo() << std::endl;
return 0;
}
Since the optimizer knows nothing about the call sites for C::C (the constructor), it knows nothing about the runtime type of the object that b refers to. So it can't de-virtualize B::foo.
C::foo
_ZN1C3fooEv: # @_ZN1C3fooEv
.cfi_startproc
# BB#0:
movq (%rdi), %rdi
movq (%rdi), %rax
jmpq *(%rax) # TAILCALL <== pointer call
However, giving the optimizer link-time information (-flto) allows it to prove from the call sites that the class hierarchy is sealed.
B::foo
0000000000400960 <_ZN1B3fooEv>:
400960: b8 03 00 00 00 mov $0x3,%eax
400965: c3 retq
400966: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
40096d: 00 00 00
main
0000000000400970 <main>:
400970: 41 56 push %r14
400972: 53 push %rbx
400973: 50 push %rax
400974: 48 c7 04 24 78 0a 40 movq $0x400a78,(%rsp)
40097b: 00
40097c: 48 8d 3c 24 lea (%rsp),%rdi
400980: e8 db ff ff ff callq 400960 <_ZN1B3fooEv> # <== direct call
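As a rough mental model of what -flto makes visible (an assumption for illustration, since LTO works on the compiler's intermediate representation rather than on merged source), consider the three files folded into one translation unit. The helper `run` is hypothetical and stands in for the body of main.

```cpp
// Single-file rendering of b.cpp, c.cpp and main.cpp, for illustration only.
struct B {
    virtual int foo() { return 3; }
};

struct C {
    B& b;
    explicit C(B& b) : b(b) {}
    int foo() { return b.foo(); }  // a virtual call in the source
};

// Hypothetical helper standing in for the body of main().
int run() {
    B b;     // the dynamic type is visibly B at this point
    C c(b);  // the constructor body is visible, so c.b is known to refer to b
    return c.foo();
}
```

With the construction site and the constructor body both in view, the optimizer can trace the reference back to an object created as a B, which is the same fact that lets the LTO build above call B::foo directly.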