I've recently been trying to re-familiarize myself with multi-threading and found this paper. One of the examples says to be careful when using code like this:
int my_counter = counter; // Read global
int (* my_func) (int);
if (my_counter > my_old_counter) {
    ... // Consume data
    my_func = ...;
    ... // Do some more consumer work
}
... // Do some other work
if (my_counter > my_old_counter) {
    ... my_func(...) ...
}
The author states that:
If the compiler decides that it needs to spill the register containing my_counter between the two tests, it may well decide to avoid storing the value (it’s just a copy of counter, after all), and to instead simply re-read the value of counter for the second comparison involving my_counter [...]
Doing this would turn the code into:
int my_counter = counter; // Read global
int (* my_func) (int);
if (my_counter > my_old_counter) {
    ... // Consume data
    my_func = ...;
    ... // Do some more consumer work
}
... // Do some other work
my_counter = counter; // Reread global!
if (my_counter > my_old_counter) {
    ... my_func(...) ...
}
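For context, this is roughly the setting I understand the example to assume (my own reconstruction; the producer side is not shown in the excerpt above): another thread publishes some data and then increments the plain global counter, which the code above checks from the consumer side:

int counter; // Plain global shared between threads, neither atomic nor volatile

// Producer thread (my sketch, not from the paper):
... // Publish the new data
counter++; // Announce it by bumping the shared counter

As far as I can tell, the danger is that the producer may increment counter between the two tests, so the re-read makes the second test succeed even though my_func was never assigned.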
I am, however, skeptical about this. I don't understand why the compiler would be allowed to perform this re-read, since to my understanding a data race only occurs when the same memory location is accessed concurrently by multiple threads and at least one of those accesses is a write. The author goes on to explain that:
the core problem arises from the compiler taking advantage of the assumption that variable values cannot asynchronously change without an explicit assignment
It seems to me that this assumption is respected here: the local variable my_counter is never assigned to again after its initialization and cannot be accessed by other threads. How would the compiler know that the global counter cannot be set elsewhere, say in another translation unit or by another thread? It cannot, and in fact I would have assumed that the second if would simply be optimized away.
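To make that expectation concrete, this is the kind of transformation I would have anticipated instead (my own sketch, not from the paper; data_ready is just a name I made up): the compiler keeps the result of the first comparison around instead of re-reading the global:

int my_counter = counter; // Read global exactly once
int (* my_func) (int);
int data_ready = my_counter > my_old_counter; // Remember the comparison result
if (data_ready) {
    ... // Consume data
    my_func = ...;
    ... // Do some more consumer work
}
... // Do some other work
if (data_ready) { // No need to touch counter again
    ... my_func(...) ...
}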
Is the author wrong, or am I missing something?