26

Does the following piece of code constitute undefined behaviour, since I am jumping before the variable declaration and using it via a pointer? If so, are there differences between the standards?

#include <stdio.h>

int main() {
  int *p = 0;
label1: 
  if (p) {
    printf("%d\n", *p);
    return 0;
  }
  int i = 999;
  p = &i;
  goto label1;
  return -1;
}
Ed I
Pascal Kesseli
  • 1
    No undefined behavior for me. – ouah Jul 28 '14 at 22:40
  • 1
    possible duplicate of [Hoisting/Reordering in C, C++ and Java: Must variable declarations always be on top in a context?](http://stackoverflow.com/questions/22548148/hoisting-reordering-in-c-c-and-java-must-variable-declarations-always-be-on) –  Jul 28 '14 at 22:41
  • 4
    @ouah Your phrasing makes me wonder how you arrived at that conclusion. –  Jul 28 '14 at 22:42
  • 9
    @MikeW that's unrelated. The crucial part of this question is the line `p = &i`, and whether `p` can be read in the `printf()` line – Kijewski Jul 28 '14 at 22:42
  • 1
    Looks fine to me. Note that later compilers _may_ hoist the declaration to the top of the function so your code might not compile to this structure anyway. –  Jul 28 '14 at 22:43

3 Answers

17

There is no undefined behavior in your program.

The goto statement has two constraints:

(C11, 6.8.6.1p1) "The identifier in a goto statement shall name a label located somewhere in the enclosing function. A goto statement shall not jump from outside the scope of an identifier having a variably modified type to inside the scope of that identifier."

Your program violates neither of them, and there are no other "shall" requirements outside the constraints.

The same holds in C99 and C90, in the sense that neither adds extra requirements. In C90, of course, the program would not be valid anyway, because it mixes declarations and statements.
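As an illustration of the second constraint, here is a minimal sketch of a jump that stays inside the scope of a variably modified type. The function name `vla_bytes_seen_twice` and the label `inside` are made up for this example, and it assumes a compiler that supports variable length arrays.

```c
#include <stddef.h>

/* Hypothetical helper for illustration; n must be greater than 0. */
size_t vla_bytes_seen_twice(int n)
{
    size_t total = 0;
    int rounds = 0;

    {
        int vla[n];             /* scope of the variably modified type starts here */
inside:
        total += sizeof vla;    /* evaluated at run time: n * sizeof(int) */
        if (++rounds < 2)
            goto inside;        /* source and target are both inside the scope: allowed */
    }

    /* A "goto inside;" placed here would jump from outside the scope of vla
       to inside it, which is what the constraint in 6.8.6.1p1 forbids. */
    return total;
}
```

The program in the question has no variably modified type at all, so this constraint never comes into play there.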

Regarding the lifetime of the object i when it is accessed after the goto statement, C says the following (the first sentence is the relevant one here; the other quoted sentences would matter for a trickier program):

(c11, 6.2.4p6) "For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. [...] If the block is entered recursively, a new instance of the object is created each time. [...] If an initialization is specified for the object, it is performed each time the declaration or compound literal is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached."

That means i is still alive when *p is read; no object is accessed outside its lifetime.

ouah
  • You miss the point: can the variable `i` that `p` now points to still be read? The question is equivalent to whether the compiler has to hoist variable definitions. – Kijewski Jul 28 '14 at 22:44
  • @Kay I perfectly understand the question and as put in my answer I see no reason `p` should point elsewhere than at `i` object. – ouah Jul 28 '14 at 22:46
  • 5
    @Kay the lifetime of `i` ends at `}` or return of the function. – ouah Jul 28 '14 at 22:48
  • The second quote makes it unclear what the value of `i` should be when it is accessed after the opening `{` of its block, but before the line `int i = 999;`. Also it seems unclear whether using `goto` is "entering recursively". I would say that it isn't, and the 6.2.4/6 quote is talking about, for example, a function being called recursively. – M.M Jul 28 '14 at 23:12
  • 2
    @MattMcNabb The New C Standard by Derek M. Jones also says of this quote that *"A jump back to the recursive beginning of the block, using a goto statement, is not a recursive invocation of that block."* – ouah Jul 28 '14 at 23:22
  • If the jump went back to before a `{` then it would be undefined behaviour because the block containing `i` has execution terminated, making `p` invalid? – M.M Jul 28 '14 at 23:25
  • @MattMcNabb yes, that's my understanding – ouah Jul 28 '14 at 23:26
  • Given `{ unsigned char *p; goto LATE; EARLY: printf("%d",(int)*p); return; unsigned char i; printf("%d", (int)*p); return; LATE: p=&i; i=123; goto EARLY; }`, would the first `printf` show 123 and the second an indeterminate value, or would the backward jump render the value of `*p` indeterminate? Note that code never actually leaves the scope of `i`. – supercat Jul 16 '15 at 18:43
  • @supercat how can you reach the second `printf` in your example? Could I suggest to ask a new question to continue the discussion? – ouah Jul 16 '15 at 19:02
9

I'll try to answer the question you may have been trying to ask.

Your program's behavior is well defined. (The return -1; is problematic; only 0, EXIT_SUCCESS and EXIT_FAILURE are well defined as values returned from main. But that's not what you're asking about.)

This program:

#include <stdio.h>
int main(void) {
    goto LABEL;
    int *p = 0;
    LABEL:
    if (p) {
        printf("%d\n", *p);
    }
}

does have undefined behavior. The goto transfers control to a point within the scope of p, but bypasses its initialization, so p has an indeterminate value when the if (p) test is executed.

In your program, the value of p is well defined at all times. The declaration, which is reached before the goto, sets p to 0 (a null pointer). The if (p) test is false, so the body of the if statement is not executed the first time. The goto is executed after p has been given a well defined non-null value. After the goto, the if (p) test is true, and the printf call is executed.

In your program, the lifetime of both p and i begins when the opening { of main is reached, and ends when the closing } is reached or a return statement is executed. The scope of each (i.e., the region of program text in which its name is visible) extends from its declaration to the closing }. When the goto transfers control backwards, the variable name i is out of scope, but the int object to which that name refers still exists. The name p is in scope (because it was declared earlier) and the pointer object still points to the same int object (whose name would be i if that name were visible).
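To make the scope/lifetime distinction visible, here is a small sketch based on the program in the question. The file-scope variable `i` and the function name `scope_versus_lifetime` are additions for illustration only; they are not part of the original code.

```c
static int i = 1;   /* file-scope i, visible wherever the local i is not in scope */

int scope_versus_lifetime(void)
{
    int *p = 0;
top:
    if (p) {
        /* The local declaration further down has not been reached in the
           program text, so the name i here means the file-scope i (value 1).
           The local object still exists and is reached through p (value 999). */
        return i * 1000 + *p;
    }
    int i = 999;    /* local i: its name is in scope from here to the closing brace */
    p = &i;
    goto top;
}
```

Under those assumptions the function returns 1999: the name `i` inside the `if` resolves to the outer variable, while `*p` reads the local object whose name is not visible at that point.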

Remember that scope refers to a region of program text in which a name is visible, and lifetime refers to a span of time during program execution during which an object is guaranteed to exist.

Normally, if an object's declaration has an initializer, that guarantees that it has a valid value whenever its name is visible (unless some invalid value is later assigned to it). This can be bypassed with a goto or switch (but not if they're used carefully).
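A rough sketch of the "used carefully" case: the jump below skips the initializer, but the code stores a value before it ever reads the variable, so nothing indeterminate is used. The function name `read_after_skipped_init` and its parameter are hypothetical.

```c
int read_after_skipped_init(int skip)
{
    if (skip)
        goto after;     /* jumps into the scope of v, past its initializer */

    int v = 42;         /* executed only when skip is 0 */

after:
    if (skip)
        v = 7;          /* v was indeterminate on this path; store before any read */
    return v;
}
```

Calling it with 0 yields 42 (the initializer ran), and calling it with a nonzero argument yields 7 (the initializer was bypassed and the later assignment supplied the value). Dropping the assignment would turn the second case into a read of an indeterminate value.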

Keith Thompson
  • @ryyker: Or a bad guess if that isn't actually what the OP was asking about. – Keith Thompson Jul 28 '14 at 23:19
  • "and the printf call is executed." - does it print `999` though? I don't see a clear explanation of what the value of a variable is before its declaration. (At that point `i`'s lifetime has begun but `i` is out of scope). Another example, if `return 0;` is replaced with `*p = 998;`, does it go back to 999 the second time around the "loop"? – M.M Jul 28 '14 at 23:23
  • Instructive at the very least. And, I always enjoy your enthusiasm on anything related to `void main(void)`. ***Oh, sorry***, I meant: `int main(void)` – ryyker Jul 28 '14 at 23:24
  • *I don't see clear explanation of what the value of a variable is , before its declaration.* lifetime has begun indeed and the value of the object is indeterminate before `int i = 999;` is reached. It is specified in 6.2.4p6. – ouah Jul 28 '14 at 23:29
6

This code does not have undefined behavior. We can find a nice example in the Rationale for International Standard—Programming Languages—C in section 6.2.4 Storage durations of objects it says:

[...]There is a simple rule of thumb: the variable declared is created with an unspecified value when the block is entered, but the initializer is evaluated and the value placed in the variable when the declaration is reached in the normal course of execution. Thus a jump forward past a declaration leaves it uninitialized, while a jump backwards will cause it to be initialized more than once. If the declaration does not initialize the variable, it sets it to an unspecified value even if this is not the first time the declaration has been reached.

The scope of a variable starts at its declaration. Therefore, although the variable exists as soon as the block is entered, it cannot be referred to by name until its declaration is reached.

and provides the following example:

int j = 42;
{
    int i = 0;
loop:
    printf("I = %4d, ", i);
    printf("J1 = %4d, ", ++j);
    int j = i;
    printf("J2 = %4d, ", ++j);
    int k;
    printf("K1 = %4d, ", k);
    k = i * 10;
    printf("K2 = %4d, ", k);
    if (i % 2 == 0) goto skip;
    int m = i * 5;
skip:
    printf("M = %4d\n", m);
    if (++i < 5) goto loop;
}

and the output is:

 I = 0, J1 = 43, J2 = 1, K1 = ????, K2 = 0, M = ????
 I = 1, J1 = 44, J2 = 2, K1 = ????, K2 = 10, M = 5
 I = 2, J1 = 45, J2 = 3, K1 = ????, K2 = 20, M = 5
 I = 3, J1 = 46, J2 = 4, K1 = ????, K2 = 30, M = 15
 I = 4, J1 = 47, J2 = 5, K1 = ????, K2 = 40, M = 15

and it says:

where “????” indicates an indeterminate value (and any use of an indeterminate value is undefined behavior).

This example is consistent with the draft C99 standard section 6.2.4 Storage durations of objects paragraph 5 which says:

For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. (Entering an enclosed block or calling a function suspends, but does not end, execution of the current block.) If the block is entered recursively, a new instance of the object is created each time. The initial value of the object is indeterminate. If an initialization is specified for the object, it is performed each time the declaration is reached in the execution of the block; otherwise, the value becomes indeterminate each time the declaration is reached.
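The "initialized more than once" part of the rule of thumb can be checked without reading any indeterminate value. The following is only a sketch; the function name `sum_of_reinitialized` and the constants are chosen for illustration.

```c
int sum_of_reinitialized(void)
{
    int passes = 0;
    int sum = 0;
again:
    ;                   /* before C23 a label must be followed by a statement */
    int x = 10;         /* the initializer runs every time this line is reached */
    sum += x;           /* adds 10 on every pass */
    x = 998;            /* replaced by the initializer on the next pass */
    if (++passes < 3)
        goto again;
    return sum;
}
```

The function returns 30, not 10 + 998 + 998: each backward jump reaches the declaration again, so `x` is set back to 10 before it is read. The same reasoning applies to the variation raised in the comments, where a store through `p` is followed by another trip past `int i = 999;`.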

Shafik Yaghmour