0

In C, is the compiler/linker obligated to place a string literal that is used multiple times in a single memory location?

const char* a = "abcdef";
const char* b = "abcdef";

//Compare pointers
if (a == b) {
    printf("True\r\n");
} else {
    printf("False\r\n");
}

In this case, is the condition above always true (it is in MSVC 2017), or is the result undefined behavior in general?

Can the string literal abcdef end up in two different memory locations? Of course, I am not counting strings like 12abcdefgh, where abcdef is only part of a longer string; I mean the standalone literal.

unalignedmemoryaccess

2 Answers

2

What you are comparing here are pointers to the string literals, not the characters they contain. String literals are statically allocated in memory that is not supposed to be written to. An optimizing compiler may store identical string literals only once, since it assumes they will never change and there is no point in keeping more than one copy; that is what you are observing here. But this merging is not guaranteed, so `a == b` may be true with one compiler or set of options and false with another.
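To make the distinction concrete, here is a minimal sketch. The helper names `same_contents`, `same_address`, and `report_literals` are made up for illustration and do not come from the question, and a hosted C99 (or later) compiler is assumed. The content comparison via `strcmp` always reports equality, while the pointer comparison may print either 1 or 0 depending on the compiler and its options.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Portable: compares the characters, so the result does not depend on
   whether the compiler merged the two literals. */
bool same_contents(const char *x, const char *y)
{
    return strcmp(x, y) == 0;
}

/* Only tells us whether both pointers refer to the same array. For two
   identical literals the implementation may or may not arrange that. */
bool same_address(const char *x, const char *y)
{
    return x == y;
}

/* Reports what this particular build does with two identical literals. */
void report_literals(void)
{
    const char *a = "abcdef";
    const char *b = "abcdef";

    /* Equal contents are guaranteed, so this assertion is safe. */
    assert(same_contents(a, b));

    /* The second line may print 1 or 0; neither result is an error. */
    printf("same contents: %d\n", same_contents(a, b));
    printf("same address:  %d\n", same_address(a, b));
}
```

Calling `report_literals()` from `main` under different compilers or optimization settings is a quick way to see whether a given build pools identical literals. Code that needs to know whether two strings are equal should rely on `strcmp`, never on the pointer comparison.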

Eugene Sh.
  • 1
    It can also perform other transformations for deduplicating string literal content. For example, `"hello world"+6 == "world"` might be true. – R.. GitHub STOP HELPING ICE Sep 10 '18 at 21:19
  • @R.. That is something I've never seen.. (maybe because never checked :) ) – Eugene Sh. Sep 10 '18 at 21:20
  • To be clear, the comment from @R.. is not a string comparison, but a pointer comparison. – Weather Vane Sep 10 '18 at 21:21
  • 1
    @WeatherVane: There's no such thing as "string comparison" in C, unless you mean `strcmp`, and of course `strcmp("hello world"+6, "world")==0` is always true. – R.. GitHub STOP HELPING ICE Sep 10 '18 at 21:22
  • 2
    @R.. obviously, that is what I thought ***you*** meant. I was enlarging on your true but unclear comment. – Weather Vane Sep 10 '18 at 21:23
  • It will only be true in this trivial example. If you change the type of one of the variables, two copies of the literal will be placed in the `.rodata` section. Example: `const char *a = "abcdef"; const char b[] = "abcdef";` – 0___________ Sep 10 '18 at 21:40
  • @P__J__: I think what you're saying is that named objects have unique addresses/cannot coincide, and that's correct. String literals are special in that they need not have unique addresses. – R.. GitHub STOP HELPING ICE Sep 10 '18 at 22:22
  • Actually, the compilers duplicate the literal in this case. Theoretically it is not necessary: one variable keeps the reference, the other the copy. – 0___________ Sep 10 '18 at 22:28
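
The two points raised in these comments can be sketched as follows. The function names `array_copy_is_distinct` and `suffix_contents_match` are hypothetical, and the sketch assumes the reading given above: a named array is its own object, so its address cannot coincide with that of the literal's array, whereas literals may share or overlap storage with each other.

```c
#include <stdbool.h>
#include <string.h>

/* A pointer to a literal versus a named array initialised from the same
   literal: the array is a separate object holding its own copy, so the
   addresses differ even though the contents are equal. */
bool array_copy_is_distinct(void)
{
    const char *a = "abcdef";   /* points at the literal's array */
    const char b[] = "abcdef";  /* separate named array object   */

    return a != b && strcmp(a, b) == 0;
}

/* The tail of one literal always has the same characters as the shorter
   literal. Whether "hello world" + 6 == "world" as pointers is left to
   the implementation, so only the contents are compared here. */
bool suffix_contents_match(void)
{
    return strcmp("hello world" + 6, "world") == 0;
}
```

Both functions are expected to return true on any conforming implementation, because neither depends on how the compiler lays out or merges the literals themselves.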
1

It is unspecified behavior, but most compilers will optimize this so that the comparison is true. If you are coding for a specific compiler, you may find a switch or `#pragma` that controls the folding of such strings and therefore gives you something deterministic.

user3386109
  • 2
    "undefined behaviour" is the wrong term; I suppose behaviour here is "implementation defined" – Stephan Lechner Sep 10 '18 at 21:24
  • 1
    @StephanLechner As long as we're nitpicking, it's actually "unspecified behavior". See annex J.1 and section 6.4.5. – user3386109 Sep 10 '18 at 22:42
  • Is there any guarantee that any *particular* string literal used within the code (after macro expansion) will always have a constant address, e.g. that given `inline char const *foo(void) { return "Hey";}`, every call to `foo` within a translation unit would yield the same address? – supercat Sep 20 '18 at 21:22
  • yes, string literals are constant addresses of arrays that contain the characters followed by a NUL terminator – Mike McCartney Sep 22 '18 at 00:15