3

As this answer on another question covers, using an aggregate initialization

struct foo {
    size_t a;
    size_t b;
};

struct foo bar = {0};

results in built-in types being initialized to zero.

Is there any difference between using the above and using

struct foo * bar2 = calloc(1, sizeof(struct foo));

leaving aside the fact that one variable is a pointer.
Looking at the debugger we can see that both a and b are indeed set to zero for both of the above examples.

What's the difference between the two examples above? Are there any gotchas or hidden issues?

Etheryte
  • Now I'm curious - what if we replaced `calloc` with `alloca` followed by a `memset` to zero? Then would there be any significant difference? – MooseBoys Oct 30 '14 at 22:17
  • `alloca` is non-standard. Also, `alloca` allocates on the stack, not on the heap, and it doesn't report errors (allocating more memory than is available has undefined behavior, and is likely to cause your program to crash before it can detect the error). – Keith Thompson Oct 30 '14 at 22:20
  • "it doesn't report errors" -- Neither does calling a function, which can also blow the stack. A technical wart on the standard is that it provides no guarantees on stack depth, so virtually no C program is strictly conforming. – Jim Balter Oct 30 '14 at 22:29
  • 1
    If you're looking for answers just about the initialization type then I would suggest changing the question to be "difference between {0} and memset", and doing `struct foo bar2; memset(&bar2, 0, sizeof bar2);` – M.M Oct 30 '14 at 22:29
  • 1
    "Then would there be any significant difference?" -- Both set all bits to zero, which is theoretically different from setting them to zero values (see the answers by Deduplicator and Keith Thompson). – Jim Balter Oct 30 '14 at 22:34

5 Answers

4

Yes, there is a crucial difference (aside from the storage class of your object of type struct foo):

struct foo bar = {0};
struct foo * bar2 = calloc(1, sizeof *bar2);

Every member of bar is zero-initialized, that is, set to the value zero (padding is also zeroed for sub-objects without an explicit initializer, or if bar has static or thread-local storage duration),
while every bit of *bar2 is set to zero, which might give completely different results:

Neither null-pointers (T*)0 nor floating-point-numbers with value 0 are guaranteed to be all-bits-0.
(In fact, until some time after C99, all-bits-0 was guaranteed to represent the value 0 only for char, unsigned char and signed char, as well as some of the optional exact-width types from <stdint.h>. A later technical corrigendum extended the guarantee to all integer types.)

The floating-point-format might not be IEEE754.

(On most modern systems you can ignore that possibility though.)
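Whether this matters on a particular implementation can be probed directly. The following is only a minimal sketch and not part of the original answer; the helper names (`all_bytes_zero`, `null_pointer_is_all_bits_zero`, `double_zero_is_all_bits_zero`) are made up for illustration. It inspects the object representation of a null `void *` and of `0.0` on the platform it is compiled for, and says nothing about other pointer or floating-point types.

```c
#include <stddef.h>

/* Returns 1 if every byte of the n-byte object at p is zero, 0 otherwise. */
static int all_bytes_zero(const void *p, size_t n)
{
    const unsigned char *bytes = p;

    for (size_t i = 0; i < n; i++) {
        if (bytes[i] != 0)
            return 0;
    }
    return 1;
}

/* Returns 1 if a null void pointer is stored as all-bits-zero here. */
int null_pointer_is_all_bits_zero(void)
{
    void *p = NULL;
    return all_bytes_zero(&p, sizeof p);
}

/* Returns 1 if the double value 0.0 is stored as all-bits-zero here. */
int double_zero_is_all_bits_zero(void)
{
    double d = 0.0;
    return all_bytes_zero(&d, sizeof d);
}
```

On mainstream platforms both functions are expected to return 1, but the C standard does not promise that.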

Quote from the C FAQ (thanks to Jim Balter for the link):

The Prime 50 series used segment 07777, offset 0 for the null pointer, at least for PL/I.

Deduplicator
  • 1
    For `struct foo bar = {0};`, I don't think there's any guarantee that the padding is zeroed. It probably will be, because it's just easier to implement that way -- and there almost certainly won't be any padding in a struct consisting of two `size_t` members. – Keith Thompson Oct 30 '14 at 22:21
  • @BillyONeal I know it was before the latest edit, but not before the one that said "completely different results: Neither null-pointers (T*)0 nor floating-point-numbers with value 0 are guaranteed to be all-bits-0." So my question stands. – Jim Balter Oct 30 '14 at 22:21
  • 1
    "On most modern systems you can ignore that possibility though." -- Yes, though such machines have existed and may still: http://c-faq.com/null/machexamp.html – Jim Balter Oct 30 '14 at 22:26
  • @JimBalter: Thanks for the link, those faqs can be really great for such. – Deduplicator Oct 30 '14 at 22:27
  • @KeithThompson: Right, if `bar` is automatic storage-class, only the padding for sub-objects without initializer is zeroed out. – Deduplicator Oct 30 '14 at 22:38
  • @Jim: Erm, no. When the answer was first posted it did not have that comment. It just said "these (calloc vs automatic storage) can be completely different" without justification. Hence my comment. The answer has been since fixed. – Billy ONeal Oct 30 '14 at 22:41
  • @BillyONeal Erm, yes, it did. I saw this answer before you commented. The answer has since been elaborated, but it had that text when you asked the question. That's a fact, and the contrary is a falsehood. So my question *still* stands. – Jim Balter Oct 30 '14 at 22:44
  • 1
    @Billy: My initial post already said null-pointers and floating-0 need not be all-bits-0. I completely reformulated that though, so I doubt he actually quoted my first version ;-). – Deduplicator Oct 30 '14 at 22:44
  • @Jim: No, it did not. – Billy ONeal Oct 30 '14 at 22:46
  • 2
    So just to recap to see if I understood you correctly, `{0}` sets the struct members to what ever the appropriate bit representation of a zero value is while `calloc` simply sets all bits to zero? – Etheryte Oct 30 '14 at 22:46
  • That one sentence, "Neither null-pointers (T*)0 nor floating-point-numbers with value 0 are guaranteed to be all-bits-0." has not changed throughout. Billy: The author himself has weighed in and said it did. Rewriting your memory to avoid having made a mistake is futile. – Jim Balter Oct 30 '14 at 22:46
3

calloc gives you a dynamically allocated, zeroed block of heap memory (pointed to by your bar2). An automatic variable (like bar, assuming its declaration is inside a function) is instead allocated on the call stack. See also calloc(3).

In C, you must explicitly free heap-allocated memory. Stack-allocated data, by contrast, is popped when its function returns.

Also read the wiki pages on C dynamic memory allocation and on garbage collection. Reference counting is a widely used technique in C and in C++, and could be viewed as a form of GC. Think about circular references; they are hard to handle.

The Boehm conservative GC can be used in C programs.

Notice that the liveness of a memory zone is a global, program-wide property. You generally cannot claim that a given zone belongs to a particular function (or library). But you could adopt conventions about that.

When you code a function returning a heap-allocated pointer (i.e. some pointer to dynamic storage) you should document that fact and decide who is in charge of freeing it.
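One way to write such a convention down is a create/destroy pair whose comments state who owns the memory. This is a sketch only: `foo_create` and `foo_destroy` are hypothetical names, and `struct foo` is repeated from the question so the fragment stands alone.

```c
#include <stdlib.h>

struct foo {
    size_t a;
    size_t b;
};

/*
 * Returns a zeroed struct foo with dynamic storage duration, or NULL if
 * the allocation fails. Ownership passes to the caller, who must release
 * the object with foo_destroy().
 */
struct foo *foo_create(void)
{
    return calloc(1, sizeof(struct foo));
}

/* Releases an object obtained from foo_create(); passing NULL is allowed. */
void foo_destroy(struct foo *f)
{
    free(f);
}
```

A caller would check the result of `foo_create()` against NULL before using it and call `foo_destroy()` exactly once when done.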

About initialization: memory obtained from calloc is zeroed (when calloc succeeds). An automatic variable initialized with {0} is also zeroed. In practice, some implementations of calloc handle big and small objects differently: big ones by requesting whole zeroed pages from the kernel (e.g. with mmap(2)), small ones by reusing, if available, a previously freed zone and zeroing it. Zeroing a zone uses a fast equivalent of memset(3).
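To make the "small object" path concrete, here is a deliberately simplified sketch of a calloc-like function built from malloc and memset. The name `zeroed_alloc` is hypothetical, and real allocators are considerably more elaborate (for instance, they may skip the memset for pages that the kernel already zeroed).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocates room for count objects of size bytes each and sets every
 * byte to zero. Returns NULL if count * size would overflow or if the
 * underlying malloc fails.
 */
void *zeroed_alloc(size_t count, size_t size)
{
    if (size != 0 && count > SIZE_MAX / size)
        return NULL;                    /* count * size would overflow */

    size_t total = count * size;
    void *p = malloc(total ? total : 1);   /* avoid a zero-size request */

    if (p != NULL)
        memset(p, 0, total);            /* all-bits-zero, like calloc */
    return p;
}
```

The overflow check is the part that a hand-written `malloc(count * size)` followed by `memset` most often forgets.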

PS. I am ignoring the weird machines on which an all-zero-bit memory zone is not cleared data as far as the C standard is concerned, i.e. is not equivalent to {0}. I don't know of such machines in practice, though I know they are possible in principle (and in theory the NULL pointer might not be an all-zero-bit word).

BTW, the compiler may optimize an all-zero local structure (and perhaps not allocate it at all on the stack, since it would fit in registers).

Basile Starynkevitch
  • 1
    If you're looking at the standard, dynamically allocated == object with dynamic storage duration, "on the call stack" == object with automatic storage duration. – Billy ONeal Oct 30 '14 at 22:15
  • Thanks for your answer, this question isn't so much about memory management in general, but about the functional difference of the two initializing methods. – Etheryte Oct 30 '14 at 22:17
3
struct foo bar = {0};

This defines an object of type struct foo named bar, and initializes it to zero.

"Zero" is defined recursively. All integer subobjects are initialized to 0, all floating-point subobjects to 0.0, and all pointers to NULL.

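As an illustration of the recursive rule, the sketch below initializes a nested aggregate with `{0}`. The types `struct inner` and `struct outer` and the function `make_zeroed_outer` are invented for this example and do not come from the question.

```c
#include <stddef.h>

struct inner {
    double ratio;
    const char *label;
};

struct outer {
    int id;
    struct inner parts[2];  /* array of nested structs */
    void *next;
};

/* Returns an object in which every member, at every depth, is zero. */
struct outer make_zeroed_outer(void)
{
    /* The 0 initializes id; everything after it is implicitly zeroed. */
    struct outer o = {0};
    return o;
}
```

After the call, `id` is 0, each `ratio` compares equal to 0.0, and each `label` and `next` compare equal to NULL, whatever bit patterns the implementation uses for those values.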
struct foo * bar2 = calloc(1, sizeof(struct foo));

IMHO this is better (but equivalently) written as:

struct foo *bar2 = calloc(1, sizeof *bar2);

By not repeating the type name, we avoid the risk of a mismatch when the code is changed later on.

This dynamically allocates an object of type struct foo (on the heap), initializes that object to all-bits-zero, and initializes bar2 to point to it.

calloc can fail to allocate memory. If it does, it returns a null pointer. You should always check for that. (The declaration of bar also allocates memory, but if it fails it's a stack overflow, and there's no good way to handle it.)

And all-bits-zero is not guaranteed to be the same as "zero". For integer types (including size_t), it's very nearly guaranteed. For floating-point and pointer types, it's entirely legal for 0.0 or NULL to have some internal representation other than all-bits-zero. You're unlikely to run into this, and since all the members of your structure are integer you probably don't need to worry about it.
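If a structure does contain pointers or floating-point members and you want to avoid relying on all-bits-zero, one option is to allocate with malloc and then assign a zero-initialized compound literal (C99 and later). This is a sketch under that assumption; `struct node` and `node_new_zeroed` are hypothetical names, not something from the question.

```c
#include <stdlib.h>

struct node {
    struct node *next;
    double weight;
    size_t count;
};

/*
 * Returns a dynamically allocated node whose members hold the values
 * NULL, 0.0 and 0, or NULL if the allocation fails.
 */
struct node *node_new_zeroed(void)
{
    struct node *n = malloc(sizeof *n);

    if (n == NULL)
        return NULL;

    /* Value-wise zeroing, with the same meaning as struct node x = {0}; */
    *n = (struct node){0};
    return n;
}
```

The caller remains responsible for calling `free` on the returned pointer.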

Keith Thompson
  • Thank you for your answer, also, thanks for the tip on allocation syntax. Could you please elaborate under what circumstances would your last paragraph be a problem? The struct shown in the question is just a random example. – Etheryte Oct 30 '14 at 22:21
  • 4
    @Nit: In practice, it's not likely to be a problem on any system you're likely to use. I don't think I've ever actually seen a C implementation that uses anything other than all-bits-zero for null pointers or floating-point `0.0`. On the other hand, as a matter of style, I like to write code that's as portable as practical (but no more so). Very often more portable code turns out to be simpler, and it gives you one less thing to worry about. – Keith Thompson Oct 30 '14 at 22:33
  • "it's very nearly guaranteed" -- What's the exception? – Jim Balter Oct 30 '14 at 22:41
  • @JimBalter: The standard guarantees (starting with one of the post-C99 technical corrigenda) that all-bits-zero is a representation of `0` for any integer type. That doesn't mean it's the *only* representation of `0`. In principle, `struct foo bar = {0};` *could* set `bar.a` and `bar.b` to a different representation of `0` than `calloc()` does. This is vanishingly unlikely. Prior to that C99 TC, it was possible in principle that all-bits-zero wouldn't even be a representation of `0`. The committee presumably felt free to add the explicit requirement because ... – Keith Thompson Oct 30 '14 at 23:07
  • ... no existing implementation actually did something that bizarre. There's probably no reason for anyone other than a language lawyer to worry about it. (Pointers and floating-point, on the other hand ...) – Keith Thompson Oct 30 '14 at 23:08
  • The difference is allocation on stack vs allocation from the heap. All the rest is irrelevant nonsense. – littleadv Oct 31 '14 at 05:23
  • What about the padding? Is it initialized to 0? I tend to cast stuff every now and then. –  Mar 10 '20 at 23:02
  • 1
    @traducerad `calloc`, like `memset`, will initialize everything to all-bits-0, including any padding. But "When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values." -- C11 6.2.6.1 paragraph 6. Normally padding bytes don't matter, so the language doesn't go out of its way to guarantee their stability. Think of them as fragile. – Keith Thompson Mar 10 '20 at 23:10
1

(This answer focuses on the differences in initialization, in the case of a struct only containing integral types)

Both forms set a and b to 0. This is because the Standard defines that all-bits-zero for an integral type must represent a value of 0.

If there is structure padding, then the calloc version zeroes it, but zero-initialization may not. For example:

struct foo a = { 0 }, b = { 0 };
struct foo c, d; memset(&c, 0, sizeof c); memset(&d, 0, sizeof d);

if ( memcmp(&a, &b, sizeof a) )
    printf("This line may appear.\n");

if ( memcmp(&c, &d, sizeof c) )
    printf("This line must not appear.\n");

A technique you will sometimes see (especially in code designed to fit on systems with small amounts of storage) is that of using memcmp to compare two structs for equality. When there is padding between structure members, this is unreliable as the padding may be different even though the structure members are the same.

A programmer who does not want to compare structure members individually, because that costs too much code size, will instead copy structs around using memcpy and initialize them using memset, in order to preserve the ability to use memcmp to check for equality.


In modern programming I'd strongly advise against doing this, and recommend always using the { 0 } form of initialization. Another benefit of the latter is that there is no chance of making a mistake with the size argument and accidentally setting too much or too little memory.
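When an equality check is still needed, comparing members one by one does not depend on the contents of padding bytes. The sketch below uses a made-up type, `struct padded`, that is likely (though not guaranteed) to contain padding after `tag`; `padded_equal` is likewise a hypothetical helper.

```c
#include <stdbool.h>

struct padded {
    char tag;    /* typically followed by padding bytes */
    long value;
};

/* Compares the members only, so padding bytes never affect the result. */
bool padded_equal(const struct padded *x, const struct padded *y)
{
    return x->tag == y->tag && x->value == y->value;
}
```

This costs a little more code than a single memcmp call, but it works no matter how the two objects were initialized or copied.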

M.M
0

There is a serious difference: allocation of automatic variables is done at compile-time and comes for free (when the stack frame is reserved, the room is there). By contrast, dynamic allocation is done at run-time and has an unpredictable and non-negligible cost.

As regards initialization, the compiler has opportunities for optimization with automatic variables (for instance by not clearing if unnecessary); this is not possible with a call of calloc.

If you like the calloc style, you also have the option of performing memset on the automatic variable.

memset(&bar, 0, sizeof bar);

UPDATE: allocation of automatic variables is quasi-done at compile-time.

  • 1
    -1 Neither allocation nor initialization of auto variables occurs at compile time ... how could it? " this is not possible with a call of calloc" -- yes, it is, since calloc is a standard function and its behavior is specified by the standard, so compilers can count on it. – Jim Balter Oct 30 '14 at 22:38
  • My downvote is sincere and appropriate, and my explanation for it is valid. That you don't welcome it is irrelevant, other than being against the spirit of SO. " if strictly speaking the compiler does not allocate memory space" -- it's a **fact** that the compiler does not allocate memory space, it only *generates code* that allocates that space. Ditto for initialization. To call this "strictly speaking" is to suggest that to say *anything* happens at runtime is "strictly speaking" because the compiler generated the code at compile time. – Jim Balter Oct 30 '14 at 22:56