uint32_t u32 = 0;
uint16_t u16[2];
static_assert(sizeof(u32) == sizeof(u16), "");
memcpy(u16, &u32, sizeof(u32)); // defined?
// if defined, how do we access the data from here on?

Is this defined behaviour? And, if so, what type of pointer may we use to access the target data after the memcpy?

Must we use uint16_t*, because that is suitable for the declared type of u16?

Or must we use uint32_t*, because the type of the source data (the data memcpy copies from) is uint32_t?

(Personally interested in C++11/C++14. But a discussion of related languages like C would be interesting also.)

Aaron McDaid
  • It does now :-) (I had forgotten an `&`, and another typo) – Aaron McDaid Sep 20 '16 at 13:17
  • Are you trying to find out the endianness of your CPU? – Bob__ Sep 20 '16 at 13:22
  • @Bob__, I have no particular application in mind at the moment. But I will say that, depending on the answer, it might allow some aggressive optimizations I might consider in future in certain projects. – Aaron McDaid Sep 20 '16 at 13:24
  • @Bob__, maybe this better communicates what I am hoping to confirm: "*Given a variable declared to have type `X`, it will **always** be legal to access it via a pointer of type `X`. And `memcpy` into such a variable is always cool and won't change the type, as long as you stick to simple types where you can reason about the bit patterns (integers, pointers, ...) and don't have a problem with trap representations or alignment"*. – Aaron McDaid Sep 20 '16 at 13:30

2 Answers


Is this defined behavio[u]r?

Yes. Using memcpy to copy into a POD is well-defined, and you ensured that the sizes match.

Must we use uint16_t*, because that [is] suitable for the declared type of u16?

Yes, of course. u16 is an array of two uint16_ts so it must be accessed as such. Accessing it via a uint32_t* would be undefined behavior by the strict-aliasing rule.

It doesn't matter what the source type was. What matters is that you have an object of type uint16_t[2].
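
As a minimal sketch of the point above (the helper name `round_trip_through_u16` is made up for illustration), the bytes can be copied in, read through `uint16_t`, and copied back out again. Which half ends up in which element depends on the platform's endianness, so the sketch only checks the round trip:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copy the bytes of a uint32_t into a uint16_t[2],
// read the elements through their own type, then copy the bytes back.
std::uint32_t round_trip_through_u16(std::uint32_t u32) {
    std::uint16_t u16[2];
    static_assert(sizeof(u32) == sizeof(u16), "sizes must match");

    // Byte-wise copy into trivially copyable objects of matching size.
    std::memcpy(u16, &u32, sizeof(u32));

    // Access through uint16_t, the declared type of the elements.
    // Which element holds the high bits depends on endianness.
    std::uint16_t halves[2] = {u16[0], u16[1]};

    // Copy the same bytes back into a uint32_t instead of casting pointers.
    std::uint32_t back;
    std::memcpy(&back, halves, sizeof(back));
    assert(back == u32);
    return back;
}
```

Going back to a `uint32_t` is done with a second `memcpy` rather than a `uint32_t*` pointing into the array, which is the access the strict-aliasing rule forbids.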


On the other hand, this:

uint32_t p;
new (&p) uint16_t(42);
std::cout << p;

is undefined behavior, because now there is an object of a different type whose lifetime has begun at &p, and we're accessing it through the wrong type.
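
For contrast, here is a hedged sketch of what I believe is the well-defined way to read that new object: go through the pointer that placement new returns. The helper name `read_through_new_pointer` is hypothetical, and the sketch assumes the `uint32_t` storage is large enough and suitably aligned for a `uint16_t`, which the `static_assert`s check:

```cpp
#include <cstdint>
#include <new>

// Hypothetical helper: reuse the storage of a uint32_t for a uint16_t
// and read the new object through the pointer returned by placement new.
std::uint16_t read_through_new_pointer() {
    std::uint32_t p = 0;

    // Assumptions made explicit: the storage must fit and be aligned.
    static_assert(sizeof(std::uint16_t) <= sizeof(std::uint32_t), "storage too small");
    static_assert(alignof(std::uint16_t) <= alignof(std::uint32_t), "storage misaligned");

    // Placement new reuses the storage: the uint32_t's lifetime ends
    // and a uint16_t object with value 42 begins its lifetime there.
    std::uint16_t* q = new (&p) std::uint16_t(42);

    // q has the type of the object that now lives in the storage.
    return *q;
}
```

Reading `p` itself after the placement new is the part that remains undefined.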

Barry
  • One thing that worries me is that there are sentences in the C standard that suggest to me that the type is `uint32_t`, for example "[the type becomes ] ... the effective type of the object from which the value is copied ... ". I don't have any standards documents to hand, all I can do is copy and paste stuff that I find online - I found that sentence here: http://stackoverflow.com/questions/30970251/what-is-the-effective-type-of-an-object-written-by-memset – Aaron McDaid Sep 20 '16 at 13:26
  • @AaronMcDaid C++ is not C. – Barry Sep 20 '16 at 13:45
  • Except when C++ explicitly states that it *is* C, in some contexts. (See the answer from user2079303). – Aaron McDaid Sep 20 '16 at 13:46
  • Has this any advantage over a `reinterpret_cast`? Both should lead to implementation defined results, I think. – Bob__ Sep 20 '16 at 13:53
  • @AaronMcDaid I'm just going to repeat that C++ isn't C. `memcpy` doesn't begin the lifetime of an object, only acquiring storage and (where necessary) calling a constructor does. Your `uint16_t` array contains two `uint16_t`s. There is no object of type `uint32_t` there. – Barry Sep 20 '16 at 13:54
  • @Bob__ Accessing the object through the pointer you get from such a `reinterpret_cast` is UB. – Barry Sep 20 '16 at 13:54
  • @AaronMcDaid I should clarify that 6.5/6 from the C standard, which you and I quote, does not apply to C++. 7.24 and a few other explicitly listed chapters do apply to C++. – eerorika Sep 20 '16 at 14:09

The C++ standard delegates to the C standard:

The contents and meaning of the header <cstring> are the same as the C standard library header <string.h>.

The C standard specifies:

7.24.1/3 For all functions in this subclause, each character shall be interpreted as if it had the type unsigned char (and therefore every possible object representation is valid and has a different value).

So, to answer your question: Yes, the behaviour is defined.

Yes, uint16_t* is appropriate because uint16_t is the type of the object.


No, the type of the source doesn't matter.
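
A small sketch of that last point, assuming a made-up helper name `same_result_from_different_sources`: the same bytes are copied into a `uint16_t[2]` once from a `uint32_t` and once from an `unsigned char` array. It can only show that the observable values agree; it says nothing new about types, since the destination keeps its declared type either way.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: fill two uint16_t[2] arrays from sources of
// different types that hold the same bytes, then compare the results.
bool same_result_from_different_sources(std::uint32_t value) {
    // A second source with the same bytes but a different type.
    unsigned char bytes[sizeof(value)];
    std::memcpy(bytes, &value, sizeof(value));

    std::uint16_t from_u32[2];
    std::uint16_t from_bytes[2];
    static_assert(sizeof(from_u32) == sizeof(value), "sizes must match");

    std::memcpy(from_u32, &value, sizeof(value));   // source is a uint32_t
    std::memcpy(from_bytes, bytes, sizeof(bytes));  // source is unsigned char[4]

    // Both destinations are read through uint16_t, their declared type.
    return from_u32[0] == from_bytes[0] && from_u32[1] == from_bytes[1];
}
```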

The C++ standard doesn't specify such a thing as an object without a declared type, or how it would behave. I interpret that to mean that the effective type is implementation-defined for objects with no declared type.

Even in C, the source doesn't matter in this case. A more complete version of the quote from the C standard (draft, N1570) that you are concerned about, emphasis mine:

6.5/6 [...] If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. [...]

This rule doesn't apply, because the objects in u16 do have a declared type.

eerorika
  • A small follow up: the 'declared type' of an object, if it has one, can never change? And an object can never lose or gain a 'declared type'? But the 'effective type' can change in certain circumstances? – Aaron McDaid Sep 20 '16 at 13:43
  • @AaronMcDaid The declared type doesn't change. If an object has a declared type, then that is its effective type as well, and it doesn't change. The effective type can be gained or changed only if there is no declared type. – eerorika Sep 20 '16 at 13:49
  • I now think that a huge proportion of the strict-aliasing answers on other questions on Stack Overflow would make a lot more sense if these comments here about 'declared' and 'effective' type were prefaced to the start of every answer. – Aaron McDaid Sep 20 '16 at 14:14
  • @AaronMcDaid: I know of no definitions for "declared type" and "object" which will be consistent with all existing compiler behaviors. If a function receives a pointer of type `foo*`, compilers may treat its target as though it has a declared type of `foo`, regardless of whether it actually does or not. – supercat Sep 22 '16 at 15:51