4

Say I have an unsigned char (or byte) array. I want to take array[1] and array[2] from memory and read them together as one short int (2 bytes). Something similar to how a union works, but not starting from the first byte.

Example:

#include <stdio.h>

void main()
{
    unsigned char a[28];
    unsigned short t;

    a[0]=12;
    a[1]=10;
    a[2]=55;

    t=(short) *(a+1);

    printf("%i", t);
}

What I want is the value 14090 in decimal, or 370Ah.

Thank you.

EDIT: I forgot to say, but as most of you understood from my example, I am working on a little-endian machine: an 8-bit Atmel microcontroller.

Dragos Puri

3 Answers

7

It's very simple:

unsigned short t = ((unsigned)a[2] << 8) | a[1]; /* widen to unsigned so the shift cannot overflow a 16-bit int */

Note that this assumes unsigned char is 8 bits wide (CHAR_BIT == 8), which is the case on nearly every platform.
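The same idea can be wrapped in a small reusable function. This is only a minimal sketch: the name read_le16 is hypothetical and not part of the answer, and it assumes 8-bit bytes. Widening to unsigned before the shift avoids shifting into the sign bit on targets where int is 16 bits wide.

```c
#include <stdint.h>

/* Hypothetical helper: combine two consecutive bytes, low byte first,
   into one 16-bit value. Assumes 8-bit bytes. */
static uint16_t read_le16(const unsigned char *p)
{
    /* Widen to unsigned first so the shift is never done on a signed int. */
    return (uint16_t)(((unsigned)p[1] << 8) | p[0]);
}
```

With the array from the question, read_le16(a + 1) yields 0x370A (14090), whatever the host's byte order and whatever the alignment of a + 1.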

Fiddling Bits
  • This is machine alignment independent - good answer. Perhaps endianness could be addressed though, for good measure, as which byte is low and high could vary. – Michael Dorgan Jun 01 '15 at 18:31
  • There is no endianness problem, because the OP already identified the order of bytes for his purposes, and the approach in this answer will therefore always work by definition. – Les Jun 01 '15 at 18:37
4

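The expression in the question, (short)*(a+1), only reads the single byte a[1] and converts it to short, so it yields 10. The memory access it is presumably meant to perform, *(short *)(a+1), is not safe.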

If a+1 is not aligned to short (i.e., the address a+1 is not a multiple of the alignment of short, typically sizeof(short)), then the result of this operation depends on the compiler and the hardware at hand.

Platforms that support unaligned load/store operations can resolve it correctly, while others will "round it down" to the nearest address which is aligned to short.

In general, this operation yields undefined behavior.

On top of all that, even if you know for sure that a+1 is aligned to short, this operation will still give you different results between Big-Endian architecture and Little-Endian architecture.
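If the native in-memory interpretation really is what is wanted, one commonly used way to avoid the alignment problem is to copy the bytes into a short with memcpy rather than dereferencing a cast pointer. A minimal sketch follows; the function name load_native_u16 is hypothetical, and the result still depends on the machine's byte order.

```c
#include <string.h>

/* Hypothetical helper: read sizeof(unsigned short) bytes starting at p
   and interpret them in the machine's native byte order. */
static unsigned short load_native_u16(const unsigned char *p)
{
    unsigned short v;
    memcpy(&v, p, sizeof v);   /* byte-wise copy, no alignment requirement */
    return v;
}
```

For the question's array, and assuming a 2-byte short, load_native_u16(a + 1) gives 0x370A on a little-endian machine and 0x0A37 on a big-endian one.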

Here is a safe way to work around both issues:

short x = 0x1234;
switch (*(char*)&x) // Inspect the first byte of x in memory
{
    case 0x12: // Big-Endian
        t = ((unsigned)a[1] << 8) | a[2]; // Simulate t = *(short *)(a+1) on BE
        break;
    case 0x34: // Little-Endian
        t = ((unsigned)a[2] << 8) | a[1]; // Simulate t = *(short *)(a+1) on LE
        break;
}

Please note that the code above assumes the following:

  • CHAR_BIT == 8
  • sizeof(short) == 2

This is not necessarily true on every platform (although it is mostly the case).
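These two assumptions can be checked at compile time. A minimal sketch, assuming a C11 compiler (for _Static_assert); the function name combine_le is hypothetical:

```c
#include <limits.h>

/* Fail the build early if the assumptions above do not hold. */
_Static_assert(CHAR_BIT == 8, "code assumes 8-bit bytes");
_Static_assert(sizeof(short) == 2, "code assumes 2-byte short");

/* Hypothetical helper: low byte first, independent of host byte order. */
static unsigned short combine_le(unsigned char lo, unsigned char hi)
{
    return (unsigned short)(((unsigned)hi << 8) | lo);
}
```

On a platform where either assumption fails, compilation stops with the given message instead of silently producing a wrong value.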

barak manos
  • Another possible result is a SIGSEGV, or more likely, a SIGBUS from a misaligned access. – Andrew Henle Jun 01 '15 at 19:29
  • OP implies `a[2]` is more significant that `a[1]` with `a[1]=10; a[2]=55;` and " want ...370Ah". – chux - Reinstate Monica Jun 01 '15 at 19:31
  • @chux: I agree that in an array there is no such thing as MSB and LSB, and that one has to explicitly state their positions. However, hypothetically, if `(short) *(a+1)` was a safe operation, then its equivalent on BE would be `(a[1] << 8) | a[2]`. That is why I emphasized `Simulate`. – barak manos Jun 01 '15 at 19:37
  • @AndrewHenle: It really depends on the platform (designated compiler + underlying HW architecture), but I think that at least in some cases, it just loads from or stores into the nearest aligned address. If that address is not mapped, then that will obviously cause a bus-fault. I think that if the address **is** mapped, then at least on some platforms you'd just get the wrong value (which might lead to some weird stuff further ahead in the execution of the program, and this is of course even worse than an immediate runtime exception, because by then, you have no idea "where it came from"). – barak manos Jun 01 '15 at 19:42
  • Assuming OP truly wants "370Ah", then the answer should be _independent_ of the endian of the `short`. IOWs `t = (a[2] << 8) | a[1];`. – chux - Reinstate Monica Jun 01 '15 at 20:03
2
t= *(short *)(a+1);

You cast the pointer to the second element (a+1) to a pointer-to-short, and then dereference it.

Note that this is not very portable, and can go wrong if the machine is big-endian or requires aligned access. A better way would be:

t = ((unsigned)a[2] << CHAR_BIT) | a[1]; /* CHAR_BIT is defined in <limits.h> */

If you need the value exactly as the machine would read it natively, check the machine's endianness to decide which byte to shift and which one to leave in place. See here for how to check a machine's endianness.
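One common way to check endianness at run time is to look at the first byte of a known multi-byte value. A minimal sketch; the function name is_little_endian is hypothetical, and it assumes unsigned short is wider than one byte:

```c
/* Hypothetical helper: returns 1 on a little-endian host, 0 otherwise. */
static int is_little_endian(void)
{
    unsigned short probe = 1;
    /* Reading an object through an unsigned char pointer is always permitted. */
    return *(const unsigned char *)&probe == 1;
}
```

On the little-endian AVR from the question, this would be expected to return 1.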

mtijanic