
I am having trouble understanding how exactly my computer does bitwise operations depending on endianness. I've read this thread and this article, and I think I have confirmed my machine is little endian (I have tried several programs described in both sources, and all of them report that my machine is indeed little endian). I have the following macros defined that use SDL and swap 2-, 4- and 8-byte values if needed:

#if SDL_BYTEORDER == SDL_BIG_ENDIAN

#define HTON16(n) (n)
#define NTOH16(n) (n)
#define HTON32(n) (n)
#define NTOH32(n) (n)
#define HTON64(n) (n)
#define NTOH64(n) (n)
#else
    
#define HTON16(n) SDL_Swap16(n)
#define NTOH16(n) SDL_Swap16(n)
#define HTON32(n) SDL_Swap32(n)
#define NTOH32(n) SDL_Swap32(n)
#define HTON64(n) SDL_Swap64(n)
#define NTOH64(n) SDL_Swap64(n)
#endif

My problem is this: when writing a 2-byte number (here 43981 = 0xabcd) to a char[] starting at entry 0, the following code produces the first 2 bytes of data in little-endian order, i.e. 0xcd 0xab, when I'm trying to do the opposite:

char data[100];
int host_value = 43981; // 0xabcd
int net_value = HTON16(host_value);

data[0] = (net_value & 0xff00) >> 8;
data[1] = (net_value & 0xff);

My solution to this problem is simply not to use HTON16 on the host value, and to operate on it as if my machine were big endian. Also, on the same machine, the following way of writing the same host value to data does set the first two bytes to 0xab 0xcd:

*((unsigned short *)&data[0]) = HTON16(host_value);

I would like to understand why these two cases work differently. Any help is appreciated.

carce-bo
  • Please [edit] and show a [mcve] along with expected vs. actual output. – Jabberwocky Apr 05 '22 at 08:39
  • I would need to compile SDL in a sample program to reproduce the macros, which is why I did not. But to be more specific: assume there's a function with prototype void hexdump(char* data, int len); that writes a buffer's contents to stdout up to len. I would like, after doing my procedure, to call this function with data and len=2 and get abcd instead of cdab. – carce-bo Apr 05 '22 at 08:42
  • Please note that considering `data` is `char[]`, then type punning with invalid data type in `*((unsigned short *)&data[0])` is *strict aliasing violation* which leads to *undefined behaviour*. This means that your code will not work reliably unless you set your compiler to non-standard mode. If you don't know what this all means, don't use it. Use `memcpy` instead. – user694733 Apr 05 '22 at 08:42
  • It is just an example; in the real code there's a pointer and its malloc. – carce-bo Apr 05 '22 at 08:44
  • It could be... One thing I have not dug into is what SDL does to check the machine's endianness. Maybe it does nothing and, as you say, assumes big endian. – carce-bo Apr 05 '22 at 08:47
  • Perhaps your confusion is that you think bitwise operations and shifts have anything to do with endianness. – user253751 Apr 05 '22 at 08:59
  • I thought so, but I guess it doesn't? It blows my mind how exactly it does not matter; if you have any reference I could read, I would appreciate it. – carce-bo Apr 05 '22 at 09:01
  • Think of 43981 as "just a 16-bit number". You want `data[0]` to contain the most significant 8 bits of that 16-bit number, and one way to do that is `data[0] = (43981 & 0xff00) >> 8;`. Notice the lack of use of the `HTON(n)` macro in this expression. – Ian Abbott Apr 05 '22 at 09:21
  • Yes, it looks like everyone agrees that (2bytenum & 0xff00) >> 8 gives the most significant byte on any machine. – carce-bo Apr 05 '22 at 09:23
  • This is not a matter of agreement. :-D It is a matter of fact that in C the expression `(x >> 8) & 0xFF` or `(x & 0xFF00U) >> 8` evaluates to the upper byte of a 2-byte value. – the busybee Apr 05 '22 at 09:38

1 Answer


The problem is in these two lines:

data[0] = (net_value & 0xff00) >> 8;
data[1] = (net_value & 0xff);

If you run this on a little-endian machine, HTON16 has already swapped the bytes of 0xabcd into 0xcdab, so:

  • (net_value & 0xff00) >> 8 extracts 0xcd, the byte you wanted second;
  • (net_value & 0xff) extracts 0xab, the byte you wanted first.

Shifts and masks operate on the value of a number, not on its representation in memory, so they give the same result regardless of endianness. Serializing with bit operations is therefore already endianness-independent: drop the HTON16 call, and the shifts alone produce network byte order.

dulngix
  • Your answer is definitely coherent with my problem. I would like, though, to have some references for it, since stating something without documentation to back it up might get me in trouble. Could you give me a reference? – carce-bo Apr 05 '22 at 09:20
  • @carce-bo I doubt that any kind of reference can be found, because it's just simple bitwise math. If you do the calculations by hand on paper, you should be able to see why your code swaps the bytes twice. – user694733 Apr 05 '22 at 09:43
  • You can google the keywords "Big and Little Endian Byte Order", e.g.: https://betterexplained.com/articles/understanding-big-and-little-endian-byte-order/ or https://www.geeksforgeeks.org/little-and-big-endian-mystery/ – dulngix Apr 05 '22 at 10:17