
I am struggling with bit masking or maybe handling bit overflow.

I receive data from a data stream into a char buffer, and I need to read the byte at a particular index of that buffer. When I do, I get unexpected results.

Something like this:

char buffer[BUFFER_SIZE];

/* ...recv from network stream performed... */

printf("buffer[index] = 0x%x\n", buffer[index]); /* => 0xFFFFFFB8 */

char dummyChar = buffer[index] & 0x000000FF;
printf("dummyChar = 0x%x\n", dummyChar); /* => 0xFFFFFFB8 */

The buffer is of type char. Why do I get a 32-bit value when I print buffer[index]?

I also masked buffer[index] with 0x000000FF, but I still get 0xFFFFFFB8. Why?

I only want to get 0xB8. Can anyone tell me how? I am pretty sure the value is 1 byte in size.

tversteeg

2 Answers


There are two reasons. The first is that an argument of an integer type smaller than int, when passed to a variadic function such as printf, is automatically promoted to int (which sign-extends the value if it is negative). The second is that the printf format "%x" is for printing unsigned int values, not bytes.

For unsigned bytes (unsigned char) you should use the modifier hh, as in "%hhx". Then printf knows that the passed value is really a single unsigned byte rather than a full unsigned int, and converts it back to unsigned char before printing.

Please see the above linked reference for more details.

Some programmer dude
  • I tried your suggestion, but it didn't work out. And not all of index of the buffer prints with 0xFFFFFF thing. some bytes are printed in correct way. and, when I casted (int)(buffer[index] & 0x000000FF), it gave me 0xB8... – Park Chansung Sep 27 '16 at 08:40
  • Your link says that `"%x"` requires an `unsigned int`, not an `int`. – mch Sep 27 '16 at 09:00

The character that you're trying to mask, buffer[index], has a negative value of -72, since the type char is signed on your machine. Its hexadecimal representation is 0xB8.

When the argument buffer[index] is passed to the printf function, integer promotion is performed on it. This means that its type, which is char, is promoted to type int. The value stays -72, but on your machine the representation of -72 in type int differs from the representation of -72 in type char:

In type char, -72 is represented as 0xB8.

In type int, -72 is represented as 0xFFFFFFB8

Performing the bitwise AND operation won't change the value:

char dummyChar = buffer[index] & 0x000000FF;

because [1]: (0xFF & 0xB8) == 0xB8. The character dummyChar still has the same value of -72. When dummyChar is then passed to the printf call, the same promotion happens as described above.


[1] The actual reality is slightly different, since integer promotions also happen with bitwise operators, but the outcome is the same. Both operands are promoted to int, then the bitwise AND is performed, and then the result is implicitly converted to char (implementation-defined for a signed char, but in practice it yields -72 again):

0xFFFFFFB8 & 0x000000FF == 0x000000B8
0x000000B8 == 0xB8

This leaves the value the same as it was before the operation.

2501
  • It's undefined behaviour to use `%x` with a negative value – M.M Sep 27 '16 at 09:12
  • That is debatable. I think I would agree, but that is not the topic of this question. – 2501 Sep 27 '16 at 11:04
  • @M.M, no, it isn't, it's perfectly defined behaviour. `-72` is converted to the integer `-72` and `-72` in hex is `0xFFFFFFB8`. – Luis Colorado Sep 27 '16 at 19:19
  • @LuisColorado You're wrong, the standard clearly says that `%x` must be supplied an `unsigned int` argument. These comments are not the place for an extended discussion. [See here](http://stackoverflow.com/questions/4664100/) for related discussion – M.M Sep 27 '16 at 22:17
  • Well, I'm afraid that `%x` was in the scenes clearly before men were distinguishing between differences in signed and unsigned `int`s. You are citing references that are newer than the first appearance of the `%x` format, and it's a pity to have to comply with old behaviour (being it defined or undefined). – Luis Colorado Sep 28 '16 at 09:48
  • @M.M You are correct that using negative values for `"%x"` is UB, however, it is allowed to use `int` when it is not negative. See footnote 31 in C99 or footnote 41 in C11. "The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions." – 12431234123412341234123 Sep 18 '20 at 08:08
  • @12431234123412341234123 footnotes are non-normative – M.M Sep 18 '20 at 20:30