16

I know there are existing questions roughly on this topic, but my brain's hurting and I can't find anything that makes this work.

I am trying to send a 16-bit unsigned integer over a Unix socket. To do so I need to convert a uint16_t into two chars, then read them in on the other end of the connection and convert them back into either an unsigned int or a uint16_t. At that point it doesn't matter if it uses 2 bytes or 4 bytes (I'm running 64-bit, that's why I can't use unsigned int :)

I'm doing this in C btw

Thanks

user207421
Michael Crook
  • I'll have a try when I get home from uni :\ this semester's been hell with everything going terribly... I got so many extensions, now all my assignments are in study break... Too bad study can't get an extension =.= gotta go to uni for a physics revision discussion. Thanks everyone for the fast responses – Michael Crook Nov 08 '12 at 05:21
  • Are you sure you want to use *plain char* and not *unsigned char* - or even better *uint8_t*? – Andrew Nov 08 '12 at 06:31

4 Answers

42

Why not just break it up into bytes with mask and shift?

 uint16_t value = 12345;
 char lo = value & 0xFF;
 char hi = value >> 8;

(edit)

On the other end, you assemble with the reverse:

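 uint16_t value = (uint8_t)lo | ((uint8_t)hi << 8);

The casts to uint8_t matter where plain char is signed: without them, a byte of 0x80 or above is sign-extended when promoted to int and corrupts the result.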
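
One thing to watch when the two bytes are held in plain char: on platforms where char is signed, a byte of 0x80 or more becomes negative and sign-extends when it is promoted. Below is a minimal sketch of a reassembly helper that guards against this. The name join_chars is hypothetical, not something from the answer.

```c
#include <stdint.h>

/* Hypothetical helper: rebuild a 16-bit value from two plain chars.
 * Converting each char to unsigned char first prevents sign extension
 * on platforms where plain char is a signed type. */
uint16_t join_chars(char lo, char hi)
{
    unsigned int low  = (unsigned char)lo;  /* 0..255, never negative */
    unsigned int high = (unsigned char)hi;  /* 0..255, never negative */
    return (uint16_t)(low | (high << 8));
}
```

With this helper, a value such as 36345 (high byte 0x8D, low byte 0xF9) should survive the round trip whether or not char is signed.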

Steven Sudit
  • 4
    Well, this answer saved me a lot of pain, even after I used it over a year after you wrote it. Thanks. – Marek Sep 07 '13 at 13:47
  • 8
    and a year later it is still useful, so don't stop spreading good vibe – ruisea Feb 21 '14 at 23:49
  • 3
    sorry all, I forgot all about the question... Was doing some spring cleaning on my profile and noticed this and went oopsies :( Gave this the correct answer as from the million years ago I did this, this does ring bells as how I did it... Plus everybody seems to like the answer. Sorry @StevenSudit – Michael Crook Oct 09 '14 at 02:59
  • I'm probably doing this wrong, but the reverse doesn't work for large numbers like `36345` unless I cast `lo` to a `uint8_t` first. `uint16_t value = (uint8_t)lo | hi << 8;` – teynon Jan 22 '21 at 04:14
4
char* pUint16 = (char*)&u16;

That is, cast the address of the uint16_t. Alternatively, copy the bytes out with memcpy:

char c16[2];
uint16_t ui16 = 0xdead;
memcpy( c16, &ui16, 2 );

c16 now contains the 2 bytes of ui16. At the far end you can simply reverse the process.

char* pC16 = /*blah*/
uint16_t ui16;
memcpy( &ui16, pC16, 2 );

Interestingly, although there is a call to memcpy, nearly every compiler will optimise it out because it's of a fixed size.

As Steven Sudit points out, you may get problems with endianness. To get round this you can use the htons (host-to-network short) function.

uint16_t ui16correct = htons( 0xdead );

and at the far end use ntohs (network-to-host short)

uint16_t ui16correct = ntohs( ui16 );

On a little-endian machine this will convert the short to big-endian and then at the far end convert back from big-endian. On a big-endian machine the 2 functions do nothing.

Of course if you know that the architecture of both machines on the network use the same endian-ness then you can avoid this step.

Look up ntohl and htonl for handling 32-bit integers. Some platforms also provide ntohll and htonll for 64-bit values, but these are not part of POSIX.
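
Putting the memcpy and htons/ntohs steps together, a sketch might look like the following. The function names pack_u16_net and unpack_u16_net are made up for illustration, and the code assumes a POSIX system that provides <arpa/inet.h>.

```c
#include <arpa/inet.h> /* htons, ntohs (POSIX) */
#include <stdint.h>
#include <string.h>    /* memcpy */

/* Hypothetical helper: store value in buf[0..1] in network
 * (big-endian) byte order, whatever the host's endianness. */
void pack_u16_net(unsigned char buf[2], uint16_t value)
{
    uint16_t wire = htons(value);      /* host order -> network order */
    memcpy(buf, &wire, sizeof wire);   /* copy the 2 raw bytes out */
}

/* Hypothetical helper: read a network-order value back from buf[0..1]. */
uint16_t unpack_u16_net(const unsigned char buf[2])
{
    uint16_t wire;
    memcpy(&wire, buf, sizeof wire);   /* copy the 2 raw bytes in */
    return ntohs(wire);                /* network order -> host order */
}
```

Because the buffer always holds the high byte first, the two ends can differ in endianness and still agree on the value.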

Goz
  • Ok, this will give you a char pointer into the int16, but what you find in the first element will depend on big-endian vs. small-endian. – Steven Sudit Nov 07 '12 at 22:10
  • You're right that memcpy will usually be optimized out, but this is _still_ system-dependent upon big/small-ends. – Steven Sudit Nov 07 '12 at 22:12
  • Ok, that's better, but it does seem like ntohs() is a lot of overhead for something that's pretty simple. – Steven Sudit Nov 07 '12 at 22:25
  • It's really not ... on many platforms it's a single instruction. On others still it's just an instruction modifier ... I wouldn't worry about it, tbh. – Goz Nov 07 '12 at 22:27
  • ntohs() is a function call. Perhaps it'll be inlined by some platforms, but the overhead of any function call is huge compared to what's either a no-op or some really cheap ASM. – Steven Sudit Nov 07 '12 at 22:39
  • I downvoted and I think I already explained why. Your solutions are either platform-dependent or heavier than they need to be. There's no reason for either of these: the right answer is to shift and mask. – Steven Sudit Nov 12 '12 at 19:57
  • 1
    @StevenSudit Function call overhead is 0 if the function gets inlined, and any reasonable compiler will inline a single-line function; it is not "huge". – Étienne Feb 10 '14 at 16:01
  • @Étienne With all due respect, I don't believe that's correct. If ntohs is declared in an external lib, which is typically the case, then it can't be inlined. – Steven Sudit Oct 30 '14 at 04:42
  • More here: http://stackoverflow.com/questions/7505388/inline-ntohs-ntohl-in-c-boost-asio – Steven Sudit Oct 30 '14 at 04:43
2

Sounds like you need to use the bit mask and shift operators.

To split up a 16-bit number into two 8-bit numbers:

  • you mask the lower 8 bits using the bitwise AND operator (& in C) so that the upper 8 bits all become 0, and then assign that result to one char.
  • you shift the upper 8 bits to the right using the right shift operator (>> in C) so that the lower 8 bits are all pushed out of the integer, leaving only the top 8 bits, and assign that to another char.

Then, once these two chars have been sent over the connection, you do the reverse on the receiving end: you shift what used to be the top 8 bits to the left by 8 bits, and then use bitwise OR to combine that with the other 8 bits.
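
The steps above can be sketched as a pair of small functions. The names split_u16 and join_u16 are hypothetical, and uint8_t is used instead of plain char so the bytes are always unsigned.

```c
#include <stdint.h>

/* Hypothetical helper: split a 16-bit value into two 8-bit halves. */
void split_u16(uint16_t value, uint8_t *lo, uint8_t *hi)
{
    *lo = (uint8_t)(value & 0xFF); /* mask: upper 8 bits become 0 */
    *hi = (uint8_t)(value >> 8);   /* shift: lower 8 bits are pushed out */
}

/* Hypothetical helper: recombine the two halves. */
uint16_t join_u16(uint8_t lo, uint8_t hi)
{
    /* shift the top half back into place, then OR in the bottom half */
    return (uint16_t)((hi << 8) | lo);
}
```

For example, splitting 0xABCD gives lo = 0xCD and hi = 0xAB, and joining those two bytes gives 0xABCD again.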

dst2
1

Basically you are sending 2 bytes over the socket. That's all the socket needs to know, regardless of endianness, signedness and so on. Just decompose your uint16 into 2 bytes and send them over the socket.

char byte0 = u16 & 0xFF;
char byte1 = u16 >> 8;

At the other end, do the conversion the opposite way: shift byte1 left by 8 bits and OR it with byte0.

Gianluca Ghettini
  • This is correct, but the '& 0xFF' for the high byte is unnecessary. It'll be optimized away, of course, but it's redundant and might confuse humans. – Steven Sudit Nov 07 '12 at 22:40
  • You should also cast - especially as, in your example, you are placing an UNSIGNED value into CHAR which (depending on the implementation) may be signed or unsigned. Also better to use *unsigned char* or *uint8_t* – Andrew Nov 08 '12 at 06:30
  • Casting 0-0xff into a signed char can overflow, which is technically undefined behavior and should be avoided. – user4815162342 Nov 08 '12 at 22:35
  • 1
    There's no such concept as "overflow" when you want just a raw byte [0x00-0xFF] and you aren't interested in its integer representation. Undefined behavior? What? Masking with 0xFF can lead to undefined behavior? Are you serious? – Gianluca Ghettini Nov 08 '12 at 22:42