1

It is possible to find architectures where the char data type is represented on 8 bytes, i.e. 64 bits, the same as long long, and at the same time the Standard requires CHAR_MIN and CHAR_MAX to be bounded -- see 5.2.4.2.1 Sizes of integer types <limits.h> in the Standard ISO 9899.

I cannot figure out why these architectures chose to represent char this way, or how char values are represented in such a large space. So how are char values represented in such a case?

sizeof(char)=1 all the time. My question is, what is the value of sizeof(long long) and sizeof(int) on such an architecture ?

alinsoar

2 Answers

4

It is possible to find architectures where the char data type is represented on 8 bytes

No. That's because a char is defined to be a byte *). But a byte doesn't necessarily have 8 bits, which is why the term octet is sometimes used to refer to a unit of exactly 8 bits. There are architectures with more than 8 bits in a byte, but I doubt there's one with a 64-bit byte, although this would be theoretically possible.

Another thing to consider is that char (as opposed to many other integer types) isn't allowed to have padding bits, so if you ever found an architecture with 64-bit chars, CHAR_MIN and CHAR_MAX would be "very large" ;)


*) In fact, a byte is defined to be the unit of memory used to represent an encoded character, which is normally also the smallest addressable unit of the system. 8 bits is common; the Wikipedia article mentions that byte sizes up to 48 bits have been used. This might not be the best source, but still, finding a 64-bit byte is very unlikely.

  • The exact-width integer types are not allowed to have padding bits either – ad absurdum Oct 04 '17 at 07:56
  • I heard Robert Seacord saying so. So this is possible. – alinsoar Oct 04 '17 at 07:56
  • @DavidBowling don't want to mention this explicitly (too far from the question), but I added a little word to clarify ;) thanks. –  Oct 04 '17 at 07:59
  • if `char` is represented on 48 bits and `int` on 24, what is `sizeof(int)` ? – alinsoar Oct 04 '17 at 08:05
  • @alinsoar that's impossible. `char` (a byte) is **always** the smallest unit. –  Oct 04 '17 at 08:09
  • ok, so if char is represented on 64 bits, then all the other types are represented on 64 bits too, so that the integer conversion rank stays sound? – alinsoar Oct 04 '17 at 08:11
  • in case char is represented on 32 bits and int on 48, will we have 16 padding bits for each int variable? – alinsoar Oct 04 '17 at 08:17
  • yes, and `sizeof(int)` would still be 2 ... 48 value bits and 16 padding bits. –  Oct 04 '17 at 08:18
3

It is possible to find architectures where the char data type is represented on 8 bytes,

I don't know of any. BTW, it is not only a matter of architecture, but also of ABI. Also, you don't define what a byte is, and the bit size of char-s matters much more.

(IIRC, someone coded a weird implementation of C in Common Lisp on Linux/x86-64 which has 32-bit char-s; of course its ABI is not the usual Linux one!)

sizeof(char)=1 all the time. My question is, what is the value of sizeof(long long) and sizeof(int) on such an architecture ?

It would probably also be 1 (assuming char, int, long long all have 64 bits) unless long long is e.g. 128 bits (which is possible but unusual).

Notice that the C standard imposes minimal bounds and bit sizes (read n1570). E.g. long long could be wider than 64 bits. I have never heard of such C implementations (and I hope that when 128-bit processors become common, C will be dead).

But your question is theoretical. I know of no practical C implementation with 64-bit char-s or with a long long wider than 64 bits. In practice, assuming that char-s are 8 bits (but they could be signed or unsigned, and both exist) is a reasonable, but not universal, assumption.

Notice that C is not a universal programming language. You won't be able to code a C compiler for a ternary machine like Setun.

Basile Starynkevitch
  • to be more precise, I heard about such a case in the slides of Robert Seacord's lectures, but he does not provide many details. – alinsoar Oct 04 '17 at 07:32
  • `unsigned long long` has a bound of ``ULLONG_MAX = 18446744073709551615 = 2^64 − 1``... if we represent it on 128 bits how are the rest of 64 bits used ? – alinsoar Oct 04 '17 at 07:38
  • IIRC the standard defines *minimal* bounds. They could be greater! – Basile Starynkevitch Oct 04 '17 at 07:39
  • no, no, it defines min and max -- pasted: maximum value for an object of type unsigned long long int ULLONG_MAX 18446744073709551615 // 2^64 − 1 – alinsoar Oct 04 '17 at 07:39
  • look at `5.2.4.2.1 Sizes of integer types <limits.h>` – alinsoar Oct 04 '17 at 07:41
  • ``It probably would be also 1 (assuming char, int, long long all have 64 bits) unless long long is e.g. 128 bits (which is possible but unusual).`` Ok, how much is the `sizeof(int)` supposing int is represented on 24 bits in this case? – alinsoar Oct 04 '17 at 07:42
  • ah, it is written there -- `` Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.``.. I think this statement clarifies the things a lot. – alinsoar Oct 04 '17 at 07:43
  • Yes, that's the one that had me scratching my head too "*Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.*" Oh joy, more certainty... – David C. Rankin Oct 04 '17 at 07:46
  • I think, if char is represented on 64 then all the other types are represented on 64 bits and for each other type sizeof(T)=1. Correct ? – alinsoar Oct 04 '17 at 08:10
  • I believe that 64 bits `char`s could live with 64 bits `short`-s and 128 bits `long`-s. But your question is theoretical. I hope that by the time most processors would be 128 bits the C language will be dead. – Basile Starynkevitch Oct 04 '17 at 08:12
  • If memory serves, at least one iteration of Unix-for-Cray-supercomputers had an ABI with `CHAR_BIT==64` and `sizeof(long)==sizeof(int)==sizeof(short)==sizeof(char)==1`. But that was a very long time ago -- possibly before C89, even. – zwol Nov 22 '17 at 16:05