Evaluating sizeof long array of SIZE_MAX elements


Consider the following code:

#include <stdio.h>
#include <limits.h>
#include <inttypes.h>
#include <stddef.h>

int main(){
    size_t cnt = SIZE_MAX;
    size_t sz = sizeof(long[cnt]);
    printf("%zu\n", sz);
}

6.5.3.4/p2:

If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.

The question is whether evaluating sizeof on such an oversized array type is well defined. Since size_t is unsigned, the Standard guarantees that arithmetic on it wraps around with well-defined behavior (unlike signed overflow, which is undefined, or an out-of-range conversion to a signed type, where an implementation-defined signal might be raised).

The main issue I'm confused about is that

size_t sz = sizeof(long[SIZE_MAX]); //error: size of unnamed array is too large

does not even compile (Godbolt live example).


15
ad absurdum On BEST ANSWER

sizeof (long[SIZE_MAX]) won't compile because attempting to form the type long[SIZE_MAX] violates a "shall" requirement of the Standard. From §6.2.5 ¶28 of the C23 draft standard:

A complete type shall have a size that is less than or equal to SIZE_MAX.

The requirement in question is not listed under a "Constraints" heading, so compilers are not required to issue a diagnostic for it. In this case both GCC and Clang choose to fail and issue an error message, but more generally sizeof (long[SIZE_MAX]) has undefined behavior, since it violates a "shall" outside of an explicit constraints clause. I'd like to think that reasonable implementations would fail to compile with an error like this whenever an attempt is made to declare an array that cannot be supported.

It appears that this language did not appear in previous standards, but the Standards Committee determined "...that all interpret the current standard that huge objects make the behavior implicitly undefined." The Committee views this change not as introducing an undefined behavior, but as a clarification that makes this explicit.

8
Ted Lyngmo On

If the type of the operand is a variable length array type, the operand is evaluated; otherwise, the operand is not evaluated and the result is an integer constant.

Yes, it's well defined. It performs the sizeof(element)*number_of_elements calculation at runtime for variable length arrays. It doesn't matter that the result is large.

And Barmar correctly mentioned:

while it may be well defined, it might not be useful.

7
0___________ On

size_t is the type which must be large enough to accommodate the size of the largest possible object in your implementation. So it is well defined.

But if the real calculated size of your object (assuming infinite-precision integers) is greater than SIZE_MAX, the object can't be created or used in your program, so the result is completely useless.

0
John Bode On
The main issue I'm confused about is that
size_t sz = sizeof(long[SIZE_MAX]); //error: size of unnamed array is too large
does not even compile

SIZE_MAX is a constant expression and can be evaluated during translation; it is the largest value that can be represented by size_t, so the compiler knows that an array of SIZE_MAX elements of any type other than char will exceed SIZE_MAX bytes, and it rejects the code. It can do this for any constant expression whose value is greater than SIZE_MAX / sizeof (long).

cnt is not a constant expression, so sizeof( long[cnt] ) isn't evaluated until runtime and the compiler can't flag it as a problem.

So what happens?

6.2.5 Types

...

9 The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.

So, basically, size_t sz = sizeof( long[cnt] ) will give you some value that has been reduced modulo SIZE_MAX + 1. It won't be the size of the array in bytes; it will be some value that can fit in a size_t. Well-defined, but not terribly useful.