I want to be able to handle pointers to objects with an array member of unknown size, and access that array through a type-erased pointer to their common first member. My current attempt is the following:
#include <cstddef>

struct node_base
{
    node_base* next;
    int size;
};

template <int n>
struct node
{
    node_base base;
    char data[n];
    node() : base{nullptr, n} {
        static_assert(offsetof(node, data) == sizeof(node_base));
    }
};

void process_queue(node_base* head)
{
    while (head)
    {
        for (int i = 0; i < head->size; ++i)
        {
            *(reinterpret_cast<char*>(reinterpret_cast<char*>(head) + sizeof(node_base)) + i) = i;
        }
        head = head->next;
    }
}

int main()
{
    node<3> a{};
    node<4> b{};
    node<2> c{};
    c.base.next = &b.base;
    a.base.next = &c.base;
    process_queue(&a.base);
    return a.data[2] + c.data[1];
}
This code builds up a queue-like structure (nodes a, b and c pointing to each other as "a -> c -> b"), and passes a pointer to the first element to process_queue. That function then traverses the queue, accesses the node<n>::data array stored directly after the node_base member, and writes the values 0...n-1 into its entries.
The challenge is that the nodes have different types, so the queue's next pointers point to the node_base members of the actual nodes, and I need some way to get to the actual data from there.
Although this seems to work (Godbolt) in the sense that it successfully returns the expected value of 3, I am not sure whether this is allowed.
Question
Assuming I know by some method that the pointer cur points to the first member of an object with an array of size cur->size, is it legal to access the elements of the node<n>::data array by means of the code above? If not, can it be made legal without making sizeof(node<n>) larger?
Going strictly by the current standard, this is already undefined behavior, because the pointer arithmetic in reinterpret_cast<char*>(head) + sizeof(node_base) is undefined.
head is a pointer to a node_base object, which is not pointer-interconvertible with any char object at that location. Therefore reinterpret_cast<char*>(head) will also be a pointer to the same node_base object. As a consequence, the pointer arithmetic is undefined, because the pointed-to type of the expression (char) is not similar to the actual type of the pointed-to object (node_base).
However, your intent with the cast to char* here is to change the pointer value: you intend to obtain a pointer to the object representation of the node<n> object. Casts to char* are commonly used to access the object representation, but the standard doesn't actually provide for that.
The proposal P1839 attempts to incorporate this intended behavior into the standard. With its current wording in revision P1839R5 it would still not make your program well-defined, for multiple reasons:
First, because only reinterpret_cast<unsigned char*> could be used to obtain a pointer to the object representation, as noted in the limitations section of the proposal.
Even with unsigned char, there are still issues under the proposal:
Your classes happen to be standard-layout. That's a necessary condition for this to work at all. If they weren't standard-layout, then there generally wouldn't be any way to get from a pointer to one member to a pointer to another member.
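The standard-layout property mentioned above can be checked at compile time. The following is a minimal sketch, assuming C++17 or later; the node types are repeated from the question without the constructor, and mixed_access is a hypothetical counter-example that is not part of the original code.

```cpp
#include <cstddef>
#include <type_traits>

struct node_base
{
    node_base* next;
    int size;
};

template <int n>
struct node
{
    node_base base;
    char data[n];
};

// Both types from the question are standard-layout.
static_assert(std::is_standard_layout_v<node_base>);
static_assert(std::is_standard_layout_v<node<3>>);

// For a standard-layout class the first member sits at offset zero.
static_assert(offsetof(node<3>, base) == 0);

// Hypothetical counter-example (assumption, not from the question):
// mixing access control among the data members loses the property.
struct mixed_access
{
    node_base base;
private:
    char data[3];
};
static_assert(!std::is_standard_layout_v<mixed_access>);
```

If any of these assertions failed, the build would stop, so the check costs nothing at run time.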
But being standard-layout guarantees that the node<n> object is pointer-interconvertible with its first member subobject. As a consequence, under the proposal it is left open whether reinterpret_cast<unsigned char*>(head) will produce a pointer to the first element of the object representation of the node_base member or of the node<n> object. This is noted as an open issue in the proposal.
Assuming it did however produce a pointer to the object representation of the node<n> object as you intend, then the next question would be whether reinterpret_cast<unsigned char*>(head) + sizeof(node_base) + i would be a pointer into the object representation of the char array member of node<n> as well. I am not sure what the proposal intends for this.
But even if that wasn't an issue, the proposal defines only how it is possible to read from the object representation. Writing to it is out of scope and still UB under the proposal.
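The pointer-interconvertibility guarantee mentioned above can be used directly, without any byte arithmetic, as long as the concrete node<n> type is known at the point of the cast. The following is only a sketch of that idea and not something the answer itself proposes: it assumes the set of sizes in use is known in advance, and as_node, fill and demo are hypothetical helper names. sizeof(node<n>) is unchanged compared to the question.

```cpp
#include <cassert>
#include <type_traits>

struct node_base
{
    node_base* next;
    int size;
};

template <int n>
struct node
{
    node_base base;
    char data[n];
    node() : base{nullptr, n}, data{} {}
};

// Precondition (not checked by the language): head really is the `base`
// member of a node<n> with exactly this n.
template <int n>
node<n>* as_node(node_base* head)
{
    static_assert(std::is_standard_layout_v<node<n>>);
    assert(head->size == n);
    // A standard-layout object and its first member are
    // pointer-interconvertible, so this cast yields a pointer to the node<n>.
    return reinterpret_cast<node<n>*>(head);
}

template <int n>
void fill(node_base* head)
{
    node<n>* full = as_node<n>(head);
    for (int i = 0; i < n; ++i)
        full->data[i] = static_cast<char>(i);
}

// Assumption: only the sizes listed here ever appear in the queue.
void process_queue(node_base* head)
{
    for (; head; head = head->next)
    {
        switch (head->size)
        {
        case 2: fill<2>(head); break;
        case 3: fill<3>(head); break;
        case 4: fill<4>(head); break;
        default: assert(false && "unsupported node size");
        }
    }
}

// Rebuilds the queue a -> c -> b from the question.
int demo()
{
    node<3> a;
    node<4> b;
    node<2> c;
    c.base.next = &b.base;
    a.base.next = &c.base;
    process_queue(&a.base);
    return a.data[2] + c.data[1];
}
```

The obvious cost is the explicit switch: every size has to be listed, so this does not help when n is truly open-ended.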
So at the very least you would need to keep the outer reinterpret_cast<char*> and wrap it in a call to std::launder in order to obtain a pointer to the char object itself (rather than its object representation or the object representation of the node<n> object).