Are XOR linked lists still allowed in C++17?

208 Views Asked by At

XOR linked lists use pointer arithmetic in a way that looks suspicious to me given the changes in semantics introduced in C++17 (and discussed e.g. in Is a pointer with the right address and type still always a valid pointer since C++17?). Do they cause undefined behavior now? If so, can they be saved with launder?

EDIT:

The Wikipedia article contains just a short note about converting between pointers and integers. I tacitly assumed (and now am making it explicitly stated) that the pointers are first converted to and integer type of sufficient size to fit them, then XORing is done on the integers. The properties of XOR listed under Theory of operation thus guarantee that only integers obtained once from pointers will be converted back to them. The actual mapping from pointers to integers can be, per the standard, an arbitrary injection. I don't rely on any assumption beyond that.

Does the standard allow to use them and access the still existing objects? Before C++17? Since C++17?

3

There are 3 best solutions below

12
Jan Schultke On BEST ANSWER

It's implementation-defined, and still valid in C++17, at least for GCC. You cannot perform an xor operation between two pointers directly; you would have to go through reinterpret_cast<std::intptr_t>. The effect of this conversion (and back) is implementation-defined.

Implementation-defined means that the compiler must document what happens. What GCC provides is:

A cast from pointer to integer discards [...], otherwise the bits are unchanged.

A cast from integer to pointer discards [...], otherwise the bits are unchanged.

When casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined.

See https://gcc.gnu.org/onlinedocs/gcc/Arrays-and-pointers-implementation.html

From this description, we can conclude that:

  1. the user ensures that the object at the address stays the same between storing pointers in an XOR list and retrieving them
    • if the pointed-to object changes during this process, casting the integer back to a pointer is undefined behavior
  2. converting one past the end pointers, null pointers, and invalid pointers back and forth isn't explained here, but "preserves the value" according to the C++ standard

Implications for an XOR-list

Generally, this should make XOR-lists implementable, as long as we reproduce the same pointers that we stored, and don't "rug pull" nodes while there are XORed pointers to them.

std::intptr_t ptr;

// STORE:
// - bit-cast both operands and XOR them
// - store result in ptr
ptr = reinterpret_cast<std::intptr_t>(prev) ^ reinterpret_cast<std::intptr_t>(next);

// LOAD:
// - XOR stored ptr and bit-cast to node*
node* next = reinterpret_cast<node*>(ptr ^ reinterpret_cast<std::intptr_t>(prev));
// valid dereference, because at the address 'next', we still store the same object
*next;

As stated in the documentation, next "must reference the same object as the original pointer", so we can assume that next is now a pointer to an object, if such a pointer was originally stored in ptr.

However, it would be UB if we stored the XORed next pointer, began the lifetime of a new object where next points to, and then un-XORed the address and converted back to a pointer type.

1
Marshall Clow On

As far as I know, reinterpret_cast to and from std::uintptr_t should be fine.

4
Brian Bi On

XOR linked lists are still valid in C++17 and beyond, assuming that the type uintptr_t exists.

While it's true that a pointer that represents a particular address is not necessarily a pointer to an object that resides at that address, there is [expr.reinterpret.cast]/5:

[...] A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined. [ Note: Except as described in 6.7.4.3, the result of such a conversion will not be a safely-derived pointer value. — end note ]

What this tells us is that, while reinterpret_cast may not in general give you a pointer to an object, there is a special case where the operand is an integer value that was previously obtained from reinterpret_casting a pointer operand. The pointer value resulting from the round trip is the original pointer value, meaning that if the original pointer value pointed to an object, the result also points to that object (assuming the object still exists, of course).

But, the note tells us that, in C++17, the result of the conversion may not be a "safely-derived pointer value". What that means is that once you perform the pointer to integer conversion and do not keep a copy of that integer, but instead only store its value XOR other integers, the implementation is allowed to garbage collect the pointee because (in some technical sense that I won't get into here) the pointee is no longer "reachable". When you later reconstitute the original integer value by another XOR operation, and then attempt to reinterpret_cast it back to the original pointer, that pointer value is not "safely-derived" and is therefore considered invalid under some theoretical implementations (because the implementation might have garbage collected it already). But, if your implementation has "relaxed pointer safety", then this doesn't matter; the pointer is still valid. The design intent was that garbage-collected implementations would define themselves as having "strict pointer safety".

In practice, no implementation actually has "strict pointer safety" as specified in the standard, even though some garbage-collected C++ implementations do apparently exist. For this reason, the concept of strict pointer safety will be abolished in C++23. You can rest assured that XOR linked lists are valid, assuming that uintptr_t exists.