I'm reviewing some code (can't post all of it), but there's a function like this:
    template <typename DestType, typename SourceType>
    inline void transferDataAndUpdateSpan(MyArray<DestType>& to, MySpan<const SourceType>& source)
    {
        static_assert(sizeof(DestType) == sizeof(SourceType), "Data size mismatch!!");
        to.resize(source.size());
        memcpy(to.data(), source.data(), sizeof(SourceType) * source.size());
        source = { (SourceType*)to.data(), to.size() };
    }
MySpan is basically a typedef for std::span and MyArray is a container which has a constructor which receives a pointer to the data and the data size.
Question: isn't `source = { (SourceType*)to.data(), to.size() };` breaking strict aliasing here?
Is this triggering UB?
First of all, the `memcpy` itself has undefined behavior if `DestType` (and `SourceType`?) aren't trivially copyable or if the object representations in the `SourceType` objects aren't valid object representations for values of `DestType`.

A safer way to transfer the object representations would be to assign the result of `std::bit_cast` from the source element to the target element in a loop. It would at least verify trivial copyability and would also include the size check you do manually at the moment.

Then, you say "MyArray is a container which has a constructor which receives a pointer to the data and the data size", but the constructor isn't used anywhere. You are just copying object representations. So hopefully `to.data()` is actually a pointer into an array of `DestType` objects into which `memcpy` can copy the object representations.

Then, `(SourceType*)to.data()` is a C-style cast. C-style casts are discouraged for a reason, especially in generic code like this: depending on the types `SourceType` and `DestType`, the cast can have completely different meanings.

If `SourceType` is for example a base class of `DestType`, then the cast will be a `static_cast` and the result will be a pointer to the `SourceType` base class subobject. This does in general change the address of the pointer and, unlike a plain `static_cast`, a C-style cast performs this conversion even if the base class is inaccessible in the context (e.g. a `private` base class). Accessing the resulting pointer is generally fine, however doing pointer arithmetic on it (as your span probably does) would be UB, because the array into which the pointer points is an array of `DestType` objects, not `SourceType` objects. It is UB to do pointer arithmetic with a base class type into a derived class array.

If `SourceType` is a derived class of `DestType`, then the cast itself will still be a `static_cast`, but it will have undefined behavior, because it would try to downcast a `DestType` object to a derived `SourceType` object which doesn't exist. An exception applies if implicit object creation takes place, as detailed below.

If there is no such base class relation, then the cast will end up as a `reinterpret_cast`, which generally does not change the address.

Generally, a `reinterpret_cast` also doesn't change which object the pointer points to, e.g. with `SourceType = float` and `DestType = int`, `(SourceType*)to.data()` will point to a `DestType` object, not a `SourceType` object. In such a situation the aliasing rule as well as rules for e.g. member access expressions apply and will make almost any use of the resulting pointer UB, with very few exceptions.

An exception where a `reinterpret_cast` does change the pointer value to point to a different object is if there is an object of the target type that is pointer-interconvertible with the original object. That applies for example to first non-static data members of standard-layout classes without base classes. In that case aliasing can't be an issue because the resulting pointer will point to the actual subobject matching the type of the pointer. However, pointer arithmetic will still be faced with the exact same UB problem as stated above for `static_cast`.

Furthermore, a `reinterpret_cast` can result in an unspecified value (which then causes UB on access or use in most expressions) if the alignment of the address isn't sufficient for `SourceType`, which can happen if e.g. `SourceType` has a stricter alignment requirement than `DestType`.

In either case however, the cast itself will not cause UB (with the mentioned exception).
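To make the point about the C-style cast changing meaning a bit more concrete, here is a small sketch. The types `First`, `Second`, `Derived` and the helper `cStyleCastAddress` are made up for illustration and are not from your code. It only compares addresses and never dereferences the cast result, so it stays clear of the aliasing and pointer arithmetic problems described above.

```cpp
// Hypothetical types, only for illustration.
struct First  { int a; };
struct Second { int b; };
struct Derived : First, Second {};

// Hypothetical helper: applies the same C-style cast as in the question
// and returns the resulting address without ever dereferencing it.
template <typename To, typename From>
const void* cStyleCastAddress(From* p)
{
    // Base class relation between To and From: behaves like static_cast.
    // No such relation: behaves like reinterpret_cast.
    return (To*)p;
}
```

With a `Derived d;`, on the usual ABIs `cStyleCastAddress<Second>(&d)` gives a different address than `&d` (the cast is a `static_cast` that adjusts to the `Second` subobject), while `cStyleCastAddress<float>(&d)` gives the same address as `&d` (the cast is a `reinterpret_cast`). The exact layout of `Derived` is not mandated by the standard, so treat the concrete addresses as typical rather than guaranteed.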
Additional note: If `SourceType` is an implicit-lifetime type, `memcpy` could implicitly create `SourceType` objects in the `to.data()` storage, ending the lifetime of the previous `DestType` objects (and potentially the whole `MyArray` object). In that case there would (assuming the alignment is not a problem as stated above) be no issue with the cast to `SourceType` (except for a missing `std::launder`). However, accessing the `to` elements as `DestType` later, or potentially using `to` at all, would then cause UB for accessing out-of-lifetime objects (or cause UB when doing pointer arithmetic in the case that `DestType` is a base of `SourceType`). That's probably not the intended use case.

To simplify all of this: You can't have a memory region be used as `SourceType` and `DestType` at the same time if they differ in more than cv-qualification, with very few exceptions, and practically no exception if you also want to do pointer arithmetic on the memory region in both types at the same time.

All of the above is based strictly on what the standard says is or isn't UB. Practically speaking, I do not expect any compiler to behave unexpectedly when doing pointer arithmetic in the wrong type, as long as the types have the same size and alignment requirement. However, violations of the aliasing rule will cause problems in practice.