I need a lambda that applies the (negative) discrete Laplace operator (a matrix) to a contiguous memory container (a vector) such as std::array or std::vector.
Is it undefined behavior to write it with std::transform, incrementing and decrementing pointers like this?
    auto A = [n,&h2](const auto & in, auto & out)
    {
        // First line of the matrix
        out[0] = (2.*in[0] - in[1])/h2 ;
        // Middle lines of the matrix
        std::transform(std::execution::par_unseq,
                       std::next(in.cbegin()),std::next(in.cend(),-1),std::next(out.begin()),
                       [&h2](const auto & val)
                       {
                           return (-*(std::next(&val,-1)) + 2.*val - *(std::next(&val)))/h2;
                       });
        // Final line of the matrix
        out[n-1] = (-in[n-2] + 2.*in[n-1])/h2;
    };
In other words, can parallel algorithms such as for_each and transform break the contiguity of the memory under some execution policy?
Edit 1: Note that I don't care which element of the vector in is processed first. What I do care about is whether *std::next(&val) inside the lambda passed to std::transform gives me the next element of the vector in, and not something undefined.
Edit 2: I'm thinking of policies that would imply copying values somewhere else (for instance into a SIMD register) and executing the lambda there before bringing the result back. Is there a condition in the standard, on the execution policy or on the parallel algorithm, that prevents that?

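No, this is not safe. [algorithms.parallel.exec]p3 says that, unless otherwise stated, implementations may make arbitrary copies of elements of type T from sequences where is_trivially_copy_constructible_v<T> and is_trivially_destructible_v<T> are true. Both hold for double, so val may refer to a temporary copy, and &val need not point into in.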
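The solution given in the note to that paragraph is to wrap the iterators so that they return a non-copied object such as std::reference_wrapper<T>; the function object then no longer depends on the identity of an element that may have been copied.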
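One way to read that note is sketched below, as an illustration rather than the only possible fix. Instead of a custom wrapping iterator, it materializes a std::vector of std::reference_wrapper<const double> over in and runs std::transform on that. The name neg_laplacian, the restriction to std::vector<double>, and the trailing optional policy argument are assumptions made for this example and do not come from the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <iterator>
#include <utility>
#include <vector>

// Hypothetical helper (name and signature are assumptions): applies the
// negative 1D discrete Laplacian to `in` and writes the result to `out`.
// An execution policy may be passed as the last argument; with no policy
// the serial overload of std::transform is used.
template <class... Policy>
void neg_laplacian(const std::vector<double>& in, std::vector<double>& out,
                   double h2, Policy&&... policy)
{
    const std::size_t n = in.size();  // assumes n >= 2 and out.size() == n

    // One wrapper per input element. The algorithm is allowed to copy the
    // wrappers, but every copy still refers to the original element of `in`.
    const std::vector<std::reference_wrapper<const double>> refs(in.cbegin(),
                                                                 in.cend());

    // First line of the matrix
    out[0] = (2.0 * in[0] - in[1]) / h2;

    // Middle lines of the matrix: the lambda receives a wrapper, not a double
    std::transform(std::forward<Policy>(policy)...,
                   std::next(refs.cbegin()), std::prev(refs.cend()),
                   std::next(out.begin()),
                   [h2](std::reference_wrapper<const double> r) {
                       const double* p = &r.get();  // address inside `in`
                       return (-p[-1] + 2.0 * p[0] - p[1]) / h2;
                   });

    // Final line of the matrix
    out[n - 1] = (-in[n - 2] + 2.0 * in[n - 1]) / h2;
}
```

Calling neg_laplacian(in, out, h2) runs serially; neg_laplacian(in, out, h2, std::execution::par_unseq) forwards the policy and needs #include <execution> (with libstdc++ the parallel policies may also require linking against TBB). The price is one extra vector of wrappers per call; a custom iterator that produces the wrappers on the fly would avoid that allocation.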