TArray Result not always initially () within for loop?


Result in Test is NOT always initially ()

I have found "Do I need to setLength a dynamic array on initialization?", but I do not fully understand that answer.

More importantly, what is the best approach to "break" the Result/A connection (I need the loop)? Perhaps some way to force the compiler to "properly" initialize? Manually adding Result := nil as first line in Test?

function Test(var A: TArray<Integer>): TArray<Integer>;
begin
  SetLength(Result, 3); // Breakpoint here
  Result[0] := 2;
end;

procedure TForm1.Button3Click(Sender: TObject);
var
  A: TArray<Integer>; // Does not help whether A is local or global
  I: Integer;
begin
  for I := 1 to 3 do
    A := Test(A); // At Test breakpoint:
                  // * FIRST loop: Result is ()
                  // * NEXT loops: Result is (2, 0, 0)
                  //               modifying Result changes A (immediately)
  A := Test(A);   // Result is again ()
end;

Answer by hundreAd (best answer):

While I hesitate to call this compiler optimization of the for loop a bug, it is certainly unhelpful when array elements are modified directly:

function Test(var A: TArray<Integer>): TArray<Integer>;
begin
  if Length(Result) > 0 then // Breakpoint
    Result[1] := 66; // A modified!
  SetLength(Result, 3);
  Result[0] := Result[0] + 1; // A not modified
  Exit;
  A[9] := 666; // Force linker not to eliminate A
end;

After investigation, I conclude that functions that affect the entire array (e.g. SetLength, Copy, or any other function that returns TArray<Integer>) will, unsurprisingly, "break" the Result/A aliasing created by the for loop.
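The difference between writing an element and calling SetLength can be reproduced without any function call. The following is a minimal sketch, assuming a Delphi version that declares TArray<T> in the System unit; the program name and the variables A and B are made up for illustration.

```pascal
program SharedArrayDemo; // hypothetical name, illustration only
{$APPTYPE CONSOLE}

var
  A, B: TArray<Integer>;
begin
  SetLength(A, 3);
  A[0] := 1;

  B := A;          // B and A now reference the same array data
  B[1] := 66;      // element writes are not copy-on-write: A[1] changes too

  SetLength(B, 3); // SetLength makes B unique, so B gets its own copy
  B[0] := 99;      // A[0] is unaffected

  Writeln(A[0], ' ', A[1], ' ', B[0], ' ', B[1]); // prints "1 66 99 66"
end.
```

This mirrors what happens in Test: a write to Result[1] before SetLength goes to the shared data, while writes after SetLength go to a private copy.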

It would appear that the safest approach is (as per the answer linked to in the original post) to add Result := nil; as the first line in Test.

If there are no further suggestions, I will eventually accept this as the answer.

NOTE: As an added bonus, starting with Result := nil prevents SetLength from copying the array. That is obvious in hindsight, but for an array of 100000 elements looped 100000 times, this small change made execution about 40% faster.
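The Result := nil approach can be sketched as a small console program. This is only an illustration under the same assumptions as the question's code; the program name and the Main procedure are made up, and the increment is there to show that no stale contents survive between calls.

```pascal
program ResultNilDemo; // hypothetical name, illustration only
{$APPTYPE CONSOLE}

function Test(var A: TArray<Integer>): TArray<Integer>;
begin
  Result := nil;              // release whatever the hidden Result variable still holds
  SetLength(Result, 3);       // allocates a fresh, zero-filled array; nothing to copy
  Result[0] := Result[0] + 1; // always 0 + 1, regardless of earlier calls
end;

procedure Main;
var
  A: TArray<Integer>;
  I: Integer;
begin
  for I := 1 to 3 do
  begin
    A := Test(A);
    Writeln(A[0]); // prints 1 on every iteration
  end;
end;

begin
  Main;
end.
```

Without the Result := nil line, the increment would build on the value left over from the previous iteration, which is the behavior the question describes.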

Answer by Stefan Glienke:

The referenced question is about fields inside of a class and they are all zero-initialized and managed types are properly finalized during instance destruction.

Your code is about calling a function with a managed return type within the loop. A local variable of a managed type is initialized once - at the beginning of the routine. A managed return type under the hood is treated by the compiler as a var parameter. So after the first call, it passes what looks to be A to Test twice - as the A parameter and for the Result.

But your assessment that modifying Result also affects A (the parameter) is not correct, which we can prove by changing the code a bit:

function Test(var A: TArray<Integer>; I: Integer): TArray<Integer>;
begin
  SetLength(Result, 3); // Breakpoint here
  Result[0] := I;
end;

procedure Main;
var
  A: TArray<Integer>;
  I: Integer;
begin
  for I := 1 to 3 do
    A := Test(A, I);
                    
  A := Test(A, 0);
end;

When you single-step through Test, you will see that changing Result[0] does not change A. The compiler introduces a second, temporary variable that it passes for Result, and after the call to Test it assigns that temporary to A (the local variable). After that assignment both variables reference the same array, so SetLength inside the next call has to create a copy. You can see this in the disassembly view, which will look similar to this for the line in the loop (I use $O+ to make the code a little denser than it would be without optimization):

Project1.dpr.21: A := Test(A, I);
0041A3BD 8D4DF8           lea ecx,[ebp-$08]
0041A3C0 8D45FC           lea eax,[ebp-$04]
0041A3C3 8BD3             mov edx,ebx
0041A3C5 E8B2FFFFFF       call Test
0041A3CA 8B55F8           mov edx,[ebp-$08]
0041A3CD 8D45FC           lea eax,[ebp-$04]
0041A3D0 8B0DC8244000     mov ecx,[$004024c8]
0041A3D6 E855E7FEFF       call @DynArrayAsg
0041A3DB 43               inc ebx

Knowing that the default calling convention passes the first three parameters in eax, edx, and ecx, we know that eax is the A parameter, edx is I, and ecx is Result (the aforementioned Result var parameter is always last). We see that it uses different locations on the stack: [ebp-$04], which is the A variable, and [ebp-$08], which is the compiler-introduced variable. After the call, we see that the compiler inserted an additional call to System._DynArrayAsg, which assigns the compiler-introduced temp variable for Result to A.
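The calling sequence seen in the disassembly can be approximated in source form. The sketch below is not what the compiler literally emits; TestAsProc, AResult, and Temp are hypothetical names that make the hidden Result parameter and the compiler-introduced temporary explicit.

```pascal
program HiddenResultDemo; // hypothetical name, illustration only
{$APPTYPE CONSOLE}

// Test rewritten with the hidden Result parameter spelled out as the last parameter.
procedure TestAsProc(var A: TArray<Integer>; I: Integer;
  var AResult: TArray<Integer>);
begin
  Writeln('Length(AResult) on entry: ', Length(AResult));
  SetLength(AResult, 3); // copies if AResult still shares its data with A
  AResult[0] := I;
end;

procedure Main;
var
  A: TArray<Integer>;    // corresponds to [ebp-$04]
  Temp: TArray<Integer>; // stands in for the temporary at [ebp-$08]
  I: Integer;
begin
  // Managed locals are initialized once, on entry to Main, not per iteration.
  for I := 1 to 3 do
  begin
    TestAsProc(A, I, Temp); // Temp keeps the previous iteration's array
    A := Temp;              // the assignment done via _DynArrayAsg after the call
  end;
  Writeln('A[0] = ', A[0]);
end;

begin
  Main;
end.
```

Running this should report a length of 0 on the first call and 3 on the later ones, which matches what the breakpoint in the question shows for Result.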

[Screenshot: the second call to Test]