By-value and by-reference distinction between List<T> and Array for custom struct and built-in structs (like Int32)


I know there is a small but important distinction between using indexers with List<T> and with arrays: an array indexer gives access to the element itself (by reference), whereas the List<T> indexer returns a copy of the element. What I can't wrap my head around is that this rule seems to hold for custom structs, but not for built-in structs like Int32.

List<int> numbers;

List<Mutable> mutablesList; 
Mutable[] mutablesArray;

public struct Mutable
{
    public int X;
}

numbers[0]++; // OK
mutablesArray[0].X++; // OK
mutablesList[0].X++; // Compiler error

Unfortunately, I can't find any plausible explanation for this behavior.


shingo On

When you access mutablesList[0], the compiler emits a call to the indexer's get accessor, which loads the element onto the evaluation stack. A get accessor is just a method that returns a value, so for a struct element you always get a copy.

When you access an array element, .NET has dedicated IL instructions for the job, in particular these two: ldelem and ldelema.

ldelem loads the element value itself (a copy), while ldelema loads the address of the element.

The compiler chooses whichever instruction fits the expression.

For the expression mutablesArray[0].X++ or var x = mutablesArray[0].X, the compiler uses ldelema. Having the element's address is what allows it to modify the element's data in place.

For the expression var elem = mutablesArray[0], the compiler uses ldelem.
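
The difference can be observed without reading IL. The following is only a minimal sketch, not something from the original answer: the class name IndexerDemo and its method names are made up for illustration, and the ref local (C# 7.0 or later) is used just to make the "address of the element" idea visible in source code.

```csharp
using System;
using System.Collections.Generic;

public struct Mutable
{
    public int X;
}

// Hypothetical demo class; the names are illustrative only.
public static class IndexerDemo
{
    // Modifies the array element in place and returns the stored X.
    public static int IncrementInArray()
    {
        var mutablesArray = new Mutable[1];
        mutablesArray[0].X++;                      // works on the element itself
        ref Mutable first = ref mutablesArray[0];  // explicit reference to the same element
        first.X++;
        return mutablesArray[0].X;
    }

    // Modifies a copy taken from the list and returns the stored X.
    public static int IncrementCopyFromList()
    {
        var mutablesList = new List<Mutable> { new Mutable() };
        Mutable copy = mutablesList[0];  // the get accessor hands back a copy
        copy.X++;                        // only the local copy changes
        return mutablesList[0].X;
    }

    public static void Main()
    {
        Console.WriteLine(IncrementInArray());       // prints 2
        Console.WriteLine(IncrementCopyFromList());  // prints 0
    }
}
```

The array version sees both increments because each one operates on the element's storage. The list version leaves the stored element untouched, which is the silent bug the compiler error on mutablesList[0].X++ is protecting against.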

NetMage On

An array is a collection of variables that are named by their index into the array. So just like

Mutable a = new Mutable();
a.X++;

is reasonable,

mutablesArray[0].X++;

is reasonable.

On the other hand, a List<T> (or other collection) is a collection of objects (which are references for reference types, and values for value types). So

numbersList[0]++;

is equivalent to

numbersList[0] = numbersList[0] + 1;

which is reasonable as it replaces an element of the collection.

You can think of it as being interpreted something like

numbersList.this_set(0, numbersList.this_get(0) + 1);

But

mutablesList[0].X++;

attempts to modify an anonymous temporary struct that is a copy of the list element, which can be interpreted like:

mutablesList.this_get(0).X = mutablesList.this_get(0).X + 1;

This is different from what you are doing with the List<int>, where you are actually replacing the entire element (the int) with a new value. The same approach also works with Mutable:

mutablesList[0] = new Mutable { X = mutablesList[0].X + 1 };

is perfectly valid and works correctly.
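
If the struct has more than one field, a copy, modify, write-back pattern keeps the other fields intact. This is only a sketch under assumptions: the second field Y, the WriteBackDemo class and the IncrementX helper are hypothetical and not part of the question's code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical two-field variant of the question's struct.
public struct Mutable
{
    public int X;
    public int Y;
}

// Hypothetical helper class; the names are illustrative only.
public static class WriteBackDemo
{
    // Increments X of the element at the given index and returns the stored element.
    public static Mutable IncrementX(List<Mutable> list, int index)
    {
        Mutable temp = list[index];  // get accessor: copy the element out
        temp.X++;                    // temp is a variable, so modifying it is allowed
        list[index] = temp;          // set accessor: replace the whole element
        return list[index];
    }

    public static void Main()
    {
        var mutablesList = new List<Mutable> { new Mutable { X = 1, Y = 10 } };
        Mutable updated = IncrementX(mutablesList, 0);
        Console.WriteLine($"{updated.X}, {updated.Y}");  // prints 2, 10
    }
}
```

Unlike new Mutable { X = ... }, which would reset any field not listed in the initializer, copying the element into a local first carries Y along unchanged.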

Consider creating a method that returns a struct:

public static T StructCopy<T>(T m) where T : struct => m;

Then using the method is analogous to referring to the returned value from indexing the List<Mutable>:

StructCopy(m).X++;
StructCopy(m).X = StructCopy(m).X + 1;

In both cases, the compiler gives an error stating the return value of StructCopy isn't a variable. Neither is the return value of List<T>.this[].

It may help to realize the compiler doesn't special-case a struct with a single member. If Mutable had a second member, public int Y, then mutablesList[0].X++; would be an attempt to change part of the struct by modifying a temporary copy. The change would be discarded immediately, so the compiler reports an error instead.