I tried to implement a custom LINQ Chunk function and found this code example.
This function should split an IEnumerable into an IEnumerable of batches of a fixed size.
    public static class EnumerableExtentions
    {
        public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int size)
        {
            using (var enumerator = source.GetEnumerator())
            {
                while (enumerator.MoveNext())
                {
                    int i = 0;
                    IEnumerable<T> Batch()
                    {
                        do yield return enumerator.Current;
                        while (++i < size && enumerator.MoveNext());
                    }
                    yield return Batch();
                }
            }
        }
    }
So I have a question: why are the results incorrect when I run a LINQ operation on the output? For example:

    IEnumerable<int> list = Enumerable.Range(0, 10);
    Console.WriteLine(list.Batch(2).Count()); // 10 instead of 5

My assumption is that this happens because the inner IEnumerable Batch() is only triggered when Count() is called, and something goes wrong there, but I don't know what exactly.
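It's the opposite. The inner `IEnumerable` is not consumed when you call `Count`. `Count` only consumes the outer `IEnumerable`, the one produced by the `while (enumerator.MoveNext())` loop that does `yield return Batch();`.

So what `Count` does is just move the enumerator to the end and count how many times it moved it, which is 10.

Compare that to how the author of this code likely intended it to be used: with an outer loop over the batches and an inner loop over the elements of each batch.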
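The nested-loop usage described above might look like the following sketch. The `Batch` method is copied from the question; `NestedLoopDemo` and `CollectBatches` are hypothetical names introduced only for illustration, and the parameter is called `someEnumerable` to match the wording of this answer.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EnumerableExtentions
{
    // The lazy Batch from the question, with unchanged behavior.
    public static IEnumerable<IEnumerable<T>> Batch<T>(this IEnumerable<T> source, int size)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                int i = 0;
                IEnumerable<T> Batch()
                {
                    do yield return enumerator.Current;
                    while (++i < size && enumerator.MoveNext());
                }
                yield return Batch();
            }
        }
    }
}

// Hypothetical helper, not part of the original code.
public static class NestedLoopDemo
{
    // Consumes the outer sequence and, inside it, every inner batch in order.
    public static List<List<int>> CollectBatches(IEnumerable<int> someEnumerable, int size)
    {
        var result = new List<List<int>>();
        foreach (var batch in someEnumerable.Batch(size)) // outer loop: one step per batch
        {
            var items = new List<int>();
            foreach (var item in batch) // inner loop: advances the shared enumerator
            {
                items.Add(item);
            }
            result.Add(items);
        }
        return result;
    }
}
```

Calling `NestedLoopDemo.CollectBatches(Enumerable.Range(0, 10), 2)` returns five lists of two elements each, because every inner batch is fully enumerated before the outer loop advances.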
Here the inner `IEnumerable`s are also consumed, using an inner loop, which runs the code inside the inner `Batch`. This yields the current element, then also moves the source enumerator forward. It yields the current element again before the `++i < size` check fails. The outer loop then moves the enumerator forward again for the next iteration. And that is how you have created a "batch" of two elements.

Notice that the "enumerator" (which came from the source sequence, `someEnumerable`) in the previous paragraph is shared between the inner and outer `IEnumerable`s. Consuming either the inner or the outer `IEnumerable` moves the enumerator, and it is only when you consume both in a very specific way that the sequence of events in the previous paragraph happens, giving you batches.

In your case, you can consume the inner `IEnumerable`s by calling `ToList` on each batch, for example `list.Batch(2).Select(x => x.ToList()).Count()`, which gives 5.

While sharing the enumerator allows the batches to be lazily consumed, it limits the client code to consuming them in very specific ways. In the .NET 6 implementation of `Chunk`, the batches (chunks) are eagerly computed as arrays.

You can do a similar thing in your `Batch` by calling `ToArray()` on the inner batch before yielding it (`yield return Batch().ToArray();`), so that the inner `IEnumerable`s are always consumed.
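A minimal sketch of that eager variant is shown below. It assumes the only change is the `ToArray()` call; the names `EagerEnumerableExtensions` and `BatchEager` are hypothetical and chosen here just to keep it separate from the original `Batch`.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical class and method names, used only for this illustration.
public static class EagerEnumerableExtensions
{
    public static IEnumerable<IEnumerable<T>> BatchEager<T>(this IEnumerable<T> source, int size)
    {
        using (var enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
            {
                int i = 0;
                IEnumerable<T> Batch()
                {
                    do yield return enumerator.Current;
                    while (++i < size && enumerator.MoveNext());
                }
                // ToArray() runs the inner iterator to completion, so the shared
                // enumerator has already moved past this batch when it is yielded.
                yield return Batch().ToArray();
            }
        }
    }
}
```

With this version, `Enumerable.Range(0, 10).BatchEager(2).Count()` evaluates to 5 even though the caller never touches the inner sequences. The trade-off is that each batch is buffered in memory rather than streamed lazily.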