Can anyone explain this enumerator syntax?

public static IEnumerable<T> Pipe<T>(this IEnumerable<T> source, Action<T> action)
{    
    return _(); IEnumerable <T> _()
    {
        foreach (var element in source)
        {
            action(element);
            yield return element;
        }
    }
}

I found this code in the MoreLINQ repo and can't understand this line:

return _(); IEnumerable <T> _()

There are 2 best solutions below

Best answer

I am the maintainer of MoreLINQ. Below, I am quoting from a pull request that will give you the background behind the use of local functions:

The purpose of this PR is to refactor the private iterator methods of all operators (where possible) as local functions (introduced with C# 7). The split was needed up until now in order to have arguments checked eagerly, when the operator method is invoked and not when the iterator is used the first time (which may be far from the call site). The same split can now be done via a local function with the added benefit of code simplification like:

  • parameters of parent method are in scope so the signature of the actual iterator implementation method becomes simpler since there is no need to pass on all arguments.
  • type parameters of parent method are in scope so don't need repeating.
  • there is one less top-level method to write.
  • iterator body appears in-line to the actual public operator method that uses it.
  • there is no need for debug-build assertions for arguments that have already been checked
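For comparison, here is a rough sketch of what the older split described in the quote could look like: a public method that checks arguments eagerly and a separate private iterator method. The names PipeOld and PipeImpl are hypothetical and chosen only for illustration; this is not the actual MoreLINQ source.

```csharp
using System;
using System.Collections.Generic;

public static class PipeBeforeLocalFunctions
{
    // Public operator (hypothetical name): not an iterator itself,
    // so the argument checks run as soon as it is called.
    public static IEnumerable<T> PipeOld<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (action == null) throw new ArgumentNullException(nameof(action));
        return PipeImpl(source, action);
    }

    // Private iterator (hypothetical name): repeats the type parameter
    // and needs every argument passed in again.
    private static IEnumerable<T> PipeImpl<T>(IEnumerable<T> source, Action<T> action)
    {
        foreach (var element in source)
        {
            action(element);
            yield return element;
        }
    }
}
```

With a local function, PipeImpl disappears and its body moves inside the public method, where source, action and T are already in scope.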

To address the choice of the style return _(); IEnumerable <T> _(), I'm going to quote the rationale I provided in pull request #360 to the project:

Putting the return statement and the local function declaration on a single line is designed to compensate for the fact that the language doesn't support anonymous iterators succinctly. There is no extra information or context being provided on that line so there's no clarity gained by separating the two except that it might appear a little unorthodox in styling. Just consider how many things are immaterial:

  • The name of the local iterator function so it is given the bogus name of _.
  • The return type of the local function because it's redundant with the outer method's return type.
  • The call to the local function never really executes because the iterator function becomes lazy so it's even somewhat misleading to highlight the call on its own.

What's in fact being returned is an iterator object with the body of its algorithm and so the style of putting it all on a single line is designed to make it appear just as that.
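The point about laziness can be seen in a small sketch. The class LazyIteratorDemo and the local function Numbers are hypothetical names used only for this illustration; the sketch records the order in which things happen when an iterator local function is called and then enumerated.

```csharp
using System;
using System.Collections.Generic;

public static class LazyIteratorDemo
{
    // Returns a log of events showing when the iterator body actually runs.
    public static List<string> Run()
    {
        var log = new List<string>();

        // Local iterator function; it captures `log` from the enclosing method.
        IEnumerable<int> Numbers()
        {
            log.Add("body started");
            yield return 1;
            yield return 2;
        }

        var sequence = Numbers();   // only creates the iterator object
        log.Add("after call");

        foreach (var n in sequence) // the body starts on the first MoveNext
            log.Add("got " + n);

        return log;
    }
}
```

Run() yields the entries "after call", "body started", "got 1", "got 2" in that order, so the call itself does not execute any of the iterator body.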

The origin and play on the styling come from the idea that…

"If you squint hard enough, you can almost believe that C# 7 now has anonymous iterators"

(from "Anonymous Iterators in C# 7, Almost")

See also some examples in #291.

Second answer

This code uses a relatively new C# feature called local functions. The only unusual thing about this function is its name: the developers used a single underscore. Hence, the name of the function is _, so the invocation looks like this: _()

Now that you know that the return statement returns the result of invoking a local function named _, the rest of the syntax falls into place:

// This is a local function
IEnumerable <T> _() {
    ...
}

(OP's comment on the question) Can't we just do foreach with yield return?

The method that you copied included two additional lines, which are key to understanding the difference:

public static IEnumerable<T> Pipe<T>(this IEnumerable<T> source, Action<T> action)
{    
    if (source == null) throw new ArgumentNullException(nameof(source));
    if (action == null) throw new ArgumentNullException(nameof(action));
    return _(); IEnumerable <T> _()
    {
        foreach (var element in source)
        {
            action(element);
            yield return element;
        }
    }
}

If you put foreach with yield return directly into the body of the Pipe<T> method, the whole method would become an iterator, and argument checking would be deferred until you start iterating the IEnumerable<T> result. With the local function in place, the checks run as soon as Pipe<T> is called, even when the caller never iterates the result.
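To make the difference concrete, here is a minimal sketch that puts both variants side by side. PipeDeferred, PipeEager and ThrowsOnCall are hypothetical names introduced for this illustration and are not part of MoreLINQ.

```csharp
using System;
using System.Collections.Generic;

public static class CheckTiming
{
    // The checks sit inside the iterator, so they are deferred to the first MoveNext.
    public static IEnumerable<T> PipeDeferred<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (action == null) throw new ArgumentNullException(nameof(action));
        foreach (var element in source)
        {
            action(element);
            yield return element;
        }
    }

    // The checks sit in a regular method; only the local function is an iterator.
    public static IEnumerable<T> PipeEager<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (action == null) throw new ArgumentNullException(nameof(action));
        return _(); IEnumerable<T> _()
        {
            foreach (var element in source)
            {
                action(element);
                yield return element;
            }
        }
    }

    // True if merely calling the operator (without iterating the result) throws.
    public static bool ThrowsOnCall(Func<IEnumerable<int>> call)
    {
        try { call(); return false; }
        catch (ArgumentNullException) { return true; }
    }
}
```

Under these assumptions, ThrowsOnCall(() => ((IEnumerable<int>)null).PipeEager(x => { })) returns true, while the same call with PipeDeferred returns false, because the deferred version only throws once enumeration begins.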