For a given MaxDegreeOfParallelism and a fixed number of objects that need to be processed (i.e. have certain code executed on them), it would seem that Parallel.ForEach and an ActionBlock would be equally useful.
What considerations would need to be taken into account when choosing one over the other?
Yes, both the Parallel.ForEach and the ActionBlock<T> can be used for the purpose of processing a list of items in parallel. Between these two, the Parallel.ForEach is the more natural choice, because it communicates its purpose clearly and requires less study before using it. Both have gotchas that might catch you by surprise. Here are some things that you should keep in mind:
The Parallel.ForEach processes the items in an order that depends on the type of the source. If it's a list or an array, the order will be quite peculiar, because the Parallel.ForEach will partition the list into ranges and will assign a worker task to each range (range partitioning). So you'll see the items being processed like this: 1, 26, 51, 76, 2, 27, 52, 77..., instead of the quasi-sequential 1, 2, 4, 3, 5, 8, 6, 7 etc. If the source is an IEnumerable<T>, the order will be the natural start-to-end. The ActionBlock<T> processes the items in the order that you Post them, so there are no surprises there.
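As a rough illustration (a minimal sketch with a made-up item count and degree of parallelism, not taken from the question), the ordering difference can be observed like this:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

int[] source = Enumerable.Range(1, 100).ToArray();

// Range partitioning over an array: the observed order interleaves items
// from distant ranges, e.g. 1, 26, 51, 76, 2, 27, 52, 77...
Parallel.ForEach(source,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    item => Console.WriteLine($"Parallel.ForEach: {item}"));

// The ActionBlock<T> dequeues the items in the order they were posted
// (with MaxDegreeOfParallelism > 1 the completions may still interleave).
var block = new ActionBlock<int>(
    item => Console.WriteLine($"ActionBlock: {item}"),
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

foreach (int item in source) block.Post(item);
block.Complete();
await block.Completion;
```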
When the source is an IEnumerable<T>, the Parallel.ForEach uses chunk partitioning by default, meaning that it doesn't grab just one item from the source at a time. It accumulates items in small chunks, and then starts processing them. This might catch you by surprise if, for example, your source is a BlockingCollection<T>: you will add an item to the collection, the Parallel.ForEach won't process it immediately, and you'll wonder why. The ActionBlock<T> takes items into its own buffer one by one, so no surprises there.
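A minimal sketch of the two consumption patterns (the queue and the item values are hypothetical, and whether the delay is actually observable depends on the partitioner's internal chunk size at that moment):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

var queue = new BlockingCollection<int>();

// The Parallel.ForEach consumes the collection through its default chunking
// partitioner, so an item added below is not guaranteed to be processed promptly.
var consumer = Task.Run(() => Parallel.ForEach(
    queue.GetConsumingEnumerable(),
    new ParallelOptions { MaxDegreeOfParallelism = 2 },
    item => Console.WriteLine($"Parallel.ForEach processed {item}")));

queue.Add(1); // may linger in a partially filled partitioner chunk
queue.CompleteAdding();
consumer.Wait();

// The ActionBlock<T> takes items from its input buffer one by one.
var block = new ActionBlock<int>(
    item => Console.WriteLine($"ActionBlock processed {item}"),
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 2 });

block.Post(1); // processed as soon as a worker is available
block.Complete();
await block.Completion;
```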
The ActionBlock<T> by default has MaxDegreeOfParallelism = 1 (i.e. no parallelism). On the contrary, the Parallel.ForEach by default has MaxDegreeOfParallelism = -1 (i.e. unlimited parallelism). The Parallel.ForEach has by far the more dangerous default, because if you forget to configure the MaxDegreeOfParallelism it will quickly saturate your ThreadPool. With a saturated ThreadPool, other concurrent operations of your program will stutter.
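To avoid relying on either default, both APIs can be configured explicitly. A minimal sketch (the item range and the Process method are made up for illustration):

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

int[] items = Enumerable.Range(1, 20).ToArray();
void Process(int item) => Console.WriteLine($"Processing {item}");

// Parallel.ForEach: unlimited parallelism unless it is capped explicitly.
Parallel.ForEach(items,
    new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount },
    Process);

// ActionBlock<T>: sequential unless a higher degree of parallelism is requested.
var block = new ActionBlock<int>(Process,
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = Environment.ProcessorCount });

foreach (int item in items) block.Post(item);
block.Complete();
await block.Completion;
```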
The ActionBlock<T> has the annoying "by design" behavior of swallowing any OperationCanceledExceptions thrown by the action. So if processing an item can fail with an OperationCanceledException, i.e. if this exception denotes failure instead of cancellation, the ActionBlock<T> will complete happily with no exception, as if nothing happened, hiding from you the fact that the processing of some items actually failed.
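A minimal sketch of that pitfall (the failing action is contrived): the action below throws on every item, yet the block's Completion task does not become faulted.

```csharp
using System;
using System.Threading.Tasks.Dataflow;

Action<int> action = item =>
{
    // Here the exception denotes a failure, not a cooperative cancellation.
    throw new OperationCanceledException($"Item {item} failed");
};

var block = new ActionBlock<int>(action);
block.Post(1);
block.Post(2);
block.Complete();
await block.Completion; // does not throw; the per-item failures are silently swallowed
Console.WriteLine(block.Completion.Status); // RanToCompletion
```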
The peculiarities of the Parallel.ForEach that I mentioned earlier can be fixed easily by wrapping the source in an appropriate Partitioner, as shown in this answer. A more drastic fix is to switch to the newer Parallel.ForEachAsync API. Although the Parallel.ForEachAsync has Async in its name, it can process synchronous workloads just as easily and efficiently. Just return a ValueTask.CompletedTask from the body, and Wait the resulting Task. The Parallel.ForEachAsync employs no surprising chunking/partitioning strategies (although this is not documented). It processes the items in the natural start-to-end order. It is also less aggressive about hogging the ThreadPool, since by default it has MaxDegreeOfParallelism equal to Environment.ProcessorCount, which is a sensible default for most scenarios. Its worker tasks are synchronized asynchronously when they take items from the source, so if the source is an empty BlockingCollection<T> only one thread will be blocked. It lacks some functionality that the Parallel.ForEach has, like breaking and getting the LowestBreakIteration, but these features are rarely used in practice.
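A minimal sketch of the Parallel.ForEachAsync (.NET 6+) approach for a synchronous workload (the item range and the work inside the body are made up):

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

int[] items = Enumerable.Range(1, 100).ToArray();

// With no ParallelOptions the MaxDegreeOfParallelism defaults to
// Environment.ProcessorCount.
Task task = Parallel.ForEachAsync(items, (item, cancellationToken) =>
{
    Console.WriteLine($"Processing {item}"); // synchronous work in the body
    return ValueTask.CompletedTask;
});

task.Wait(); // block until all the items have been processed
```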