I am looking at migrating from Dictionary to ConcurrentDictionary for a multithreaded environment.
Specific to my use case, a key-value pair would typically be <string, List<T>>.
- What do I need to look out for?
- How do I implement it correctly for thread safety?
- How do I manage reading keys and values in different threads?
- How do I manage updating keys and values in different threads?
- How do I manage adding/removing keys and values in different threads?
The ConcurrentDictionary<TKey,TValue> collection is surprisingly difficult to master. The pitfalls that are waiting to trap the unwary are numerous and subtle. Here are some of them:

- Assuming that the ConcurrentDictionary<TKey,TValue> blesses with thread-safety everything it contains. That's not true. If the TValue is a mutable class (a List<T>, for example), and is allowed to be mutated by multiple threads, it can be corrupted just as easily as if it wasn't contained in the dictionary.
- Using the ConcurrentDictionary<TKey,TValue> with patterns familiar from the Dictionary<TKey,TValue>. Race conditions can trivially emerge. For example if (dict.ContainsKey(x)) list = dict[x] is wrong. In a multithreaded environment it is entirely possible that the key x will be removed between the dict.ContainsKey(x) and the list = dict[x], resulting in a KeyNotFoundException. The ConcurrentDictionary<TKey,TValue> is equipped with special atomic APIs (TryGetValue, TryAdd, TryRemove, TryUpdate, GetOrAdd, AddOrUpdate) that should be used instead of the previous chatty check-then-act pattern.
- Using Count == 0 for checking if the dictionary is empty. The Count property is very cheap for a Dictionary<TKey,TValue>, and very expensive for a ConcurrentDictionary<TKey,TValue>. The correct property to use is IsEmpty.
- Assuming that the AddOrUpdate method can be safely used for updating a mutable TValue object. This is not a correct assumption. The "Update" in the name of the method means "update the dictionary, by replacing an existing value with a new value". It doesn't mean "modify an existing value".
- Assuming that enumerating a ConcurrentDictionary<TKey,TValue> will yield the entries that were stored in the dictionary at the point in time that the enumeration started. That's not true. The enumerator does not maintain a snapshot of the dictionary. The behavior of the enumerator is not documented precisely. It's not even guaranteed that a single enumeration of a ConcurrentDictionary<TKey,TValue> will yield unique keys. In case you want to do an enumeration with snapshot semantics you must first take a snapshot explicitly with the (expensive) ToArray method, and then enumerate the snapshot. You might even consider switching to an ImmutableDictionary<TKey,TValue>, which is exceptionally good at providing these semantics.
- Assuming that everything exposed through the ConcurrentDictionary<TKey,TValue>'s interfaces is safe. This is not the case. For example the ToArray method is safe because it's a native method of the class. The ToList is not safe because it is a LINQ extension method on the IEnumerable<KeyValuePair<TKey,TValue>> interface. This method internally first calls the Count property of the ICollection<KeyValuePair<TKey,TValue>> interface, and then the CopyTo of the same interface. In a multithreaded environment the Count obtained by the first operation might not be compatible with the second operation, resulting in either an ArgumentException, or a list that contains empty elements at the end.

In conclusion, migrating from a Dictionary<TKey,TValue> to a ConcurrentDictionary<TKey,TValue> is not trivial. In many scenarios sticking with the Dictionary<TKey,TValue> and adding synchronization around it might be an easier (and safer) path to thread-safety. IMHO the ConcurrentDictionary<TKey,TValue> should be considered more as a performance optimization over a synchronized Dictionary<TKey,TValue> than as the tool of choice when a dictionary is needed in a multithreading scenario.
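
As a rough illustration of several of the points above, here is a minimal sketch rather than a definitive implementation. It assumes the values can be stored as ImmutableList<T> (from System.Collections.Immutable) instead of List<T>, which is only one possible way to avoid mutating a shared value; locking around each List<T> would be another. The Demo class, the Append helper, and the "numbers" key are hypothetical names made up for this example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Threading.Tasks;

public static class Demo
{
    // Hypothetical helper: appends an item by replacing the stored list with
    // a new immutable list, never by mutating the list that is already stored.
    // The delegates may run more than once under contention, so they must be
    // free of side effects.
    public static void Append<T>(
        ConcurrentDictionary<string, ImmutableList<T>> dict, string key, T item)
    {
        dict.AddOrUpdate(
            key,
            _ => ImmutableList.Create(item),
            (_, existing) => existing.Add(item));
    }

    public static void Main()
    {
        var dict = new ConcurrentDictionary<string, ImmutableList<int>>();

        // Many threads append to the same key concurrently.
        Parallel.For(0, 1000, i => Append(dict, "numbers", i));

        // One atomic lookup instead of ContainsKey followed by the indexer.
        if (dict.TryGetValue("numbers", out ImmutableList<int> list))
        {
            Console.WriteLine(list.Count); // prints 1000
        }

        // IsEmpty instead of Count == 0.
        Console.WriteLine(dict.IsEmpty); // prints False

        // Explicit snapshot with the native ToArray, then enumerate the snapshot.
        KeyValuePair<string, ImmutableList<int>>[] snapshot = dict.ToArray();
        foreach (KeyValuePair<string, ImmutableList<int>> entry in snapshot)
        {
            Console.WriteLine($"{entry.Key}: {entry.Value.Count} items");
        }

        // Atomic removal that also hands back the removed value.
        if (dict.TryRemove("numbers", out ImmutableList<int> removed))
        {
            Console.WriteLine(removed.Count); // prints 1000
        }
    }
}
```

The design choice here is that every "update" swaps in a new value, which matches what AddOrUpdate actually guarantees. If you keep List<T> as the value type, the dictionary's atomic APIs will not protect the list itself, and you would need your own synchronization around every read and write of it.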