ImmutableHashSet.Contains returns false


I have a set of base items (to be precise, an ImmutableHashSet<ListItem> from System.Collections.Immutable) and call the following code:

_baseList.Contains(derivedItem)

but this returns false, even though the following expressions all return true:

object.ReferenceEquals(_baseList.First(), derivedItem)
object.Equals(_baseList.First(), derivedItem)
_baseList.First().GetHashCode() == derivedItem.GetHashCode()

I can even write the following and it returns true:

_baseList.OfType<DerivedClass>().Contains(derivedItem)

What am I doing wrong? I would like to avoid writing the .OfType call.

Edit:

private ImmutableHashSet<BaseClass> _baseList;

public class BaseClass
{

}

public class DerivedClass : BaseClass
{

}

public void DoStuff()
{
    var items = _baseList.OfType<DerivedClass>().ToList();
    foreach (var derivedItem in items)
    {
        RemoveItem(derivedItem);
    }
}

public void RemoveItem(BaseClass derivedItem)
{
    if (_baseList.Contains(derivedItem))
    {
        //doesn't reach this place, since _baseList.Contains(derivedItem) returns false...
        _baseList = _baseList.Remove(derivedItem);
    }

    //object.ReferenceEquals(_baseList.First(), derivedItem) == true
    //object.Equals(_baseList.First(), derivedItem) == true
    //_baseList.First().GetHashCode() == derivedItem.GetHashCode() == true
    //_baseList.OfType<DerivedClass>().Contains(derivedItem) == true
}

Edit2:

Here is reproducible code for my problem. It seems that ImmutableHashSet<> caches the hash code and does not compare the item's current GetHashCode with the entries inside the set. Is there a way to tell ImmutableHashSet<> that the GetHashCode of the items could have changed, at least for the item I am currently checking, since it is the very same reference?

using System;
using System.Collections.Immutable;
using System.Linq;

namespace ConsoleApplication1
{
    class Program
    {
        private static ImmutableHashSet<BaseClass> _baseList;

        static void Main(string[] args)
        {
            _baseList = ImmutableHashSet.Create<BaseClass>();
            _baseList = _baseList.Add(new DerivedClass("B1"));
            _baseList = _baseList.Add(new DerivedClass("B2"));
            _baseList = _baseList.Add(new DerivedClass("B3"));
            _baseList = _baseList.Add(new DerivedClass("B4"));
            _baseList = _baseList.Add(new DerivedClass("B5"));

            DoStuff();
            Console.WriteLine(_baseList.Count); // output is 5, but it should be 0
            Console.ReadLine();
        }

        private static void DoStuff()
        {
            var items = _baseList.OfType<DerivedClass>().ToList();
            foreach (var derivedItem in items)
            {
                derivedItem.BaseString += "Change...";
                RemoveItem(derivedItem);
            }
        }

        private static void RemoveItem(BaseClass derivedItem)
        {
            if (_baseList.Contains(derivedItem))
            {
                _baseList = _baseList.Remove(derivedItem);
            }
        }
    }

    public abstract class BaseClass
    {
        private string _baseString;
        public string BaseString
        {
            get { return _baseString; }
            set { _baseString = value; }
        }

        public BaseClass(string baseString)
        {
            _baseString = baseString;
        }

        public override int GetHashCode()
        {
            unchecked
            {
                int hashCode = (_baseString != null ? _baseString.GetHashCode() : 0);
                return hashCode;
            }
        }
    }
    public class DerivedClass : BaseClass
    {
        public DerivedClass(string baseString)
            : base(baseString)
        {

        }
    }
}

If I change ImmutableHashSet<> to ImmutableList<>, the code works fine, so unless someone comes up with a better idea I will switch to the list.

There are 2 answers below.

Best answer

Objects that are used in dictionaries and other hash-based data structures should have an immutable identity: all such data structures assume that once you add an object, its hash code is not going to change.

This code is not going to work:

    private static void DoStuff()
    {
        var items = _baseList.OfType<DerivedClass>().ToList();
        foreach (var derivedItem in items)
        {
            derivedItem.BaseString += "Change...";
            RemoveItem(derivedItem);
        }
    }

    private static void RemoveItem(BaseClass derivedItem)
    {
        if (_baseList.Contains(derivedItem))
        {
            _baseList = _baseList.Remove(derivedItem);
        }
    }

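In RemoveItem(), as called by DoStuff(), _baseList.Contains() is going to return false for every single item, because you changed the identity of the stored item: GetHashCode() is computed from BaseString, so after the modification the item hashes to a different value than the one it was stored under.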
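One possible way around this, sketched under assumptions rather than as the definitive fix, is to take the item out of the set before changing it. The Item class and the RemoveThenChange helper below are hypothetical stand-ins for the question's BaseClass and DoStuff(); only ImmutableHashSet<T>.Remove and Enumerable.ToList are real library calls.

```csharp
using System;
using System.Collections.Immutable;
using System.Linq;

// Hypothetical stand-in for the question's BaseClass/DerivedClass:
// the hash code depends on the mutable Text property.
public class Item
{
    public string Text { get; set; }
    public Item(string text) { Text = text; }
    public override int GetHashCode() => Text != null ? Text.GetHashCode() : 0;
}

public static class RemoveBeforeMutate
{
    // Removes each item from the set first and only then mutates it.
    // Returns the resulting set; the set passed in is left untouched.
    public static ImmutableHashSet<Item> RemoveThenChange(ImmutableHashSet<Item> set)
    {
        foreach (var item in set.ToList())
        {
            set = set.Remove(item);    // hash code still matches the stored one
            item.Text += "Change...";  // safe: the item is no longer in this set
        }
        return set;
    }
}
```

Called with a set of three items, RemoveThenChange returns an empty set, because every lookup happens while the hash code is still the one the item was stored under.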

Second answer

I think you answered your own question in your edit. You can't have the hashCode change once you've added the item to the HashSet. That breaks the contract of how a HashSet works.
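If reference identity is all that matters for membership, as the question suggests, one option is to give the set a comparer that never looks at the object's fields. This is only a sketch: ReferenceComparer<T>, Node, and ReferenceSetDemo are hypothetical names introduced here, while RuntimeHelpers.GetHashCode and the ImmutableHashSet.Create overload taking an IEqualityComparer<T> are existing library members.

```csharp
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Runtime.CompilerServices;

// Hypothetical comparer: two items are equal only if they are the same
// instance, and the hash is based on object identity, not on GetHashCode().
public sealed class ReferenceComparer<T> : IEqualityComparer<T> where T : class
{
    public bool Equals(T x, T y) => ReferenceEquals(x, y);
    public int GetHashCode(T obj) => RuntimeHelpers.GetHashCode(obj);
}

// Hypothetical item whose own hash code depends on mutable state.
public class Node
{
    public string Text { get; set; }
    public Node(string text) { Text = text; }
    public override int GetHashCode() => Text != null ? Text.GetHashCode() : 0;
}

public static class ReferenceSetDemo
{
    // Adds a node, mutates it, then asks the set whether it still contains it.
    public static bool ContainsAfterMutation()
    {
        var node = new Node("B1");
        var set = ImmutableHashSet.Create<Node>(new ReferenceComparer<Node>()).Add(node);
        node.Text += "Change...";   // the identity-based hash is unaffected
        return set.Contains(node);  // true
    }
}
```

The trade-off is that two distinct instances with the same text count as two different elements, so this only fits when the set is meant to track instances rather than values.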

See this excellent article by Eric Lippert for more information on the topic.

In particular, it says the following:

Guideline: the integer returned by GetHashCode should never change

Ideally, the hash code of a mutable object should be computed from only fields which cannot mutate, and therefore the hash value of an object is the same for its entire lifetime.

However, this is only an ideal-situation guideline; the actual rule is:

Rule: the integer returned by GetHashCode must never change while the object is contained in a data structure that depends on the hash code remaining stable

It is permissible, though dangerous, to make an object whose hash code value can mutate as the fields of the object mutate. If you have such an object and you put it in a hash table then the code which mutates the object and the code which maintains the hash table are required to have some agreed-upon protocol that ensures that the object is not mutated while it is in the hash table. What that protocol looks like is up to you.

If an object's hash code can mutate while it is in the hash table then clearly the Contains method stops working. You put the object in bucket #5, you mutate it, and when you ask the set whether it contains the mutated object, it looks in bucket #74 and doesn't find it.

Remember, objects can be put into hash tables in ways that you didn't expect. A lot of the LINQ sequence operators use hash tables internally. Don't go dangerously mutating objects while enumerating a LINQ query that returns them!
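The guideline quoted above, computing the hash code only from fields that cannot change, could look roughly like the following. Entry, its Id property, and StableHashDemo are assumptions made up for illustration; the question's classes have no such immutable key.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: Id never changes after construction, Text may change freely.
public sealed class Entry : IEquatable<Entry>
{
    public int Id { get; }
    public string Text { get; set; }

    public Entry(int id, string text) { Id = id; Text = text; }

    // Equality and hashing use only the immutable Id.
    public bool Equals(Entry other) => other != null && Id == other.Id;
    public override bool Equals(object obj) => Equals(obj as Entry);
    public override int GetHashCode() => Id.GetHashCode();
}

public static class StableHashDemo
{
    // Mutates an entry while it is in the set and checks that lookup still works.
    public static bool ContainsAfterMutation()
    {
        var entry = new Entry(1, "B1");
        var set = new HashSet<Entry> { entry };
        entry.Text += "Change...";   // does not affect GetHashCode()
        return set.Contains(entry);  // true
    }
}
```

Because Text takes no part in Equals or GetHashCode, changing it while the entry sits in a hash-based collection does not move it to a different bucket.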

EDIT: BTW, your post and your subsequent edit are a perfect example of why you should always post complete, reproducible code for your problem from the beginning, instead of trying to filter out what you feel is irrelevant information. Pretty much anyone looking at your post an hour ago could have given you the correct answer in a split second had they had all the relevant information to begin with.