I'm trying to import data from a CSV file. Unfortunately, there is no primary key that would let me uniquely identify a given row, so I created a dictionary whose key is the value returned by GetHashCode. I use a dictionary because its lookup is much faster than searching with LINQ's Where over several properties.
My GetHashCode override looks like this:
public override int GetHashCode()
{
    unchecked
    {
        int hash = 17;
        hash = hash * 23 + this.Id.GetHashCode();
        // Parentheses are required: ?? binds more loosely than +,
        // so without them a null value would reset the whole hash to 0.
        hash = hash * 23 + (this.Author?.GetHashCode() ?? 0);
        hash = hash * 23 + (this.Activity?.GetHashCode() ?? 0);
        hash = hash * 23 + (this.DateTime?.GetHashCode() ?? 0);
        return hash;
    }
}
After fetching data from DB I do:
.ToDictionary(d => d.GetHashCode());
And here comes the problem: I checked the database and there are no duplicates with respect to these four properties. Yet when running the import I often get an error saying that the given key already exists in the dictionary. If I run the import again on the same data, everything runs fine.
How can I fix this error? The import application is written in .NET 5.
Id - long
Author, Activity - string
DateTime - DateTime?
Unfortunately, Id is more like a foreign key and is not unique: there may be many rows with the same Id, Author, and Activity but, for example, a different DateTime.
GetHashCode() does NOT produce unique values, so using it as a key in a dictionary can give you the errors that you have observed. On .NET Core and .NET 5+, string hash codes are also randomized per process, which is why the same data can collide on one run and not on the next.

You should implement GetHashCode() AND IEquatable<T> for your key type. Then you will be able to safely put instances of it into a hashing container, so long as there are no duplicate entries. (Items x and y will only be considered duplicates if the GetHashCode() values are the same AND x.Equals(y) returns true.)

So for example, your data key class could look like this:
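A minimal sketch of such a key class, assuming the four properties listed in the question. The name DataKey is hypothetical, and HashCode.Combine stands in for a hand-rolled hash:

```csharp
#nullable enable
using System;

// Hypothetical key type: the name DataKey is an assumption; the four
// properties mirror the ones listed in the question.
public sealed class DataKey : IEquatable<DataKey>
{
    public DataKey(long id, string? author, string? activity, DateTime? dateTime)
    {
        Id = id;
        Author = author;
        Activity = activity;
        DateTime = dateTime;
    }

    // Get-only properties keep the key immutable after construction.
    public long Id { get; }
    public string? Author { get; }
    public string? Activity { get; }
    public DateTime? DateTime { get; }

    // Two keys are equal only if all four values match.
    public bool Equals(DataKey? other)
    {
        if (other is null) return false;
        return Id == other.Id
            && Author == other.Author
            && Activity == other.Activity
            && DateTime == other.DateTime;
    }

    public override bool Equals(object? obj) => Equals(obj as DataKey);

    // Equal keys always produce equal hash codes; unequal keys may still collide,
    // and the container then falls back to Equals to tell them apart.
    public override int GetHashCode() => HashCode.Combine(Id, Author, Activity, DateTime);
}
```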
That's a lot of boilerplate code. Fortunately, if you are using a fairly recent version of C#/.NET (C# 9 or later), you can use the record type to simplify this to just:
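A sketch of the record form, assuming C# 9 or later (available with .NET 5). The name DataKey is again hypothetical:

```csharp
#nullable enable
using System;

// Hypothetical name. A positional record generates the constructor,
// init-only properties, Equals, and GetHashCode from this one line.
public sealed record DataKey(long Id, string? Author, string? Activity, DateTime? DateTime);
```

With such a key, the dictionary from the question could presumably be built with .ToDictionary(d => new DataKey(d.Id, d.Author, d.Activity, d.DateTime)). A genuinely duplicated row would still throw, but two different rows whose hash codes merely collide would not.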
The record type implements IEquatable<T> and GetHashCode() correctly for you (for the specific types long, string? and DateTime?).

Note that both the example types above are immutable. It's very important when using hashing containers that the properties of a key that contribute to GetHashCode() and Equals() are immutable. If you put an item in a hashing container and then change any of those properties, nasty things happen.
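To illustrate the kind of nasty thing meant here, the following sketch uses a deliberately mutable key. MutableKey and MutationDemo are hypothetical names made up for this example:

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical mutable key, for illustration only.
public sealed class MutableKey
{
    public long Id { get; set; }

    public override bool Equals(object? obj) => obj is MutableKey other && Id == other.Id;

    public override int GetHashCode() => Id.GetHashCode();
}

public static class MutationDemo
{
    // Reports whether the set can still find a key after the key was mutated.
    public static bool StillFound()
    {
        var key = new MutableKey { Id = 1 };
        var set = new HashSet<MutableKey> { key };

        // The set filed the key under the hash code for Id == 1.
        key.Id = 2;

        // The lookup now uses the hash code for Id == 2 and misses the stored entry.
        return set.Contains(key);
    }
}
```

Here StillFound() returns false even though the very same object is still in the set, which is why the key properties should not change after insertion.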