4

I've read that a HashSet in .NET 4 ignores duplicates. So this is what I do:

    HashSet<medbaseid> medbaseidlist = new HashSet<medbaseid>();

    for (int i = 2; i <= rowCount; i++)
    {
        // Build one object per spreadsheet row (row 1 is skipped).
        medbaseid medbaseid = new medbaseid() {
            mainClass = xlRange.Cells[i, 1].Value2.ToString(),
            genName = xlRange.Cells[i, 2].Value2.ToString(),
            speciality = xlRange.Cells[i, 3].Value2.ToString(),
            med_type_id = getId(xlRange.Cells[i, 4].Value2.ToString()),
            id = i - 1
        };

        medbaseidlist.Add(medbaseid);
    }

A medbaseid can have the same values as a previous object.

But if I check the HashSet at the end, it still contains duplicate items.

I added Equals and GetHashCode methods, but that didn't help. I also added an id to the class, so two objects can have the same content but different ids:

    public override bool Equals(object obj)
    {
        medbaseid medb = (medbaseid)obj;
        return (medb.id == this.id)
            && (medb.genName == this.genName)
            && (medb.mainClass == this.mainClass)
            && (medb.med_type_id == this.med_type_id)
            && (medb.speciality == this.speciality);
    }

    public override int GetHashCode()
    {
        return id;
    }

So my question now is: What am I doing wrong, or is this not the right way to use a HashSet? Thanks in advance for any help.

Olivier_s_j

5 Answers

11

It will depend on the implementations of GetHashCode() and Equals() on the medbaseid class.

See http://msdn.microsoft.com/en-us/library/system.object.gethashcode.aspx for more info.

By default objects will only compare as equal if they are literally the same object. Having the same "content" is not sufficient to make them equal. If you want two different objects with the same "content" to be equal, you must override Equals() to implement that logic. Whenever you override Equals() you must also override GetHashCode() for them to work correctly inside a hashing data structure like HashSet<>.
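As a minimal sketch of that default behavior, the snippet below uses a hypothetical `PlainItem` class (a stand-in for `medbaseid`, not the asker's actual type) that overrides neither `Equals()` nor `GetHashCode()`. Two instances with identical content are both kept by the set, because they are different object references.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for medbaseid with no Equals/GetHashCode overrides.
class PlainItem
{
    public string GenName { get; set; }
}

static class ReferenceEqualityDemo
{
    public static int CountAfterAddingTwice()
    {
        var set = new HashSet<PlainItem>();
        set.Add(new PlainItem { GenName = "aspirin" });
        // Same content, but a different instance: default equality
        // compares references, so this is treated as a new element.
        set.Add(new PlainItem { GenName = "aspirin" });
        return set.Count;
    }
}
```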

Daniel Renshaw
  • It's also worth mentioning that items in a hashset should be immutable (or at least the fields considered for equality should be immutable), otherwise they might be altered after insertion into the hashset such that the hashset will contain dupes. – spender Apr 25 '11 at 13:28
  • Did this, didn't help. Did I do something wrong? – Olivier_s_j Apr 25 '11 at 13:44
  • Your Equals method and your GetHashCode method appear to be considering different sets of fields. This will produce confusing results. – spender Apr 25 '11 at 13:57
  • @Ojtwist You shouldn't need to add an ID for this purpose. Of your original four fields, which combination of those values provides a unique value for each object? Even if it's only with the combination of all four values, you need to use the same combination in both `Equals()` and `GetHashCode()`. – Daniel Renshaw Apr 25 '11 at 14:04
  • @Daniel Renshaw GetHashCode returns an int, so I can't just concatenate my 4 values to make a unique id. Can I? – Olivier_s_j Apr 25 '11 at 14:32
  • @Ojtwist just call `GetHashCode()` on all objects (to get an integer representation for them) and do something like the answer to this question: http://stackoverflow.com/questions/2320808/overloading-gethashcode-and-the-equality-operator-using-the-xor-operator-on-enums – Daniel Renshaw Apr 25 '11 at 14:34
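The advice in these comments could be sketched roughly as follows. `ContentKeyedMed` is a simplified, hypothetical version of `medbaseid`, with property names chosen for illustration. It uses the same four content fields in both `Equals()` and `GetHashCode()` and leaves the id out of both. The multiply-and-add hash combination is one common convention, assumed here rather than taken from the thread (the linked question discusses XOR instead).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified version of medbaseid. Id is deliberately
// excluded from both Equals and GetHashCode.
class ContentKeyedMed
{
    public int Id { get; set; }
    public string MainClass { get; set; }
    public string GenName { get; set; }
    public string Speciality { get; set; }
    public int MedTypeId { get; set; }

    public override bool Equals(object obj)
    {
        // "as" yields null for a wrong type instead of throwing.
        var other = obj as ContentKeyedMed;
        if (ReferenceEquals(other, null)) return false;
        return MainClass == other.MainClass
            && GenName == other.GenName
            && Speciality == other.Speciality
            && MedTypeId == other.MedTypeId;
    }

    public override int GetHashCode()
    {
        // Combine the hash codes of the same fields Equals compares.
        // unchecked lets the arithmetic wrap around on overflow.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (MainClass == null ? 0 : MainClass.GetHashCode());
            hash = hash * 31 + (GenName == null ? 0 : GenName.GetHashCode());
            hash = hash * 31 + (Speciality == null ? 0 : Speciality.GetHashCode());
            hash = hash * 31 + MedTypeId;
            return hash;
        }
    }
}

static class ContentEqualityDemo
{
    public static int CountAfterAddingTwice()
    {
        var set = new HashSet<ContentKeyedMed>();
        set.Add(new ContentKeyedMed { Id = 1, MainClass = "A", GenName = "x", Speciality = "s", MedTypeId = 7 });
        // Different Id, same content: rejected as a duplicate.
        set.Add(new ContentKeyedMed { Id = 2, MainClass = "A", GenName = "x", Speciality = "s", MedTypeId = 7 });
        return set.Count;
    }
}
```

The hash code does not have to be unique. Two unequal objects may share a hash code; the set then falls back on `Equals()` to tell them apart. What matters is that equal objects always return the same hash code.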
2

For HashSet<medbaseid> to work properly, either medbaseid must be a struct or you have to define field-based equality on your class medbaseid by overriding Equals() and GetHashCode(). Alternatively, you can pass in a custom IEqualityComparer when you create the HashSet.

BrokenGlass
1

Did you override GetHashCode (and Equals)? In the standard implementation, different objects have different hashcodes, even if all the properties are equal.

Femaref
1

Sounds like you need to implement GetHashCode and equality members. Eric Lippert has an excellent post on this subject.

spender
1

Bear in mind that equality is in the eye of the beholder. Specifically, in order to be considered equal, two objects must possess the same hash code as returned by GetHashCode and must return true for Equals (two virtual/overridable methods found on the object base class).

In the case of a HashSet, you can also specify a custom equality comparer in the constructor that performs equality comparison and hash code generation.

http://msdn.microsoft.com/en-us/library/bb359100.aspx

The point is that the likely cause of your problem is that your medbaseid objects, although equal in the values of their members, are not actually equal by hash code and Equals. The default behavior for Equals and GetHashCode is based on object reference equality (that is, actually being the same instance of the object).

Override Equals and GetHashCode on medbaseid. Or define an IEqualityComparer<medbaseid> that does the comparison and specify it in the constructor for your HashSet.
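The comparer route might look something like the sketch below. `MedRecord` and `MedRecordContentComparer` are hypothetical names, and the field set is cut down to two properties for brevity. The comparer is passed to the `HashSet<T>(IEqualityComparer<T>)` constructor, so the class itself needs no overrides.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, cut-down record type with no equality overrides.
class MedRecord
{
    public int Id { get; set; }
    public string GenName { get; set; }
    public int MedTypeId { get; set; }
}

// Compares records by content only, ignoring Id.
class MedRecordContentComparer : IEqualityComparer<MedRecord>
{
    public bool Equals(MedRecord x, MedRecord y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (ReferenceEquals(x, null) || ReferenceEquals(y, null)) return false;
        return x.GenName == y.GenName && x.MedTypeId == y.MedTypeId;
    }

    public int GetHashCode(MedRecord obj)
    {
        if (ReferenceEquals(obj, null)) return 0;
        // Hash the same fields that Equals compares.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (obj.GenName == null ? 0 : obj.GenName.GetHashCode());
            hash = hash * 31 + obj.MedTypeId;
            return hash;
        }
    }
}

static class ComparerDemo
{
    public static int CountAfterAddingTwice()
    {
        var set = new HashSet<MedRecord>(new MedRecordContentComparer());
        set.Add(new MedRecord { Id = 1, GenName = "x", MedTypeId = 7 });
        // Same content, different Id: the comparer reports a duplicate.
        set.Add(new MedRecord { Id = 2, GenName = "x", MedTypeId = 7 });
        return set.Count;
    }
}
```

A comparer is a reasonable choice when the class cannot be modified, or when different sets need different notions of equality for the same type.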

Jeff
  • Added Equals and GetHashCode, but I still have duplicates. – Olivier_s_j Apr 25 '11 at 13:43
  • It sounds like you're saying the id value is different, correct? If so, you're going to end up with different hash codes and thus duplicate entries... – Jeff Apr 25 '11 at 14:50