使用Dictionary和HashSet的GetHashCode方法

我有一个关于Dictionary和HashSet如何在C＃中工作的问题。根据我的理解，GetHashCode用于哈希表以确定密钥唯一性。

在以下MSDN页面上，它指出：

哈希码是一个数值，用于插入和标识基于散列的集合中的对象，例如Dictionary类，Hashtable类或从DictionaryBase类派生的类型。

链接： MSDN Object.GetHashCode

如果是这种情况，为什么当包含与car1相同的哈希码时，ContainsKey和Contains会为car2返回false？如果我的理解是正确的，如果MSDN说的是正确的，那么这两个都不应该返回真的吗？

class Program { static void Main(string[] args) { // Create a Dictionary and HashSet Dictionary carDictionary = new Dictionary(); HashSet carSet = new HashSet(); // Create 3 Cars (2 generic and 1 Civic) Car car1 = new Car(); Car car2 = new Car(); Car car3 = new Civic(); // Test hash values int test1 = car1.GetHashCode(); // 22008501 int test2 = car2.GetHashCode(); // 22008501 int test3 = car3.GetHashCode(); // 12048305 // Add 1 generic car and 1 Civic to both Dictionary and HashSet carDictionary.Add(car1, 1); carDictionary.Add(car3, 1); carSet.Add(car1); carSet.Add(car3); // Why are both of these false? bool dictTest1 = carDictionary.ContainsKey(car2); // false bool setTest1 = carSet.Contains(car2); // false // Testing equality makes sense bool testA = car1.Equals(car2); // false bool testB = car1.Equals(car3); // false } } class Car { public override int GetHashCode() { return 22008501; } } class Civic : Car { public override int GetHashCode() { return 12048305; } }

因为ContainsKey的逻辑与此类似。

 //This is a simplified model for answering the OP's question, the real one is more complex. private List>> _buckets = //.... public bool ContainsKey(TKey key) { List> bucket = _buckets[key.GetHashCode() % _buckets.Length]; foreach(var item in bucket) { if(key.Equals(item.Key)) return true; } return false; }

所有GetHashCode都会获得你的密钥进入的存储桶，它仍然必须通过该存储桶的每个成员并通过Equals方法找到完全匹配。这就是为什么拥有良好的哈希码很重要，一个桶中的项目越少， foreach部分就越快。最佳可能的哈希码每个桶只有一个项目。

这是HashSet上包含的实际代码（Dictionary的ContainsKey非常相似，但更复杂）

 private int[] m_buckets; private Slot[] m_slots; public bool Contains(T item) { if (m_buckets != null) { int hashCode = InternalGetHashCode(item); // see note at "HashSet" level describing why "- 1" appears in for loop for (int i = m_buckets[hashCode % m_buckets.Length] - 1; i >= 0; i = m_slots[i].next) { if (m_slots[i].hashCode == hashCode && m_comparer.Equals(m_slots[i].value, item)) { return true; } } } // either m_buckets is null or wasn't found return false; } private int InternalGetHashCode(T item) { if (item == null) { return 0; } return m_comparer.GetHashCode(item) & Lower31BitMask; } internal struct Slot { internal int hashCode; // Lower 31 bits of hash code, -1 if unused internal T value; internal int next; // Index of next entry, -1 if last }

不必保证哈希码是唯一的，如果密钥相等则它们必须相等。

现在发生的事情是项目存储在存储桶中。如果查询Dictionary包含给定项或HashSet给定项，它将首先计算哈希码以获取正确的桶。

接下来，它将遍历存储桶中的所有项目并对其执行.Equals测试。只有在其中一个匹配的情况下，它才会返回true 。

换句话说，尽管实例不同，但允许一个为每个实例返回相同的哈希码 。它只会使散列效率低下。

C＃因此存储一个Dictionary如：

 +----------+ | 22008501 |-----------| +----------+ | 11155414 | (other bucket) +----------+

在左侧（可能的桶标签），虽然对于小型Dictionary ，桶的数量将非常小，并且将对哈希执行操作（例如模数），以使结果的数量更小。

现在，如果您查询car2是否在Dictionary ，它将计算哈希值，从而获取第一个桶。然后它会迭代，并对car1与car2 ，next car3 vs car2进行相等检查，它将到达桶的末尾并返回false 。这是因为默认的Equals操作是引用相等。只有当你override它时（例如所有汽车都是相同的，你可以返回true ）。

正如您所注意到的， car1.Equals(car2)不是真的。 Dictionary和Hashset成员资格仅适用于相同的对象。这意味着.Equals()返回true。仅在首次发现它们的哈希码相等时才进行测试。

使用Dictionary和HashSet的GetHashCode方法

区分x：Wpf应用程序中的名称和名称

应该何时将任务视为“长时间运行”？

如何检查应用程序是否可以访问目录？

在ASP.NET中加载嵌套的UserControl

在Windows中哪里可以找到默认的Winforms图标？

在XmlSerializer生成的对象中维护xml层次结构（即父子）信息

异步/等待模式会对次要工作单元造成性能损失吗？

ASP.NET Web窗体和MySqlentity framework：“不支持嵌套事务”

在选项卡控件中的选项卡页面之间淡入淡出

ldstr在内部实现newobj吗？