Overriding GetHashCode in VB without checked/unchecked keyword support?

Posted on 2024-10-11 12:17:42

So I'm trying to figure out how to correctly override GetHashCode() in VB for a large number of custom objects. A bit of searching leads me to this wonderful answer.

Except there's one problem: VB lacks both the checked and unchecked keywords in .NET 4.0. As far as I can tell, anyway. So using Jon Skeet's implementation, I tried creating such an override on a rather simple class that has three main members: Name As String, Value As Int32, and [Type] As System.Type. Thus I come up with:

Public Overrides Function GetHashCode() As Int32
    Dim hash As Int32 = 17

    hash = hash * 23 + _Name.GetHashCode()
    hash = hash * 23 + _Value
    hash = hash * 23 + _Type.GetHashCode()
    Return hash
End Function

Problem: Int32 is too small for even a simple object such as this. In the particular instance I tested, "Name" was a simple 5-character string, and its hash alone was close enough to Int32's upper limit that the calculation overflowed as soon as it tried to fold in the second field (Value). Because I can't find a VB equivalent for granular checked/unchecked support, I can't work around this.

I also do not want to remove Integer overflow checks across the entire project. This thing is maybe....40% complete (I made that up, TBH), and I have a lot more code to write, so I need these overflow checks in place for quite some time.

What would be the "safe" version of Jon's GetHashCode version for VB and Int32? Or, does .NET 4.0 have checked/unchecked in it somewhere that I'm not finding very easily on MSDN?

EDIT:
Per the linked SO question, one of the unloved answers at the very bottom provided a quasi-solution. I say quasi because it feels like it's....cheating. Beggars can't be choosers, though, right?

Translated from C# into more readable VB and aligned to the object described above (Name, Value, Type), we get:

Public Overrides Function GetHashCode() As Int32
    Return New With { _
        Key .A = _Name, _
        Key .B = _Value, _
        Key .C = _Type
     }.GetHashCode()
End Function

This apparently triggers the compiler to "cheat" by generating an anonymous type, which it compiles outside of the project namespace, presumably with integer overflow checks disabled, so the math simply wraps around when it overflows. It also seems to involve box opcodes, which I know to be performance hits. No unboxing, though.

But this raises an interesting question. Countless times, I've seen it stated here and elsewhere that both VB and C# generate the same IL code. This is clearly not the case 100% of the time...Like the use of C#'s unchecked keyword simply causes a different opcode to get emitted. So why do I continue to see the assumption that both produce the exact same IL keep getting repeated?  </rhetorical-question>

Anyways, I'd rather find a solution that can be implemented within each object module. Having to create Anonymous Types for every single one of my objects is going to look messy from an ILDASM perspective. I'm not kidding when I say I have a lot of classes implemented in my project.

EDIT2: I did open up a bug on MSFT Connect, and the gist of the outcome from the VB PM was that they'll consider it, but don't hold your breath:
https://connect.microsoft.com/VisualStudio/feedback/details/636564/checked-unchecked-keywords-in-visual-basic

A quick look at the changes in .NET 4.5 suggests they've not considered it yet, so maybe .NET 5?

My final implementation, which fits the constraints of GetHashCode, while still being fast and unique enough for VB is below, derived from the "Rotating Hash" example on this page:

'// The only sane way to do hashing in VB.NET because it lacks the
'// checked/unchecked keywords that C# has.
Public Const HASH_PRIME1 As Int32 = 4
Public Const HASH_PRIME2 As Int32 = 28
Public Const INT32_MASK As Int32 = &HFFFFFFFF

Public Function RotateHash(ByVal hash As Int64, ByVal hashcode As Int32) As Int64
    Return ((hash << HASH_PRIME1) Xor (hash >> HASH_PRIME2) Xor hashcode)
End Function

I also think the "Shift-Add-XOR" hash may also apply, but I haven't tested it.
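A minimal sketch of how the rotating hash above might be applied to the three members from the question. The post does not show how the Int64 result is narrowed back to Int32, so the Combine and FoldToInt32 helpers, the module name, and the choice to keep only the low 31 bits are assumptions made for illustration.

```vb
Imports System

Module RotateHashSketch
    Private Const SHIFT_LEFT As Integer = 4
    Private Const SHIFT_RIGHT As Integer = 28

    ' Same rotate-and-xor step as in the post. Shifts on Int64 discard
    ' bits instead of throwing, so this never raises OverflowException.
    Public Function RotateHash(ByVal hash As Int64, ByVal hashcode As Int32) As Int64
        Return (hash << SHIFT_LEFT) Xor (hash >> SHIFT_RIGHT) Xor hashcode
    End Function

    ' Hypothetical helper: keep the low 31 bits so CInt cannot overflow.
    Public Function FoldToInt32(ByVal hash As Int64) As Int32
        Return CInt(hash And &H7FFFFFFFL)
    End Function

    ' Hypothetical helper: combine the question's three members.
    Public Function Combine(ByVal name As String, ByVal value As Int32, ByVal kind As Type) As Int32
        Dim hash As Int64 = 17
        hash = RotateHash(hash, name.GetHashCode())
        hash = RotateHash(hash, value)
        hash = RotateHash(hash, kind.GetHashCode())
        Return FoldToInt32(hash)
    End Function
End Module
```

A GetHashCode override could then return RotateHashSketch.Combine(_Name, _Value, _Type). The 31-bit mask means the result is never negative, which costs one bit of the hash range.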


Comments (7)

网白 2024-10-18 12:17:42

Use Long to avoid the overflow:

Dim hash As Long = 17
'' etc..
Return CInt(hash And &H7fffffffL)

The And operator ensures that no overflow exception is thrown. It does, however, lose one bit of "precision" in the computed hash code, because the result is always positive. VB.NET has no built-in function to avoid that, but you can use a trick:

Imports System.Runtime.InteropServices

Module NoOverflows
    Public Function LongToInteger(ByVal value As Long) As Integer
        Dim cast As Caster
        cast.LongValue = value
        Return cast.IntValue
    End Function

    <StructLayout(LayoutKind.Explicit)> _
    Private Structure Caster
        <FieldOffset(0)> Public LongValue As Long
        <FieldOffset(0)> Public IntValue As Integer
    End Structure
End Module

Now you can write:

Dim hash As Long = 17
'' etc..
Return NoOverflows.LongToInteger(hash)
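
As a rough illustration of the first variant, here is how the Long accumulator and the And mask might look when written out in full for the class from the question. The class name NamedValue and its constructor are hypothetical. With only three fields the Long accumulator stays far below its own limit, so the only risky step is the final narrowing.

```vb
Imports System

' Hypothetical class modeled on the question's three members.
Public Class NamedValue
    Private ReadOnly _Name As String
    Private ReadOnly _Value As Int32
    Private ReadOnly _Type As Type

    Public Sub New(ByVal name As String, ByVal value As Int32, ByVal valueType As Type)
        _Name = name
        _Value = value
        _Type = valueType
    End Sub

    Public Overrides Function GetHashCode() As Int32
        ' Accumulate in 64 bits so the multiplications cannot overflow Int32.
        Dim hash As Long = 17
        hash = hash * 23 + _Name.GetHashCode()
        hash = hash * 23 + _Value
        hash = hash * 23 + _Type.GetHashCode()
        ' Keep the low 31 bits; the result is always in 0..Int32.MaxValue.
        Return CInt(hash And &H7FFFFFFFL)
    End Function
End Class
```

A real class would also override Equals so that equal objects report equal hash codes.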
儭儭莪哋寶赑 2024-10-18 12:17:42

Here is an implementation combining Hans Passant's answer and Jon Skeet's answer.

It works even for millions of properties (i.e. no integer overflow exceptions) and is very fast (less than 20 ms to generate hash code for a class with 1,000,000 fields and barely measurable for a class with only 100 fields).

Here is the structure to handle the overflows:

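' Requires Imports System.Runtime.InteropServices at the top of the file.
<StructLayout(LayoutKind.Explicit)>
Private Structure HashCodeNoOverflow
    ' Both fields share offset 0, so Int32 reads the low 32 bits of Int64 on little-endian platforms.
    <FieldOffset(0)> Public Int64 As Int64
    <FieldOffset(0)> Public Int32 As Int32
End Structure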

And a simple GetHashCode function:

Public Overrides Function GetHashCode() As Integer

    Dim hashCode As HashCodeNoOverflow

    hashCode.Int64 = 17

    hashCode.Int64 = CLng(hashCode.Int32) * 23 + Field1.GetHashCode
    hashCode.Int64 = CLng(hashCode.Int32) * 23 + Field2.GetHashCode
    hashCode.Int64 = CLng(hashCode.Int32) * 23 + Field3.GetHashCode

    Return hashCode.Int32

End Function

Or if you prefer:

Public Overrides Function GetHashCode() As Integer

    Dim hashCode = New HashCodeNoOverflow With {.Int32 = 17}

    For Each field In Fields
        hashCode.Int64 = CLng(hashCode.Int32) * 23 + field.GetHashCode
    Next

    Return hashCode.Int32

End Function
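
To make the overlapping-field trick more concrete, the sketch below shows what reading the Int32 view of an Int64 yields. The module, structure, and function names are hypothetical, and the stated results assume a little-endian platform (where BitConverter.IsLittleEndian is True); on a big-endian platform the Int32 field would overlay the high half instead.

```vb
Imports System
Imports System.Runtime.InteropServices

Module UnionDemo
    ' Two fields at the same offset share the same memory.
    <StructLayout(LayoutKind.Explicit)>
    Private Structure Int64Int32Union
        <FieldOffset(0)> Public Wide As Int64
        <FieldOffset(0)> Public Low As Int32
    End Structure

    ' Returns the 32 bits stored at offset 0 of the 64-bit value,
    ' without any checked conversion taking place.
    Public Function LowBits(ByVal value As Int64) As Int32
        Dim u As Int64Int32Union
        u.Wide = value
        Return u.Low
    End Function
End Module
```

On a little-endian machine, LowBits(&H100000005L) gives 5 and LowBits(&HFFFFFFFFL) gives -1, which is the same wraparound an unchecked narrowing conversion would produce.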
合久必婚 2024-10-18 12:17:42

I had the same problem implementing Mr. Skeet's solution in VB.NET. I ended up using the Mod operator to get there. Taking the running total Mod Integer.MaxValue after each step keeps it strictly between Integer.MinValue and Integer.MaxValue, so the final conversion to Int32 cannot overflow. The resulting values differ from what C#'s unchecked wraparound would produce, but that does not matter for a hash code. You probably don't have to apply Mod as often as I do: it is only needed when the Long accumulator itself could overflow (which would mean combining a LOT of hash codes) and once at the end. A variant of this works for me, and it lets you experiment with much bigger primes, as some of the other hash functions do, without worrying.

Public Overrides Function GetHashCode() As Int32
    Dim hash as Int64 = 17
    hash = (hash * 23 + _Name.GetHashCode()) Mod Integer.MaxValue
    hash = (hash * 23 + _Value) Mod Integer.MaxValue
    hash = (hash * 23 + _Type.GetHashCode()) Mod Integer.MaxValue
    Return Convert.ToInt32(hash)
End Function
笔芯 2024-10-18 12:17:42

You can implement a suitable hash code helper in a separate assembly, either by writing it in C# with the unchecked keyword or by turning overflow checking off for that entire project (possible in both VB.NET and C# projects). If you want to, you can then use ILMerge to merge this assembly into your main assembly.
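
A sketch of what such a helper might look like if it is written in VB. The module name HashHelper and the function name Combine are hypothetical. The important assumption is that this file lives in its own small project that has integer overflow checks removed; if it is compiled with the checks on, large inputs will still raise OverflowException.

```vb
Imports System

' ASSUMPTION: compiled into a separate assembly with integer overflow
' checks removed, so the Int32 arithmetic below wraps instead of throwing.
Public Module HashHelper
    ' Combines any number of hash codes with the 17/23 scheme from the question.
    Public Function Combine(ByVal ParamArray hashCodes() As Integer) As Integer
        Dim hash As Integer = 17
        For Each code As Integer In hashCodes
            hash = hash * 23 + code
        Next
        Return hash
    End Function
End Module
```

The main project keeps its overflow checks and simply calls HashHelper.Combine(_Name.GetHashCode(), _Value, _Type.GetHashCode()) from each GetHashCode override.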

稀香 2024-10-18 12:17:42

An improved version of the answer to "Overriding GetHashCode in VB without checked/unchecked keyword support?":

Public Overrides Function GetHashCode() As Integer
  Dim hashCode As Long = 0
  If myField IsNot Nothing Then _
    hashCode = ((hashCode * 397) Xor myField.GetHashCode()) And &HFFFFFFFFL
  If myOtherField IsNot Nothing Then _
    hashCode = ((hashCode * 397) Xor myOtherField.GetHashCode()) And &HFFFFFFFFL
  ' Drop bit 31 before narrowing: CInt throws OverflowException for values above Integer.MaxValue.
  Return CInt(hashCode And &H7FFFFFFFL)
End Function

The value is trimmed to 32 bits after each multiplication. The mask literal is explicitly typed as Long (the L suffix) because an Integer literal &HFFFFFFFF is -1, which sign-extends to all ones when widened to Long, so the And operator would not zero the upper bytes.

污味仙女 2024-10-18 12:17:42

After researching that VB had not given us anything like unchecked and raging for a bit (c# dev now doing vb), I implemented a solution close to the one Hans Passant posted. I failed at it. Terrible performance. This was certainly due to my implementation and not the solution Hans posted. I could have gone back and more closely copied his solution.

However, I solved the problem with a different solution. A post complaining about the lack of unchecked on the VB language feature requests page gave me the idea to use a hash algorithm already in the framework. In my problem, I had a String and a Guid that I wanted to use as a dictionary key. I decided a Tuple(Of Guid, String) would be a fine internal data store.

Original Bad Version

Public Structure HypnoKey
  Public Sub New(name As String, areaId As Guid)
    _name = name
    _areaId = areaId
  End Sub

  Private ReadOnly _name As String
  Private ReadOnly _areaId As Guid

  Public ReadOnly Property Name As String
    Get
      Return _name 
    End Get
  End Property

  Public ReadOnly Property AreaId As Guid
    Get
      Return _areaId 
    End Get
  End Property

  Public Overrides Function GetHashCode() As Integer
    'OMFG SO BAD
    'TODO Fail less hard
  End Function

End Structure

Much Improved Version

Public Structure HypnoKey
  Public Sub New(name As String, areaId As Guid)
    _innerKey = New Tuple(Of Guid, String)(areaId, name)
  End Sub

  Private ReadOnly _innerKey As Tuple(Of Guid, String)

  Public ReadOnly Property Name As String
    Get
      Return _innerKey.Item2
    End Get
  End Property

  Public ReadOnly Property AreaId As Guid
    Get
      Return _innerKey.Item1
    End Get
  End Property

  Public Overrides Function GetHashCode() As Integer
    Return _innerKey.GetHashCode() 'wow! such fast (enuf)
  End Function

End Structure

So, while I expect there are far better solutions than this, I am pretty happy. My performance is good. Also, the nasty utility code is gone. Hopefully this is useful to some other poor dev forced to write VB who comes across this post.

Cheers
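
The same idea could be applied to the class from the original question without storing a tuple as a field. The sketch below builds a temporary tuple inside GetHashCode; the class name TupleHashedItem and its constructor are hypothetical. It relies only on Tuple.Create and the tuple's own GetHashCode, both available since .NET 4.0, and it allocates one small object per call.

```vb
Imports System

' Hypothetical class with the question's three members.
Public Class TupleHashedItem
    Private ReadOnly _Name As String
    Private ReadOnly _Value As Int32
    Private ReadOnly _Type As Type

    Public Sub New(ByVal name As String, ByVal value As Int32, ByVal valueType As Type)
        _Name = name
        _Value = value
        _Type = valueType
    End Sub

    Public Overrides Function GetHashCode() As Int32
        ' The framework combines the three member hashes internally,
        ' so no arithmetic runs under this project's overflow checks.
        Return Tuple.Create(_Name, _Value, _Type).GetHashCode()
    End Function
End Class
```

This trades a little allocation for not having to maintain any hashing arithmetic in VB at all.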

白鸥掠海 2024-10-18 12:17:42

I've also found that the RemoveIntegerChecks MSBuild property maps to the VB compiler's /removeintchecks option, which prevents the compiler from emitting runtime overflow checks:

  <PropertyGroup>
    <RemoveIntegerChecks>true</RemoveIntegerChecks>   
  </PropertyGroup>