So I'm trying to figure out how to correctly override GetHashCode() in VB for a large number of custom objects. A bit of searching led me to this wonderful answer.
Except there's one problem: VB lacks both the checked and unchecked keywords in .NET 4.0, as far as I can tell anyway. So using Jon Skeet's implementation, I tried creating such an override on a rather simple class that has three main members: Name As String, Value As Int32, and [Type] As System.Type. Thus I came up with:
Public Overrides Function GetHashCode() As Int32
    Dim hash As Int32 = 17
    hash = hash * 23 + _Name.GetHashCode()
    hash = hash * 23 + _Value
    hash = hash * 23 + _Type.GetHashCode()
    Return hash
End Function
Problem: Int32 is too small for even a simple object such as this. The particular instance I tested has "Name" as a simple 5-character string, and that hash alone was close enough to Int32's upper limit that the calculation overflowed as soon as it tried to fold in the second field (Value). With VB's integer overflow checks enabled, that overflow is raised as an exception instead of wrapping around. Because I can't find a VB equivalent for granular checked/unchecked support, I can't work around this.
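To make the failure concrete, here is a minimal sketch of the same arithmetic. The value 2000000000 is a made-up stand-in for a large _Name.GetHashCode() result, and the module name is hypothetical. It assumes the project default of overflow checks being enabled.

```vb
Module OverflowDemo
    Sub Main()
        ' Hypothetical stand-in for _Name.GetHashCode(): any large Int32 value.
        Dim nameHash As Int32 = 2000000000

        ' Widened to Int64, the second step needs far more than 32 bits.
        Dim wide As Int64 = (17L * 23L + nameHash) * 23L
        Console.WriteLine(wide > Int32.MaxValue)

        Try
            Dim hash As Int32 = 17
            hash = hash * 23 + nameHash   ' 391 + 2,000,000,000 still fits in Int32
            hash = hash * 23              ' exceeds Int32.MaxValue, so the checked multiply throws
            Console.WriteLine(hash)
        Catch ex As OverflowException
            Console.WriteLine("OverflowException")
        End Try
    End Sub
End Module
```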
I also do not want to remove Integer overflow checks across the entire project. This thing is maybe....40% complete (I made that up, TBH), and I have a lot more code to write, so I need these overflow checks in place for quite some time.
What would be the "safe" version of Jon's GetHashCode for VB and Int32? Or, does .NET 4.0 have checked/unchecked in it somewhere that I'm not finding very easily on MSDN?
EDIT:
Per the linked SO question, one of the unloved answers at the very bottom provided a quasi-solution. I say quasi because it feels like it's....cheating. Beggars can't be choosers, though, right?
Translated from C# into a more readable VB and aligned to the object described above (Name, Value, Type), we get:
Public Overrides Function GetHashCode() As Int32
    Return New With { _
        Key .A = _Name, _
        Key .B = _Value, _
        Key .C = _Type
    }.GetHashCode()
End Function
This apparently triggers the compiler to "cheat" by generating an anonymous type, which it then compiles outside of the project namespace, presumably with integer overflow checks disabled, so the math can take place and simply wrap around when it overflows. It also seems to involve box opcodes, which I know to be performance hits. No unboxing, though.
But this raises an interesting question. Countless times, I've seen it stated here and elsewhere that both VB and C# generate the same IL code. This is clearly not the case 100% of the time. For example, using C#'s unchecked keyword simply causes a different opcode to get emitted. So why do I keep seeing the assumption that both produce the exact same IL repeated? </rhetorical-question>
Anyways, I'd rather find a solution that can be implemented within each object module. Having to create Anonymous Types for every single one of my objects is going to look messy from an ILDASM perspective. I'm not kidding when I say I have a lot of classes implemented in my project.
EDIT2: I did open up a bug on MSFT Connect, and the gist of the outcome from the VB PM was that they'll consider it, but don't hold your breath:
https://connect.microsoft.com/VisualStudio/feedback/details/636564/checked-unchecked-keywords-in-visual-basic
A quick look at the changes in .NET 4.5 suggests they've not considered it yet, so maybe .NET 5?
My final implementation, which fits the constraints of GetHashCode while still being fast and unique enough for VB, is below. It is derived from the "Rotating Hash" example on this page:
'// The only sane way to do hashing in VB.NET because it lacks the
'// checked/unchecked keywords that C# has.
'// Despite their names, HASH_PRIME1 and HASH_PRIME2 are the left and right shift counts.
Public Const HASH_PRIME1 As Int32 = 4
Public Const HASH_PRIME2 As Int32 = 28
'// Not used by RotateHash below. As an Int32 literal, &HFFFFFFFF is -1.
Public Const INT32_MASK As Int32 = &HFFFFFFFF

Public Function RotateHash(ByVal hash As Int64, ByVal hashcode As Int32) As Int64
    Return ((hash << HASH_PRIME1) Xor (hash >> HASH_PRIME2) Xor hashcode)
End Function
I think the "Shift-Add-XOR" hash may also apply, but I haven't tested it.
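For illustration, here is a sketch of how a rotating hash like this might be wired into a GetHashCode override. The Sample class, its field values, the constant names and the FoldToInt32 helper are all assumptions that are not part of the original code. The fold step is just one possible way to narrow the Int64 result to Int32 without tripping the overflow check.

```vb
Public Module RotatingHashSketch
    Private Const SHIFT_LEFT As Int32 = 4
    Private Const SHIFT_RIGHT As Int32 = 28

    ' Rotate-and-xor step. Shifts and Xor never raise OverflowException in VB.
    Public Function RotateHash(ByVal hash As Int64, ByVal hashcode As Int32) As Int64
        Return (hash << SHIFT_LEFT) Xor (hash >> SHIFT_RIGHT) Xor hashcode
    End Function

    ' Hypothetical helper: keep the low 32 bits and reinterpret them as a signed Int32.
    Public Function FoldToInt32(ByVal hash As Int64) As Int32
        Dim low As Int64 = hash And &HFFFFFFFFL          ' 0 .. 4294967295
        If low > Int32.MaxValue Then low -= &H100000000L ' map the upper half to negatives
        Return CInt(low)                                 ' now always within Int32 range
    End Function
End Module

Public Class Sample
    ' Hypothetical field values, used only so the sketch is self-contained.
    Private _Name As String = "hello"
    Private _Value As Int32 = 42
    Private _Type As System.Type = GetType(String)

    Public Overrides Function GetHashCode() As Int32
        Dim hash As Int64 = 17
        hash = RotateHash(hash, _Name.GetHashCode())
        hash = RotateHash(hash, _Value)
        hash = RotateHash(hash, _Type.GetHashCode())
        Return FoldToInt32(hash)
    End Function
End Class
```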
Comments (7)
Use Long to avoid the overflow:
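A minimal sketch of the idea, assuming the three-member class from the question. The class name and field values are hypothetical. The accumulator is a 64-bit Long, which three multiply-add steps cannot overflow, and the final And keeps only the low 31 bits.

```vb
Public Class Sample
    ' Hypothetical field values, used only so the sketch is self-contained.
    Private _Name As String = "hello"
    Private _Value As Int32 = 42
    Private _Type As System.Type = GetType(String)

    Public Overrides Function GetHashCode() As Int32
        ' Accumulate in a Long; with only three fields the total stays far below Long.MaxValue.
        Dim hash As Long = 17
        hash = hash * 23 + _Name.GetHashCode()
        hash = hash * 23 + _Value
        hash = hash * 23 + _Type.GetHashCode()
        ' Masking with a Long literal leaves a value in 0 .. Integer.MaxValue, so CInt cannot throw.
        Return CInt(hash And &H7FFFFFFFL)
    End Function
End Class
```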
The And operator ensures no overflow exception is thrown. This however does lose one bit of "precision" in the computed hash code, the result is always positive. VB.NET has no built-in function to avoid it, but you can use a trick:
Now you can write:
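One way such a trick could look is an explicit-layout structure that overlays an Integer on the low bytes of a Long, so the narrowing happens without any checked conversion. This is a sketch, not necessarily the answer's exact code. The names NoOverflows, Caster, LongToInteger and Sample are assumptions, and reading the low 32 bits this way assumes a little-endian platform.

```vb
Imports System.Runtime.InteropServices

Public Module NoOverflows
    ' Both fields start at offset 0, so IntValue aliases the first four bytes of LongValue.
    <StructLayout(LayoutKind.Explicit)> _
    Private Structure Caster
        <FieldOffset(0)> Public LongValue As Long
        <FieldOffset(0)> Public IntValue As Integer
    End Structure

    ' Reinterprets the low 32 bits of value as a signed Integer (little-endian assumption).
    Public Function LongToInteger(ByVal value As Long) As Integer
        Dim cast As Caster
        cast.LongValue = value
        Return cast.IntValue
    End Function
End Module

Public Class Sample
    ' Hypothetical field values, used only so the sketch is self-contained.
    Private _Name As String = "hello"
    Private _Value As Int32 = 42
    Private _Type As System.Type = GetType(String)

    Public Overrides Function GetHashCode() As Int32
        Dim hash As Long = 17
        hash = hash * 23 + _Name.GetHashCode()
        hash = hash * 23 + _Value
        hash = hash * 23 + _Type.GetHashCode()
        ' All 32 low bits survive, so the result may be negative.
        Return NoOverflows.LongToInteger(hash)
    End Function
End Class
```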
Here is an implementation combining Hans Passant's answer and Jon Skeet's answer.
It works even for millions of properties (i.e. no integer overflow exceptions) and is very fast (less than 20 ms to generate hash code for a class with 1,000,000 fields and barely measurable for a class with only 100 fields).
Here is the structure to handle the overflows:
And a simple GetHashCode function:
Or if you prefer:
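As a rough sketch of how such a structure and a GetHashCode built on it might fit together: the names HashCodeNoOverflow, WideValue, NarrowValue, ManyFields and _fields below are assumptions, and the layout trick assumes a little-endian platform. Reading back only the 32-bit view before each multiplication keeps the 64-bit product small no matter how many fields are combined.

```vb
Imports System.Runtime.InteropServices

Public Class ManyFields
    ' NarrowValue aliases the low 32 bits of WideValue.
    <StructLayout(LayoutKind.Explicit)> _
    Private Structure HashCodeNoOverflow
        <FieldOffset(0)> Public WideValue As Int64
        <FieldOffset(0)> Public NarrowValue As Int32
    End Structure

    ' Hypothetical stand-in for the object's fields.
    Private _fields As Object() = {"hello", 42, GetType(String)}

    Public Overrides Function GetHashCode() As Integer
        Dim h As HashCodeNoOverflow
        h.WideValue = 17
        For Each item As Object In _fields
            Dim itemHash As Integer = If(item Is Nothing, 0, item.GetHashCode())
            ' An Int32 times 23 plus an Int32 always fits in an Int64, so this never throws.
            h.WideValue = CLng(h.NarrowValue) * 23 + itemHash
        Next
        Return h.NarrowValue
    End Function
End Class
```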
I had the same problem implementing Mr. Skeet's solution in VB.NET. I ended up using the Mod operator to get there. Each Mod by Integer.MaxValue returns just the remainder up to that point, which always lies strictly between Integer.MinValue and Integer.MaxValue, so the final conversion to Integer cannot overflow. That gives much the same practical effect as unchecked, although the resulting values are not the ones true wrap-around would produce. You probably don't have to Mod as often as I do: it is only needed when there's a chance of exceeding the range of a Long (which would mean combining a LOT of hash codes), and then once at the end. A variant of this works for me, and it lets you experiment with much bigger primes, like some of the other hash functions use, without worrying.
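A minimal sketch of the Mod approach, applied to the three-member class from the question. The class name and field values are hypothetical, and this version applies Mod after every step, which is more often than strictly necessary.

```vb
Public Class Sample
    ' Hypothetical field values, used only so the sketch is self-contained.
    Private _Name As String = "hello"
    Private _Value As Int32 = 42
    Private _Type As System.Type = GetType(String)

    Public Overrides Function GetHashCode() As Int32
        Dim hash As Long = 17
        ' After each step the remainder has a magnitude below Integer.MaxValue,
        ' so the next multiply-add stays far inside the range of a Long.
        hash = (hash * 23 + _Name.GetHashCode()) Mod Integer.MaxValue
        hash = (hash * 23 + _Value) Mod Integer.MaxValue
        hash = (hash * 23 + _Type.GetHashCode()) Mod Integer.MaxValue
        ' The remainder already fits in an Integer, so this conversion cannot throw.
        Return CInt(hash)
    End Function
End Class
```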
You can implement a suitable hash code helper in a separate assembly, either using C# and the unchecked keyword or by turning overflow checking off for that entire project (possible in both VB.NET and C# projects). If you want to, you can then use ilmerge to merge this assembly into your main assembly.
An improved version of the answer to "Overriding GetHashCode in VB without checked/unchecked keyword support?":
The value is trimmed after each multiplication. The literal passed to And is explicitly typed as Long, because an Integer literal such as &HFFFFFFFF is -1 and, once widened to Long, would not zero the upper bytes.
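A sketch of what trimming after each multiplication could look like. The class name, field values and the final 31-bit mask are assumptions, since only the description of the answer is available here.

```vb
Public Class Sample
    ' Hypothetical field values, used only so the sketch is self-contained.
    Private _Name As String = "hello"
    Private _Value As Int32 = 42
    Private _Type As System.Type = GetType(String)

    Public Overrides Function GetHashCode() As Int32
        Dim hash As Long = 17
        ' The L suffix makes the mask a Long (4294967295), so the upper 32 bits are cleared.
        ' Without the suffix, &HFFFFFFFF is the Integer -1 and the And would change nothing.
        hash = (hash * 23 + _Name.GetHashCode()) And &HFFFFFFFFL
        hash = (hash * 23 + _Value) And &HFFFFFFFFL
        hash = (hash * 23 + _Type.GetHashCode()) And &HFFFFFFFFL
        ' Assumed final step: drop the top bit so the value fits in a non-negative Integer.
        Return CInt(hash And &H7FFFFFFFL)
    End Function
End Class
```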
After finding out that VB has not given us anything like unchecked, and raging for a bit (C# dev now doing VB), I implemented a solution close to the one Hans Passant posted. I failed at it: terrible performance. This was certainly due to my implementation and not the solution Hans posted. I could have gone back and copied his solution more closely.
However, I solved the problem with a different solution. A post complaining about the lack of unchecked on the VB language feature requests page gave me the idea to use a hash algorithm already in the framework. In my problem, I had a String and a Guid that I wanted to use as a dictionary key. I decided a Tuple(Of Guid, String) would be a fine internal data store.
Original Bad Version
Much Improved Version
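Neither version's code is shown here. As a sketch of the general idea only, a key class can store its parts in a Tuple(Of Guid, String) and delegate hashing and equality to it, so no overflow-prone arithmetic is written in VB at all. The CompositeKey name and its members are assumptions.

```vb
Public Class CompositeKey
    ' The framework's Tuple supplies GetHashCode and structural Equals for both parts.
    Private ReadOnly _key As Tuple(Of Guid, String)

    Public Sub New(ByVal id As Guid, ByVal name As String)
        _key = Tuple.Create(id, name)
    End Sub

    Public Overrides Function GetHashCode() As Integer
        Return _key.GetHashCode()
    End Function

    Public Overrides Function Equals(ByVal obj As Object) As Boolean
        Dim other As CompositeKey = TryCast(obj, CompositeKey)
        Return other IsNot Nothing AndAlso _key.Equals(other._key)
    End Function
End Class
```

An instance of a class like this can then be used directly as a Dictionary key.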
So, while I expect there are far better solutions than this, I am pretty happy. My performance is good. Also, the nasty utility code is gone. Hopefully this is useful to some other poor dev forced to write VB who comes across this post.
Cheers
I've also found that the RemoveIntegerChecks MSBuild property controls the /removeintchecks VB compiler option, which prevents the compiler from emitting runtime overflow checks:
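For reference, a minimal fragment of how the property might be set in a .vbproj file. Putting it in an unconditional PropertyGroup is an assumption here, and it applies to the whole project, so it fits best in a small helper project that holds only the hashing code.

```xml
<!-- Disables integer overflow checks for every file in this project. -->
<PropertyGroup>
  <RemoveIntegerChecks>true</RemoveIntegerChecks>
</PropertyGroup>
```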