C++:哪个更快 - 在 hashmap 中查找还是在 switch 语句中查找?
我有一种代码模式,可以将一个整数转换为另一个整数。就这样:
int t(int value) {
switch (value) {
case 1: return const_1;
case 3: return const_2;
case 4: return const_3;
case 8: return const_4;
default: return 0;
}
}
目前大概有50条左右,以后可能会更多,但大概不会超过一百两条。所有值都是预定义的,当然我可以按它们的值对案例标签进行排序。所以问题是,什么会更快 - 这种方法或将其放入哈希映射(我无法访问 std::map,所以我正在谈论我的 SDK 中可用的自定义哈希映射)并在该表中执行查找?也许这有点过早的优化,但......但我只需要你的意见。
提前致谢。
编辑:我的情况值将在 0 到 0xffff 的范围内。以及关于哈希图更好的可读性的观点。我不确定它是否真的具有更好的可读性,因为我仍然需要用值填充它,因此常量映射表仍然需要位于我的代码中的某个位置。
EDIT-2:已经给出了许多有用的答案,非常感谢。我想在这里添加一些信息。我的哈希键是整数,而我的整数哈希函数基本上只是一个带有整数溢出的乘法:
EXPORT_C __NAKED__ unsigned int DefaultHash::Integer(const int& /*aInt*/)
{
_asm mov edx, [esp+4]
_asm mov eax, 9E3779B9h
_asm mul dword ptr [edx]
_asm ret
}
所以它应该相当快。
I have a code pattern which translates one integer to another. Just like this:
int t(int value) {
switch (value) {
case 1: return const_1;
case 3: return const_2;
case 4: return const_3;
case 8: return const_4;
default: return 0;
}
}
It has about 50 entries currently, maybe later on there will be some more, but probably no more than hundred or two. All the values are predefined, and of course I can order case labels by their values. So the question is, what will be faster - this approach or put this into hash map (I have no access to std::map, so I'm speaking about custom hash map available in my SDK) and perform lookups in that table? Maybe it's a bit of premature optimization, though... But I just need your opinions.
Thanks in advance.
EDIT: My case values are going to be in range from 0 to 0xffff. And regarding the point of better readability of hash map. I'm not sure it really will have better readability, because I still need to populate it with values, so that sheet of constants mapping is still needs to be somewhere in my code.
EDIT-2: Many useful answers were already given, much thanks. I'd like to add some info here. My hash key is integer, and my hash function for integer is basically just one multiplication with integral overflow:
EXPORT_C __NAKED__ unsigned int DefaultHash::Integer(const int& /*aInt*/)
{
_asm mov edx, [esp+4]
_asm mov eax, 9E3779B9h
_asm mul dword ptr [edx]
_asm ret
}
So it should be quite fast.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(9)
switch
构造更快(或者至少不慢)。这主要是因为
switch
构造向编译器提供静态数据,而像哈希映射这样的运行时结构则不然。如果可能,编译器应将
switch
结构编译为代码指针数组:数组的每一项(由索引索引)都指向关联的代码。在运行时,这需要 O(1),而哈希映射可能需要更多:平均情况下为 O(log n) 或最坏情况下为 O(n),通常,并且无论如何都会有更大的恒定数量的内存访问。A
switch
construct is faster (or at least not slower).That's mostly because a
switch
construct gives static data to the compiler, while a runtime structure like a hash map doesn't.When possible compilers should compile
switch
constructs into array of code pointers: each item of the array (indexed by your indexes) points to the associated code. At runtime this takes O(1), while a hash map could take more: O(log n) at average case or O(n) at worst case, usually, and anyway a bigger constant number of memory accesses.我将添加我的 5 美分:
对于大约 50 个 std::unordered_map (基于哈希,O(1))的条目数,通常比 std::map (基于树的 O(ln(N)))慢,并且两者都其中的速度比 boost::flat_map(sorted vector O(ln(N))) 慢,我倾向于在这种情况下使用它。
switch 并不总是可以编译为跳转表,当它是的时候,您可以简单地将您的值(或函数)自己放入向量中并通过索引访问。否则 switch 比 boost::flat_map 稍微快一些。
请注意开头的“通常”一词,如果您确实关心这段代码的性能,请进行分析(并与我们分享结果:))。
I will add my 5 cents:
For the number of entries at about 50 std::unordered_map (hash based, O(1)) is typically slower then std::map (tree based O(ln(N))), and both of them are slower then boost::flat_map(sorted vector O(ln(N))) which I tend to use in such cases.
It is not always the case that switch can be compiled to jump table, and when it is, you can simply put your values (or functions) in vector yourself and access by index. Otherwise switch is marginally faster than boost::flat_map.
Please note the word "typically" in the beginning, if you do care about performance of this piece of code do the profiling (and share results with us :)).
switch 语句
将比在哈希映射中查找更快。但是,如果您更改映射,映射将产生更具可读性的代码。您可以通过从文件中读取结果来轻松地使用地图来完成此操作。在 switch 语句中,您必须更改代码并重新编译。
A
switch statement
is going to be quicker than a look up in a hash map.However, a map is going to result in much more readable code if you ever change the mappings. You can easily do this with a map by reading the results in from a file. In a switch statement you'd have to change the code and recompile.
切换会更快。如果情况很少,如您的示例所示,它将使用 if 链。如果情况很多,并且它们相当紧凑,则可以选择生成跳转表,该跳转表只需要几条指令。 (顺便说一句,您不必对案例进行排序。)哈希映射的复杂度为 O(1),但可能会采用 10-40 条指令。
The switch will be faster. If it's a small number of cases, as in your example, it will use an if-chain. If a large number of cases, and if they are reasonably compact, it has the option to generate a jump-table, which only takes a few instructions. (BTW you don't have to order the cases.) The hash-map is O(1), but will probably take in the range of 10-40 instructions.
根据定义,数组将具有最快的访问时间。
switch 语句比较值,然后使用跳转表(它是函数指针数组)。
hashmap 根据数据计算哈希值,然后搜索内存中的树或使用哈希值作为数组的索引。因为计算哈希值所以慢。
在大多数现代平台上,64k 并不是一个很大的数据量,可以作为常量数组进行静态分配。
数组技术的一个问题是考虑了您没有考虑到的键。一个示例是使用唯一的哨兵值。当返回值时,您就知道您有一个未知的密钥。
我建议使用
static const
值数组。An array will have the fastest access time, by definition.
The switch statement compares values, then uses a jump table (which is an array of function pointers).
The hashmap computes a hash value from the data, then either searches a tree in memory or uses the hash value as an index into an array. Slow because of computing the hash value.
On most modern platforms, 64k, is not a big amount of data and can be statically allocated as a constant array.
One problem with the array technique is account for keys that you have not accounted for. One example is to use a unique sentinel value. When the value is returned, you know you have an unknown key.
I suggest using a
static const
array of values.哈希映射的速度取决于两个因素:哈希函数的速度和冲突的数量。当提前知道所有值时,可以创建一个完美的哈希函数,该函数具有没有碰撞。如果您可以生成一个仅包含几个算术运算的完美哈希函数,那么它可能比交换机更快。
The speed of a hash map will depend on two things: the speed of the hash function, and the number of collisions. When all of the values are known ahead of time, it's possible to create a perfect hash function that has no collisions. If you can generate a perfect hash function that only consists of a couple of arithmetic operations, it will potentially be faster than the switch.
我同意使用数组,但我没有投票支持它的声誉。它只有 65536 个条目,因此除非您受到严重的内存限制和/或返回非常大的内容,否则使用静态 const 数组会更好,而不是像您的示例那样 int 。拥有 64k int 数组通常只有 256kB,如果可以使用 Short 或 char,则大小将是该大小的一半或 1/4。我认为您对 switch 语句所能期望的最好结果是为代码指针数组之外的值提供一个条件分支,并为数组内的值跳转到第二个条件分支。能够执行“return my_array[value]”只会导致内存获取(可能来自 l3 缓存)。
为了便于阅读,您可以将数组粘贴在其自己的文件中,并将所有值排列在网格中,每行有 10 或 16 个条目。然后,用每个条目编号的第一部分注释每一行(例如“// 0x12A?”),并定期添加注释行,这些注释行将与列对齐以填充条目编号的最后一位数字(例如“// 0 1 2 3 4 5 6 7 8 9 ABCDE F")。我已经对多个包含 256 个条目的数组执行了此操作,这比 switch 语句更容易管理。我还使用了具有 64k 条目的数组来实现快速整数对数,这使得管理变得更加复杂,但我能够编写一个程序来生成所有数组代码。
对于这么大的东西,代码管理可能不会更容易,直到您处理更多条目,但这取决于您的编辑器和技能。维护这样的数组只是调整图表中的一个点,而不是寻找可能或可能不在“case 1:return const_1;”的长列表中的值。几个 for 循环应该足以生成 64k 条目的数组,这些条目已正确注释并填充了默认值。
为了访问安全,您可以考虑使用某种边界检查。这可以通过 boost 的前提条件来完成,如果数字超出范围,则抛出异常或特殊返回,或者简单的“return my_array[value&0xffff]”。然而,您可能对您的传入价值有足够强有力的保证,因此您不需要任何它。
I agree with using an array, but I don't have the reputation to vote for it. It's only 65536 entries, so unless you're under serious memory constraints and/or you're returning something very large, instead of int like your example, you will be much better off with a static const array. Having an array of 64k int's is generally only 256kB, and it would be half or 1/4 that size if you can use a short or char. I think the best you can hope for with a switch statement is a conditional branch for values outside of it's array of code pointers and a second conditional branch to jump to the code for a value inside the array. Being able to just execute "return my_array[value]" will just result in a memory fetch (possibly from l3 cache).
For readability, you can stick the array in its own file and line up all the values in a grid with something like 10 or 16 entries per line. Then you comment each line with the first part of each entry number (e.g. "// 0x12A?" ), and have periodic comment lines that would line up with the columns to fill in the last digit for the entry number (e.g. "// 0 1 2 3 4 5 6 7 8 9 A B C D E F"). I've done this for several arrays of 256 entries, which has been much easier to manage than a switch statement. I've also used arrays with 64k entries for fast integer logarithms, which get more complicated to manage, but I was able to write a program to generate all the array code.
With something that large, code management may not be easier until you're dealing with more entries, but it depends on your editor and skill with it. Maintaining such an array is just adjusting a spot in a chart instead of hunting down values that may or may not be in a long list of "case 1: return const_1;". A couple of for loops should suffice to generate an array of 64k entries, that are properly commented and filled with default values.
For access safety, you might consider using some sort of bounds checking. This could be done with boost's preconditions, throwing an exception or special return if the number is out of bounds, or a simple "return my_array[value&0xffff]". However, you might have a strong enough guarantee on your incoming value that you don't need any of it.
我认为哪个更快并不明显。您可能需要分析这两种方法。
哈希映射的复杂度应该为 O(1)。
开关(具有像您这样的非连续键)可以优化为二分搜索(至少使用 GCC),其复杂度为 O(log n)。
另一方面,在哈希映射上完成的任何操作都将比在交换机上完成的操作昂贵得多。
I think it is not obvious which is going to be faster. You might need to profile both approaches.
The hash map should have complexity of O(1).
The switch (with non-contiguous keys like yours) may be optimized into a binary search (at least with GCC), which has complexity of O(log n).
On the other hand, any operation done on a hash map will be much more expensive than an operation done in a switch.
在不考虑碰撞的情况下,哈希表的时间复杂度一般为O(1)。
C++ 标准没有指定 switch 是如何实现的,但它可以实现为跳转表,时间复杂度为 O(1),也可以实现为二分查找,时间复杂度为 O(log n) 或组合,具体取决于有多少 case 语句等。
总之,像你的情况这样的小规模,switch 更快,但哈希表可能会在大规模中胜出
Hash table time complexity is generally O(1) when don't considering collision.
C++ standard doesn't specified how switch is implemented but it can be implemented as jump-table which time complexity is O(1) too or it can be implemented as binary search which time complexity is O(log n) or a combination depending on how many case statement etc.
So in a word, small scale like your case, switch is faster, but hash table might win in large scale