Why is the size of data structures usually 2^n?
Is there a historical reason or something? I've seen quite a few times something like char foo[256]; or #define BUF_SIZE 1024. Even I mostly use only 2^n-sized buffers, mainly because I think it looks more elegant and that way I don't have to think of a specific number. But I'm not quite sure if that's the reason most people use them; more information would be appreciated.
There may be a number of reasons, although many people will, as you say, just do it out of habit.
One place where it is very useful is in the efficient implementation of circular buffers, especially on architectures where the % operator is expensive (those without a hardware divide - primarily 8-bit microcontrollers). By using a 2^n buffer in this case, the modulo is simply a case of bit-masking the upper bits, or, in the case of say a 256-byte buffer, simply using an 8-bit index and letting it wrap around.
In other cases alignment with page boundaries, caches etc. may provide opportunities for optimisation on some architectures - but that would be very architecture-specific. But it may just be that such buffers provide the compiler with optimisation possibilities, so all other things being equal, why not?
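A minimal sketch of the 256-byte case described above (my illustration, not code from the answer): the buffer size matches the range of a uint8_t, so the indices wrap around on their own and no % is ever generated.

#include <stdint.h>

static uint8_t buf[256];            /* size matches the range of an 8-bit index */
static uint8_t head, tail;          /* wrap from 255 back to 0 automatically */

void put(uint8_t byte) { buf[head++] = byte; }   /* no % and no divide needed */
uint8_t get(void)      { return buf[tail++]; }   /* full/empty checks omitted */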
Cache lines are usually some power of 2 in size (often 32 or 64 bytes). Data that is an integral multiple of that number will fit into (and fully utilize) the corresponding number of cache lines. The more data you can pack into your cache, the better the performance, so I think people who design their structures that way are optimizing for that.
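As a small illustration (assumptions mine: a 64-byte cache line and a typical ABI where this struct is 16 bytes), records sized to divide the line evenly pack into it with no waste:

#include <stdint.h>
#include <assert.h>

struct record {
    uint64_t key;      /* 8 bytes */
    uint32_t value;    /* 4 bytes */
    uint32_t flags;    /* 4 bytes: 16 bytes total, so 4 records per 64-byte line */
};

static_assert(sizeof(struct record) == 16, "keep the record cache-line friendly");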
Another reason, in addition to what everyone else has mentioned, is that SSE instructions take multiple elements, and the number of elements input is always some power of two. Making the buffer a power of two guarantees you won't be reading unallocated memory. This only applies if you're actually using SSE instructions though.
I think in the end though, the overwhelming reason in most cases is that programmers like powers of two.
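A minimal sketch of that point (my code, not the answer's): _mm_loadu_ps always consumes 4 floats, so a buffer whose length is a power of two of at least 4 (BUF_SIZE below is an assumed example) can be summed without a scalar tail and without reading past the end.

#include <xmmintrin.h>
#include <stddef.h>

#define BUF_SIZE 256   /* power of two, divisible by the 4-float SSE width */

float sum_sse(const float buf[BUF_SIZE])
{
    __m128 acc = _mm_setzero_ps();
    for (size_t i = 0; i < BUF_SIZE; i += 4)          /* whole vectors only */
        acc = _mm_add_ps(acc, _mm_loadu_ps(buf + i));

    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}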
Hash Tables, Allocation by Pages
This really helps for hash tables, because you compute the index modulo the size, and if that size is a power of two, the modulus can be computed with a simple bitwise-and (&) rather than with a much slower divide-class instruction implementing the % operator. Looking at an old Intel i386 book, and is 2 cycles and div is 40 cycles. A disparity persists today due to the much greater fundamental complexity of division, even though the 1000x faster overall cycle times tend to hide the impact of even the slowest machine ops.
There was also a time when malloc overhead was occasionally avoided at great length. Allocations available directly from the operating system were (and still are) a specific number of pages, so a power of two would be likely to make the most use of the allocation granularity.
And, as others have noted, programmers like powers of two.
I can think of a few reasons off the top of my head. One of them: storing data as SIZE_BYTE+ARRAY, where the size byte tells you the size of the array, is the most efficient layout, and it means the array can be any size from 1 to 256.
Ok, those points came out a bit jumbled. Let me know if you need further explanation, especially point 4, which IMO is the most important.
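A minimal sketch of the SIZE_BYTE+ARRAY layout (the names and the length-minus-one encoding are my assumptions): storing the length minus one lets a single byte cover every size from 1 to 256, which is why a 256-byte array pairs naturally with an 8-bit prefix.

#include <stdint.h>
#include <string.h>

struct pbuf {
    uint8_t len_minus_one;   /* actual length is len_minus_one + 1, i.e. 1..256 */
    char    data[256];       /* 256 = the full range the size byte can express */
};

int pbuf_set(struct pbuf *p, const char *src, size_t len)
{
    if (len < 1 || len > sizeof p->data)
        return -1;                           /* only lengths 1..256 fit */
    memcpy(p->data, src, len);
    p->len_minus_one = (uint8_t)(len - 1);
    return 0;
}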
Because of the simplicity (read also: cost) of base 2 arithmetic in electronics: shift left (multiply by 2), shift right (divide by 2).
In the CPU domain, lots of constructs revolve around base 2 arithmetic. Buses (control & data) used to access memory structures are often aligned on powers of 2. The cost of implementing logic in electronics (e.g. a CPU) makes arithmetic in base 2 compelling.
Of course, if we had analog computers, the story would be different.
FYI: the attributes of a system sitting at layer X are a direct consequence of the attributes of the layers serving it below, i.e. layers < X. The reason I am stating this stems from some comments I received with regards to my posting.
E.g. the properties that can be manipulated at the "compiler" level are inherited & derived from the properties of the system below it, i.e. the electronics in the CPU.
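A tiny illustration of the shift point (mine): with power-of-two factors, multiplication and division reduce to single shifts, which is exactly the cheap base-2 hardware the answer alludes to.

#include <assert.h>

int main(void)
{
    unsigned x = 40;
    assert((x << 1) == x * 2);   /* shift left one bit   == multiply by 2 */
    assert((x >> 2) == x / 4);   /* shift right two bits == divide by 4   */
    return 0;
}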
I was going to use the shift argument, but couldn't think of a good reason to justify it.
One thing that is nice about a buffer that is a power of two is that circular buffer handling can use simple ands rather than divides.
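A minimal sketch of the idea (BUF_SIZE is an assumed power-of-two value): advancing the index takes one and where an arbitrary size would need the divide behind %.

#define BUF_SIZE 1024u                 /* power of two */
#define BUF_MASK (BUF_SIZE - 1u)

unsigned next_index(unsigned i)
{
    return (i + 1u) & BUF_MASK;        /* same result as (i + 1) % BUF_SIZE */
}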
If it weren't a power of two, a divide would be necessary. In the olden days (and currently on small chips) that mattered.
It's also common for page sizes to be powers of 2.
On Linux I like to use getpagesize() when doing something like chunking a buffer and writing it to a socket or file descriptor.
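A minimal sketch of that kind of chunked write, under the assumption of a POSIX system (the helper name is mine):

#include <unistd.h>
#include <stddef.h>

ssize_t write_in_pages(int fd, const char *buf, size_t len)
{
    size_t chunk = (size_t)getpagesize();   /* typically 4096, a power of two */
    size_t done = 0;

    while (done < len) {
        size_t n = len - done < chunk ? len - done : chunk;
        ssize_t w = write(fd, buf + done, n);
        if (w < 0)
            return -1;                      /* errno is set by write() */
        done += (size_t)w;
    }
    return (ssize_t)done;
}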
It makes a nice, round number in base 2, just as 10, 100 or 1000000 are nice, round numbers in base 10.
If it weren't a power of 2 (or something close, such as 96=64+32 or 192=128+64), then you could wonder why there's the added precision. A size that is not rounded in base 2 can come from external constraints or from programmer ignorance, and you'll want to know which one it is.
Other answers have pointed out a bunch of technical reasons as well that are valid in special cases. I won't repeat any of them here.
In hash tables, 2^n makes it easier to handle key collisions in a certain way. In general, when there is a key collision, you either make a substructure, e.g. a list, of all entries with the same hash value, or you find another free slot. You could just add 1 to the slot index until you find a free slot, but this strategy is not optimal, because it creates clusters of blocked places. A better strategy is to calculate a second hash number h2, so that gcd(n,h2)=1; then add h2 to the slot index until you find a free slot (with wraparound). If n is a power of 2, finding an h2 that fulfills gcd(n,h2)=1 is easy; every odd number will do.
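A minimal sketch of that probing scheme (TABLE_SIZE, the is_free array, and forcing the step odd are my assumptions): because the table size is a power of two, any odd step is coprime with it, so the probe sequence reaches every slot.

#include <stdint.h>
#include <stddef.h>

#define TABLE_SIZE 1024u                         /* power of two */

/* Assumes at least one free slot exists; is_free[] comes from the table code. */
size_t find_free_slot(uint32_t h1, uint32_t h2, const int *is_free)
{
    size_t idx  = h1 & (TABLE_SIZE - 1u);
    size_t step = h2 | 1u;                       /* odd step: gcd(TABLE_SIZE, step) = 1 */

    while (!is_free[idx])
        idx = (idx + step) & (TABLE_SIZE - 1u);  /* add h2 with wraparound */
    return idx;
}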