Why are data structure sizes usually a power of 2 (2^n)?

Published 2024-08-12 14:35:51

Is there a historical reason or something? I've seen quite a few times something like char foo[256]; or #define BUF_SIZE 1024. Even I mostly use 2^n-sized buffers, mainly because I think it looks more elegant and that way I don't have to think of a specific number. But I'm not quite sure if that's the reason most people use them; more information would be appreciated.

Answers (10)

盛装女皇 2024-08-19 14:35:51

There may be a number of reasons, although many people will as you say just do it out of habit.

One place where it is very useful is in the efficient implementation of circular buffers, especially on architectures where the % operator is expensive (those without a hardware divide, primarily 8-bit microcontrollers). By using a 2^n buffer in this case, the modulo is simply a matter of bit-masking the upper bits; or, in the case of say a 256-byte buffer, simply using an 8-bit index and letting it wrap around.

In other cases alignment with page boundaries, caches etc. may provide opportunities for optimisation on some architectures - but that would be very architecture specific. But it may just be that such buffers provide the compiler with optimisation possibilities, so all other things being equal, why not?
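As a sketch of that masking trick (the names `RING_SIZE`, `RING_MASK`, and `ring_put` are illustrative, not from any particular codebase), a power-of-two ring buffer can wrap its index with a single AND instead of a modulo:

```c
#include <stdint.h>

/* Size must be a power of two so that (index & MASK) == (index % SIZE). */
#define RING_SIZE 256u
#define RING_MASK (RING_SIZE - 1u)

static uint8_t ring[RING_SIZE];
static unsigned head = 0;

static void ring_put(uint8_t byte) {
    ring[head & RING_MASK] = byte;  /* bit-mask instead of a divide */
    head++;                         /* the free-running counter may exceed RING_SIZE */
}
```

With a 256-entry buffer, an 8-bit index would wrap by itself and even the AND disappears, which is the second trick the answer mentions.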

往事随风而去 2024-08-19 14:35:51

Cache lines are usually a power of 2 in size (often 32 or 64 bytes). Data that is an integral multiple of that number can fit into (and fully utilize) the corresponding number of cache lines. The more data you can pack into your cache, the better the performance, so I think people who design their structures that way are optimizing for this.

还在原地等你 2024-08-19 14:35:51

Another reason in addition to what everyone else has mentioned is, SSE instructions take multiple elements, and the number of elements input is always some power of two. Making the buffer a power of two guarantees you won't be reading unallocated memory. This only applies if you're actually using SSE instructions though.

I think in the end though, the overwhelming reason in most cases is that programmers like powers of two.
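The sizing point can be illustrated even without intrinsics: a power-of-two length divides evenly by any smaller power-of-two vector width, so a 4-wide loop needs no scalar cleanup tail. A minimal sketch (`N` and `sum4` are made-up names; a vectorizing compiler can map each group of four onto one SSE addition):

```c
#define N 1024   /* power of two: exactly divisible by the 4-wide step */

static float sum4(const float *buf) {
    float acc = 0.0f;
    /* N % 4 == 0, so there is no leftover-element cleanup loop,
       and no risk of the last vector load running past the buffer. */
    for (int i = 0; i < N; i += 4)
        acc += buf[i] + buf[i + 1] + buf[i + 2] + buf[i + 3];
    return acc;
}
```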

神经暖 2024-08-19 14:35:51

Hash Tables, Allocation by Pages

This really helps for hash tables, because you compute the index modulo the size, and if that size is a power of two, the modulus can be computed with a simple bitwise-and or & rather than using a much slower divide-class instruction implementing the % operator.
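A minimal sketch of that identity (`TABLE_SIZE` and `slot_for` are illustrative names): for a power-of-two size, hash % size and hash & (size - 1) give the same slot, but the AND compiles to a single cheap instruction:

```c
#include <stddef.h>

#define TABLE_SIZE 1024u   /* 2^10: must be a power of two for the mask to work */

static size_t slot_for(size_t hash) {
    return hash & (TABLE_SIZE - 1u);   /* same result as hash % TABLE_SIZE */
}
```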

Looking at an old Intel i386 book, the and instruction takes 2 cycles while div takes 40. The disparity persists today due to the much greater fundamental complexity of division, even though overall cycle times roughly 1000x faster tend to hide the impact of even the slowest machine ops.

There was also a time when malloc overhead was occasionally avoided at great length. Allocations available directly from the operating system were (and still are) a specific number of pages, so a power of two would be likely to make the most of the allocation granularity.

And, as others have noted, programmers like powers of two.

痴梦一场 2024-08-19 14:35:51

I can think of a few reasons off the top of my head:

  1. 2^n is a very common value in all of computer sizes. This is directly related to the way bits are represented in computers (2 possible values), which means variables tend to have ranges of values whose boundaries are 2^n.
  2. Because of the point above, you'll often find the value 256 as the size of a buffer. This is because a byte can hold 256 distinct values (0 to 255). So, if you want to store a string together with its size, it is most efficient to store it as SIZE_BYTE+ARRAY, where the size byte tells you the size of the array. This means the array can be any size from 0 to 255 (or 1 to 256, if you store length − 1).
  3. Many other times, sizes are chosen based on physical things (for example, the size of the memory an operating system can choose from is related to the size of the registers of the CPU etc) and these are also going to be a specific amount of bits. Meaning, the amount of memory you can use will usually be some value of 2^n (for a 32bit system, 2^32).
  4. There might be performance benefits/alignment issues for such values. Most processors can access a certain number of bytes at a time, so even if you have a variable whose size is, let's say, 20 bits, a 32-bit processor will still read 32 bits no matter what. So it's often more efficient to just make the variable 32 bits. Also, some processors require variables to be aligned to a certain number of bytes (because they can't read memory from, for example, odd addresses). Of course, sometimes it's not about odd memory locations, but locations that are multiples of 4, or 8, etc. So in these cases, it's more efficient to make buffers that will always be aligned.

Ok, those points came out a bit jumbled. Let me know if you need further explanation, especially point 4 which IMO is the most important.
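Point 2 above can be sketched as a length-prefixed ("Pascal-style") string; `pstring` and its fields are hypothetical names, not a standard type:

```c
#include <stdint.h>
#include <string.h>

/* One size byte followed by up to 255 bytes of data: the size byte can
   express every possible payload length, and the whole record is 256 bytes. */
typedef struct {
    uint8_t len;
    uint8_t data[255];
} pstring;

static void pstring_set(pstring *p, const char *s) {
    size_t n = strlen(s);
    if (n > 255) n = 255;   /* truncate to what the size byte can express */
    p->len = (uint8_t)n;
    memcpy(p->data, s, n);
}
```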

挽手叙旧 2024-08-19 14:35:51

Because of the simplicity (read also cost) of base 2 arithmetic in electronics: shift left (multiply by 2), shift right (divide by 2).

In the CPU domain, lots of constructs revolve around base-2 arithmetic. Buses (control & data) used to access memory structures are often aligned on power-of-2 boundaries. The cost of logic implementation in electronics (e.g. in a CPU) makes base-2 arithmetic compelling.

Of course, if we had analog computers, the story would be different.


FYI: the attributes of a system sitting at layer X are a direct consequence of the attributes of the systems sitting below it, i.e. layers < X. The reason I am stating this stems from some comments I received with regard to my posting.

E.g. the properties that can be manipulated at the "compiler" level are inherited & derived from the properties of the system below it i.e. the electronics in the CPU.
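The shift-for-multiply point in code (toy helper names): a left shift by k multiplies by 2^k, a right shift divides by 2^k, each a single cheap instruction for unsigned values:

```c
static unsigned times2(unsigned x) { return x << 1; }  /* x * 2 */
static unsigned half(unsigned x)   { return x >> 1; }  /* x / 2 (unsigned) */
```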

十雾 2024-08-19 14:35:51

I was going to use the shift argument, but couldn't think of a good reason to justify it.

One thing that is nice about a buffer that is a power of two is that circular buffer handling can use simple ands rather than divides:

#define BUFSIZE 1024

++index;                // increment the index.
index &= (BUFSIZE - 1);  // Make sure it stays in the buffer.

If it weren't a power of two, a divide would be necessary. In the olden days (and currently on small chips) that mattered.

伤感在游骋 2024-08-19 14:35:51

It's also common for pagesizes to be powers of 2.

On Linux I like to use getpagesize() when doing something like chunking a buffer and writing it to a socket or file descriptor.
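A sketch of that habit, using the POSIX sysconf(_SC_PAGESIZE) spelling of the same query (`chunk_size` is a made-up helper; the fallback value is an assumption, not something the API guarantees):

```c
#include <unistd.h>

/* Ask the OS for the page size at run time instead of hard-coding 4096;
   chunking I/O on this boundary keeps each chunk a whole number of pages. */
static long chunk_size(void) {
    long ps = sysconf(_SC_PAGESIZE);
    return ps > 0 ? ps : 4096;   /* fall back to a common default on error */
}
```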

很快妥协 2024-08-19 14:35:51

It makes a nice, round number in base 2, just as 10, 100, or 1000000 are nice, round numbers in base 10.

If it weren't a power of 2 (or something close, such as 96=64+32 or 192=128+64), then you might wonder why the extra precision is there. A size not rounded to base 2 can come from external constraints or from programmer ignorance, and you'll want to know which one it is.

Other answers have pointed out a bunch of technical reasons as well that are valid in special cases. I won't repeat any of them here.

独夜无伴 2024-08-19 14:35:51

In hash tables, 2^n makes it easier to handle key collisions in a certain way. In general, when there is a key collision, you either make a substructure, e.g. a list, of all entries with the same hash value; or you find another free slot. You could just add 1 to the slot index until you find a free slot, but this strategy is not optimal because it creates clusters of blocked places. A better strategy is to calculate a second hash number h2 such that gcd(n, h2) = 1; then add h2 to the slot index until you find a free slot (with wraparound). If n is a power of 2, finding an h2 that fulfills gcd(n, h2) = 1 is easy: every odd number will do.
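A sketch of that probe sequence (`SLOTS` and `probe` are illustrative names; the hash inputs are toy values): forcing the step to be odd guarantees it is coprime to a power-of-two table size, so n probes visit all n slots exactly once:

```c
#include <stddef.h>

#define SLOTS 16u   /* power of two, so any odd step is coprime to it */

/* Double hashing: slot for the given attempt number. */
static size_t probe(size_t h1, size_t h2, unsigned attempt) {
    size_t step = h2 | 1u;                    /* make the step odd */
    return (h1 + attempt * step) & (SLOTS - 1u);
}
```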
