How to manage bits/binary in C++?
What I need to do is open a text file of 0s and 1s and find patterns between the columns in the file.
My first thought was to parse each column into a big array of bools and then do the logic between the columns (now in arrays). Then I found that a bool actually occupies a byte, not a bit, so I would be wasting 7/8 of the memory by assigning each value to a bool.
Is that even relevant for a grid of 800x800 values? What would be the best way to handle this?
I would appreciate a code snippet in case it's a complicated answer.
Comments (4)
You could use std::bitset or Boost's dynamic_bitset, which provide different methods that will help you manage your bits.
For example, they support constructors that create bitsets from other built-in types such as int or char. You can also export a bitset to an unsigned long or to a string (which can then be turned into a bitset again, and so on).
I once asked about concatenating those, which turned out not to be possible in a performant way. But perhaps you could use the info in that question too.
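As a rough illustration of the std::bitset approach, here is a minimal sketch that parses rows of '0'/'1' characters into one bitset per column and compares two columns. The names kRows, Column, parse_columns and matching_rows are made up for this example, and the grid height is fixed at 4 only to keep it small; for the grid in the question it would be 800.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Assumed grid height for this sketch; std::bitset needs it at compile time.
constexpr std::size_t kRows = 4;
using Column = std::bitset<kRows>;

// Parse lines of '0'/'1' characters into one bitset per column.
// Bit r of column c holds the value found in row r, column c.
// Rows beyond kRows are ignored.
std::vector<Column> parse_columns(const std::string& text) {
    std::istringstream in(text);
    std::vector<Column> cols;
    std::string line;
    std::size_t row = 0;
    while (row < kRows && std::getline(in, line)) {
        if (cols.size() < line.size()) cols.resize(line.size());
        for (std::size_t c = 0; c < line.size(); ++c) {
            if (line[c] == '1') cols[c].set(row);
        }
        ++row;
    }
    return cols;
}

// Number of rows in which two columns hold the same value.
// XOR leaves a 1 wherever the columns differ; count() tallies those bits.
std::size_t matching_rows(const Column& a, const Column& b) {
    return kRows - (a ^ b).count();
}
```

Calling parse_columns("1010\n1110\n0011\n1010\n") gives four columns, and to_ulong() or to_string() on any of them exports it as described above. If the grid height is only known at run time, Boost's dynamic_bitset would be the closer fit.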
You can use std::vector<bool>, which is a specialization of vector that typically uses compact storage for booleans: 1 bit per value rather than 8.
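A minimal sketch of how std::vector<bool> might be used here, assuming the file holds one row per line with no separators between the digits. Grid, read_columns and is_complement are illustrative names, and "one column is the inverse of another" is just an example of a pattern, not necessarily the one the question is after.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>   // only needed for the in-memory usage shown below
#include <string>
#include <vector>

// grid[c][r] is the value in column c, row r.
using Grid = std::vector<std::vector<bool>>;

// Read rows of '0'/'1' characters and store them column by column.
// Short rows are padded with false so every column has the same length.
Grid read_columns(std::istream& in) {
    Grid cols;
    std::string line;
    std::size_t row = 0;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        if (cols.size() < line.size()) {
            // New columns start with 'row' false entries for the rows already read.
            cols.resize(line.size(), std::vector<bool>(row, false));
        }
        for (std::size_t c = 0; c < cols.size(); ++c) {
            cols[c].push_back(c < line.size() && line[c] == '1');
        }
        ++row;
    }
    return cols;
}

// True when column b is the bitwise inverse of column a.
bool is_complement(const std::vector<bool>& a, const std::vector<bool>& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t r = 0; r < a.size(); ++r) {
        if (a[r] == b[r]) return false;
    }
    return true;
}
```

For a real file, pass a std::ifstream to read_columns; a std::istringstream works the same way for quick experiments. Identical columns can be found with the ordinary == operator on two vectors.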
I think it was Knuth who said "premature optimization is the root of all evil." Let's find out a little bit more about the problem. Your array is 800**2 == 640,000 bytes, which is no big deal on anything more powerful than a digital watch.
Storing it as bytes may seem wasteful (as you say, 7/8ths of the memory is redundant), but on the other hand most machines don't operate on individual bits as efficiently as on bytes; by saving the memory, you might spend so much effort masking and testing that you would have been better off with the byte model.
On the other hand, if what you want to do is look for larger patterns, you might want a bitwise representation, because you can then operate on 8 bits at a time.
The real point here is that there are several possibilities, but no one can tell you the "right" representation without knowing what the problem is.
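To make the trade-off concrete, here is a small sketch that counts the rows in which two columns differ, once with one byte per cell and once with the cells packed into words. It packs 64 rows per word rather than the 8 mentioned above, which is an assumption of this example, and the function names are invented for illustration.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

// Byte model: one cell per byte, one comparison per row.
std::size_t differing_rows_bytes(const std::vector<std::uint8_t>& a,
                                 const std::vector<std::uint8_t>& b) {
    std::size_t diff = 0;
    for (std::size_t r = 0; r < a.size() && r < b.size(); ++r) {
        diff += (a[r] != b[r]);
    }
    return diff;
}

// Pack one column (one byte per cell) into 64-bit words, 64 rows per word.
std::vector<std::uint64_t> pack(const std::vector<std::uint8_t>& cells) {
    std::vector<std::uint64_t> words((cells.size() + 63) / 64, 0);
    for (std::size_t r = 0; r < cells.size(); ++r) {
        if (cells[r]) words[r / 64] |= std::uint64_t{1} << (r % 64);
    }
    return words;
}

// Bit model: XOR marks the differing rows of 64 rows at once,
// and std::bitset::count() tallies the set bits.
std::size_t differing_rows_packed(const std::vector<std::uint64_t>& a,
                                  const std::vector<std::uint64_t>& b) {
    std::size_t diff = 0;
    for (std::size_t w = 0; w < a.size() && w < b.size(); ++w) {
        diff += std::bitset<64>(a[w] ^ b[w]).count();
    }
    return diff;
}
```

Whole-column operations like this favor the packed form, while code that inspects single cells is simpler with bytes. Which one is faster for a given pattern search would have to be measured.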
For a grid of that size, your array of bools would be about 640KB. Whether that is a problem depends on how much memory you have. It would probably be the simplest representation for the logic analysis code.
By grouping the bits and storing them in an array of int, you could drop the memory requirement to 80KB, but the logic code would be more complicated, as you would always be isolating the bits you wanted to check.
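The grouping described above could look roughly like the following sketch. PackedGrid is a hypothetical class written for this example, and it uses std::uint32_t instead of plain int so that the word width is fixed; the shifting and masking in get and set is the extra "isolating" work.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: a rows x cols grid of bits stored in 32-bit words.
class PackedGrid {
public:
    PackedGrid(std::size_t rows, std::size_t cols)
        : cols_(cols), words_((rows * cols + kBits - 1) / kBits, 0u) {}

    // Isolate one bit: shift it down to position 0, then mask off the rest.
    bool get(std::size_t r, std::size_t c) const {
        const std::size_t i = r * cols_ + c;
        return ((words_[i / kBits] >> (i % kBits)) & 1u) != 0;
    }

    // Set or clear one bit without disturbing its neighbors in the word.
    void set(std::size_t r, std::size_t c, bool value) {
        const std::size_t i = r * cols_ + c;
        const std::uint32_t mask = std::uint32_t{1} << (i % kBits);
        if (value) {
            words_[i / kBits] |= mask;
        } else {
            words_[i / kBits] &= ~mask;
        }
    }

    // Bytes used for the bit storage itself.
    std::size_t bytes() const { return words_.size() * sizeof(std::uint32_t); }

private:
    static constexpr std::size_t kBits = 32;  // bits per std::uint32_t word
    std::size_t cols_;
    std::vector<std::uint32_t> words_;
};
```

For PackedGrid(800, 800), bytes() reports 80,000, which matches the 80KB figure: 640,000 bits divided by 8 bits per byte.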