Very compact bit array in Java
I'm looking for a very compact way of storing a dense, variable-length bit array in Java. Right now I'm using BitSet, but it seems to use on average 1.5*n bits of storage space for a bit vector of size n. Typically this isn't a problem, but in this case the bit arrays being stored are a pretty significant part of the memory footprint of the application. So it would really help to get them to be a little bit smaller.
The space required by BitSet seems to be due to the fact that the array of longs used to back the data structure tends to double each time it is expanded to hold more bits:
    // BitSet's resizing code
    private void ensureCapacity(int wordsRequired) {
        if (words.length < wordsRequired) {
            // Allocate larger of doubled size or required size
            int request = Math.max(2 * words.length, wordsRequired);
            words = Arrays.copyOf(words, request);
            sizeIsSticky = false;
        }
    }
I could write my own alternative implementation of BitSet that scales the backend data structure more conservatively. But, I'd really hate to duplicate functionality that is already in the standard class libraries if I don't have to.
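For anyone who wants to check the overhead on their own JVM, here is a minimal sketch that compares the logical length of a BitSet with the space it has actually allocated. It relies only on the public length() and size() methods. The class name BitSetCapacityDemo and the helper fillIncrementally are made up for this illustration, and the exact numbers printed depend on the JDK's growth policy, so none are quoted here.

```java
import java.util.BitSet;

public class BitSetCapacityDemo {
    /** Hypothetical helper: sets bits 0..nbits-1 one at a time so the backing array grows on demand. */
    static BitSet fillIncrementally(int nbits) {
        BitSet bits = new BitSet(); // no capacity hint, so growth is left to BitSet
        for (int i = 0; i < nbits; i++) {
            bits.set(i);
        }
        return bits;
    }

    public static void main(String[] args) {
        for (int n : new int[] {65, 100, 1000, 5000}) {
            BitSet bits = fillIncrementally(n);
            // length(): index of the highest set bit plus one.
            // size(): number of bits of space currently allocated.
            System.out.printf("logical length=%d, allocated bits=%d%n",
                    bits.length(), bits.size());
        }
    }
}
```

The gap between the two numbers is the over-allocation described above.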
Comments (2)
If you create the BitSet using the constructor BitSet(int nbits), you can specify the capacity. If you guess the capacity wrong and go over it, the backing array will double in size.
The BitSet class does have a trimToSize method, which is private and is called by writeObject and clone(). If you clone your object or serialize it, it will be trimmed to the correct length (assuming the class over-expanded it through the ensureCapacity method).
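A rough sketch of the clone() approach, assuming the trimming behaviour described in this answer (it is an implementation detail, not something the Javadoc promises). The class name BitSetTrimDemo and the helper allocatedAfterClone are invented for the example. As a separate, documented option that is not part of the answer, toLongArray() returns an array of exactly (length() + 63) / 64 words, and BitSet.valueOf(long[]) rebuilds a BitSet from it.

```java
import java.util.BitSet;

public class BitSetTrimDemo {
    /** Hypothetical helper: clones the set and reports how many bits the copy has allocated. */
    static int allocatedAfterClone(BitSet bits) {
        BitSet copy = (BitSet) bits.clone();
        return copy.size();
    }

    public static void main(String[] args) {
        BitSet bits = new BitSet();
        for (int i = 0; i < 5000; i++) {
            bits.set(i); // grows the backing array on demand
        }

        // Read size() before cloning: clone() may trim the original as well.
        int before = bits.size();
        int after = allocatedAfterClone(bits);
        System.out.println("allocated before clone: " + before + " bits");
        System.out.println("allocated after clone:  " + after + " bits");

        // Documented alternative: keep only the words in use as a plain long[].
        long[] words = bits.toLongArray();
        BitSet restored = BitSet.valueOf(words);
        System.out.println("words stored: " + words.length
                + ", round trip equal: " + restored.equals(bits));
    }
}
```

Holding a plain long[] for long-lived bit arrays and wrapping it in a BitSet only when operations are needed trades some convenience for a predictable footprint.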
You might benefit from compressed BitSet alternatives. See for example:
https://github.com/lemire/javaewah
http://roaringbitmap.org/