Creating a byte array in Java whose size is given by a long

Posted on 2024-07-26 11:59:37

I'm trying to create a byte array whose size is of type long. For example, think of it as:

long x = _________;
byte[] b = new byte[x]; 

Apparently you can only specify an int for the size of a byte array.

Before anyone asks why I would need a byte array so large: I need to encapsulate data in a message format that I did not design, and one of its message types has a length field that is an unsigned int (which needs a long in Java).

Is there a way to create this byte array?

If there's no way around it, I am thinking I could create a byte array output stream and keep feeding it bytes, but I don't know whether there's any restriction on the size of a byte array...

Comments (5)

无言温柔 2024-08-02 11:59:37

(It is probably a bit late for the OP, but it might still be useful for others)

Unfortunately Java does not support arrays with more than 2^31−1 elements. The maximum consumption is 2 GiB of space for a byte[] array, or 16 GiB of space for a long[] array.

While it is probably not applicable in this case, if the array is going to be sparse, you might be able to get away with using an associative data structure like a Map to match each used offset to the appropriate value. In addition, Trove provides a more memory-efficient implementation for storing primitive values than the standard Java collections.

If the array is not sparse and you really, really do need the whole blob in memory, you will probably have to use a two-dimensional structure, e.g. a Map matching each offset divided by 1024 (the chunk index) to the proper 1024-byte array, with the offset modulo 1024 giving the position inside that array. This approach might be more memory efficient even for sparse arrays, since adjacent filled cells can share the same Map entry.
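A minimal sketch of the chunked Map idea, assuming 1024-byte chunks keyed by chunk index. The class name SparseByteStore and its methods are hypothetical, chosen only for illustration, and unwritten offsets are assumed to read as 0:

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sparse byte store: 1024-byte chunks keyed by chunk index. */
final class SparseByteStore {
    private static final int CHUNK_SIZE = 1024;
    private final Map<Long, byte[]> chunks = new HashMap<>();

    /** Returns the byte at the given offset, or 0 if its chunk was never written. */
    byte get(long offset) {
        checkOffset(offset);
        byte[] chunk = chunks.get(offset / CHUNK_SIZE);
        return chunk == null ? 0 : chunk[(int) (offset % CHUNK_SIZE)];
    }

    /** Stores a byte, allocating the surrounding chunk on first write. */
    void set(long offset, byte value) {
        checkOffset(offset);
        byte[] chunk = chunks.computeIfAbsent(offset / CHUNK_SIZE, k -> new byte[CHUNK_SIZE]);
        chunk[(int) (offset % CHUNK_SIZE)] = value;
    }

    /** Number of chunks currently held in memory. */
    int allocatedChunks() {
        return chunks.size();
    }

    private static void checkOffset(long offset) {
        if (offset < 0) {
            throw new IndexOutOfBoundsException("negative offset: " + offset);
        }
    }
}
```

Writing to offsets 5000000000 and 5000000001 touches a single chunk, so only 1024 bytes are allocated even though the offsets are far beyond the int range.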

国际总奸 2024-08-02 11:59:37

A byte[] whose size is the maximum 32-bit signed integer would require 2 GB of contiguous address space. You shouldn't try to create such an array. If the size is not actually that large (and is merely held in a larger type), you can safely cast it to an int and use that to create the array.
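One way to make that cast safe is to check the range first. The sketch below uses Math.toIntExact (available since Java 8), which throws instead of silently truncating; the class and method names are hypothetical:

```java
/** Hypothetical helper for allocating a buffer whose length arrives as a long. */
final class LengthCheck {
    static byte[] allocate(long length) {
        if (length < 0) {
            throw new IllegalArgumentException("negative length: " + length);
        }
        // Math.toIntExact throws ArithmeticException when the value does not fit
        // in an int, whereas a plain (int) cast would silently wrap around.
        return new byte[Math.toIntExact(length)];
    }
}
```

Even when the value fits in an int, a very large allocation can still fail with an OutOfMemoryError depending on the JVM heap settings.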

假装不在乎 2024-08-02 11:59:37

You should probably be using one stream to read your data in and another to write it out. If you are going to need access to data later on in the file, save it. If you need access to something you haven't run into yet, you need a two-pass system where you run through once and store the "stuff you'll need for the second pass", then run through again.

Compilers work this way.
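A rough sketch of the streaming approach, assuming the payload length has already been read from the message header as a long. The class and method names are hypothetical; only a small fixed buffer is ever held in memory:

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper that forwards a payload of known length between streams. */
final class StreamCopy {
    static void copyExactly(InputStream in, OutputStream out, long length) throws IOException {
        byte[] buffer = new byte[8192];
        long remaining = length;
        while (remaining > 0) {
            // Never ask for more than the buffer holds or the payload still needs.
            int want = (int) Math.min(buffer.length, remaining);
            int read = in.read(buffer, 0, want);
            if (read < 0) {
                throw new EOFException("stream ended with " + remaining + " bytes missing");
            }
            out.write(buffer, 0, read);
            remaining -= read;
        }
    }
}
```

Because the length is tracked as a long, the same loop works for payloads larger than 2 GiB.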

The only case for loading in the entire array at once is if you have to repeatedly randomly access many locations throughout the array. If this is the case, I suggest you load it into multiple byte arrays all stored in a single container class.

The container class would have an array of byte arrays, but from outside all the accesses would seem contiguous. You would just ask for byte 49874329128714391837 and your class would divide your long index by the size of each byte array to calculate which array to access, then use the remainder to determine the byte within it.

It could also have methods to store and retrieve "chunks" that span byte-array boundaries, which would require creating a temporary copy. The cost of creating a few temporary arrays would be more than made up for by not having a single locked 2 GB block allocated, which I think could destroy your performance.

Edit: ps. If you really need the random access and can't use streams then implementing a containing class is a Very Good Idea. It will let you change the implementation on the fly from a single byte array to a group of byte arrays to a file-based system without any change to the rest of your code.
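The container class described in this answer might look roughly like the sketch below. The name ChunkedByteArray, the constructor parameters, and the read method are assumptions made for illustration, not an existing library API:

```java
/** Hypothetical container presenting several byte arrays as one long-indexed array. */
final class ChunkedByteArray {
    private final int chunkSize;
    private final long length;
    private final byte[][] chunks;

    ChunkedByteArray(long length, int chunkSize) {
        if (length < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("invalid length or chunk size");
        }
        this.length = length;
        this.chunkSize = chunkSize;
        long count = length / chunkSize + (length % chunkSize == 0 ? 0 : 1);
        chunks = new byte[Math.toIntExact(count)][];
        for (int i = 0; i < chunks.length; i++) {
            long remaining = length - (long) i * chunkSize;
            chunks[i] = new byte[(int) Math.min(chunkSize, remaining)];
        }
    }

    long length() {
        return length;
    }

    byte get(long index) {
        check(index);
        // The quotient selects the array, the remainder selects the byte in it.
        return chunks[(int) (index / chunkSize)][(int) (index % chunkSize)];
    }

    void set(long index, byte value) {
        check(index);
        chunks[(int) (index / chunkSize)][(int) (index % chunkSize)] = value;
    }

    /** Copies count bytes starting at index into a temporary array, crossing chunk boundaries as needed. */
    byte[] read(long index, int count) {
        if (index < 0 || count < 0 || index > length - count) {
            throw new IndexOutOfBoundsException("range out of bounds");
        }
        byte[] out = new byte[count];
        int copied = 0;
        while (copied < count) {
            long pos = index + copied;
            byte[] chunk = chunks[(int) (pos / chunkSize)];
            int within = (int) (pos % chunkSize);
            int n = Math.min(count - copied, chunk.length - within);
            System.arraycopy(chunk, within, out, copied, n);
            copied += n;
        }
        return out;
    }

    private void check(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index out of bounds: " + index);
        }
    }
}
```

Callers only see get, set, read, and length, so the backing storage could later be swapped for a single array or a file without changing the calling code.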

一抹微笑 2024-08-02 11:59:37

It's not of immediate help, but creating arrays with larger sizes (indexed by long) was proposed as a language change for Java 7. Check out the Project Coin proposals for more info. The proposal was not adopted, so array sizes are still limited to int.

私野 2024-08-02 11:59:37

One way to "store" the array is to write it to a file and then access it (if you need to access it like an array) using a RandomAccessFile. The API for that class uses a long rather than an int as the index into the file. It will be slower, but much easier on memory.

This applies when you can't extract what you need during the initial input scan.
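A small sketch of array-like access through RandomAccessFile. The wrapper class FileBackedBytes is hypothetical; seek, readByte, writeByte, setLength, and length are the standard RandomAccessFile methods:

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical wrapper giving long-indexed byte access backed by a file. */
final class FileBackedBytes implements AutoCloseable {
    private final RandomAccessFile file;

    FileBackedBytes(String path, long length) throws IOException {
        file = new RandomAccessFile(path, "rw");
        // Contents of a newly extended region are not defined by the API,
        // so write a position before relying on what is read from it.
        file.setLength(length);
    }

    byte get(long index) throws IOException {
        file.seek(index);
        return file.readByte();
    }

    void set(long index, byte value) throws IOException {
        file.seek(index);
        file.writeByte(value);
    }

    long length() throws IOException {
        return file.length();
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

Each access costs a seek plus a one-byte I/O call, so code that reads many neighbouring bytes would probably want to read a block at a time instead.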
