Converting to column-oriented arrays in Java

Posted 2024-08-30 13:24:20

Although I have Java in the title, this could be for any OO language.
I'd like to know a few new ideas to improve the performance of something I'm trying to do.

I have a method that constantly receives an Object[] array. I need to split the objects in this array across multiple arrays (Lists or something similar), so that I end up with an independent list for each column of all the arrays the method receives.

Example:

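import java.util.ArrayList;
import java.util.List;

// One inner list per column; initialization of the inner lists is omitted.
List<List<Object>> columnOriented = new ArrayList<List<Object>>();

public void newObject(Object[] obj) {
    for (int i = 0; i < obj.length; i++) {
        columnOriented.get(i).add(obj[i]);
    }
}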

Note: For simplicity I've omitted the initialization of objects and stuff.

The code I've shown above is slow of course. I've already tried a few other things, but would like to hear some new ideas.

How would you do this knowing it's very performance sensitive?

EDIT:

I've tested a few things and found that:

Instead of using ArrayList (or any other Collection), I wrapped an Object[] array in another object to store each individual column. When this array reaches its capacity, I create another array of twice the size and copy the contents from one to the other using System.arraycopy. Surprisingly (at least for me), this is faster than using ArrayList to store the inner columns...
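The growable array described in the edit could look roughly like the sketch below. The class names Column and ColumnStore, the initial capacity of 16, and the assumption that every row has the same length are illustrative choices, not details from the question.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical growable column backed by a plain Object[]. */
final class Column {
    private Object[] data;
    private int size;

    Column(int initialCapacity) {
        data = new Object[Math.max(1, initialCapacity)];
    }

    void add(Object value) {
        if (size == data.length) {
            // Capacity reached: allocate twice the size and copy the old contents.
            Object[] bigger = new Object[data.length * 2];
            System.arraycopy(data, 0, bigger, 0, size);
            data = bigger;
        }
        data[size++] = value;
    }

    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return data[index];
    }

    int size() {
        return size;
    }
}

/** Hypothetical holder that splits incoming rows into columns. */
final class ColumnStore {
    private final List<Column> columns = new ArrayList<>();

    // Assumes all rows have the same length; a column created late would
    // otherwise not line up with the row indices of earlier columns.
    void newObject(Object[] row) {
        while (columns.size() < row.length) {
            columns.add(new Column(16)); // 16 is an arbitrary starting capacity
        }
        for (int i = 0; i < row.length; i++) {
            columns.get(i).add(row[i]);
        }
    }

    Column column(int i) {
        return columns.get(i);
    }

    int columnCount() {
        return columns.size();
    }
}
```

ArrayList grows its backing array in a similar way, so any speed difference between the two is worth confirming with a benchmark on the real workload rather than taking from this sketch.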

白云不回头 2024-09-06 13:24:21

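The answer depends on the data and the usage profile. How much data do you hold in such collections? What is the ratio of reads to writes (a write being the addition of an object array)? This affects which structure is better for the inner lists, as well as many other possible optimizations.

The fastest way to copy data is to avoid copying it at all. If you know that the calling code will not modify the obj array afterwards (this is an important condition), one possible trick is to implement a custom List class to use as the inner list. Internally, you store a shared List of the received arrays, and each call simply adds the new array to that list. The custom inner list class knows which column it represents (call it n), and when asked for the item at position m, it transposes m and n and queries the internal structure for internalArray.get(m)[n]. This implementation is unsafe, because the restriction on the caller is easy to forget, but it may be faster under some conditions (and slower under others).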
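A minimal sketch of that transposed-view idea, assuming the caller never modifies a row after handing it over. The class name RowBackedTable and the method column are hypothetical; the view is built on java.util.AbstractList, which only requires get and size for a read-only list.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical table that keeps the received rows and exposes columns as views. */
final class RowBackedTable {
    // Shared storage: every received array is kept as-is, nothing is copied.
    private final List<Object[]> rows = new ArrayList<>();

    // The caller must not modify the array after this call.
    void newObject(Object[] row) {
        rows.add(row);
    }

    /** Read-only live view of column n; element m is rows.get(m)[n]. */
    List<Object> column(final int n) {
        return new AbstractList<Object>() {
            @Override
            public Object get(int m) {
                return rows.get(m)[n];
            }

            @Override
            public int size() {
                return rows.size();
            }
        };
    }
}
```

Writes cost a single list append per row, while each read pays for two indexed lookups. The view also reflects rows added after it was created, which may or may not be what a reader of the column expects.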

水中月 2024-09-06 13:24:21

I would try using LinkedList for the inner lists, since it should have better insertion performance. Wrapping the Object array in a collection and using addAll might also help.

剑心龙吟 2024-09-06 13:24:21

ArrayList may be slow because it copies its internal array when it grows (it uses an approach similar to your self-written collection).

As an alternative, you could simply store the rows at first and create the columns only when necessary. This way, copying of the lists' internal arrays is reduced to a minimum.

Example:

// Note: a LinkedList also works for rows, since no index-based access is used.
List<Object[]> rows = ...

List<List<Object>> columns;

public void processColumns() {
  columns = new ArrayList<List<Object>>();
  for (Object[] aRow : rows) {

    // Add a column for each position that has not been seen yet.
    while (aRow.length > columns.size()) {
      // Pre-sizing to the row count means the ArrayList never has to grow and copy.
      List<Object> newColumn = new ArrayList<Object>(rows.size());
      columns.add(newColumn);
    }

    for (int i = 0; i < aRow.length; i++) {
      columns.get(i).add(aRow[i]);
    }
  }
}

Depending on the number of columns, the outer list may still copy its internal array as it grows, but a typical table contains far more rows than columns, so that array should stay small.
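A variant of the same rows-first idea, sketched under the assumption that plain arrays are acceptable as the column representation: because the row count is known at conversion time, each column can be allocated once at its exact size. The class name RowBuffer and the method toColumns are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical buffer that stores rows and transposes them on demand. */
final class RowBuffer {
    private final List<Object[]> rows = new ArrayList<>();

    void newObject(Object[] row) {
        rows.add(row);
    }

    /** Returns columns[c][r]; cells missing from shorter rows stay null. */
    Object[][] toColumns() {
        // The widest row determines the number of columns.
        int width = 0;
        for (Object[] row : rows) {
            width = Math.max(width, row.length);
        }
        // Each column is allocated exactly once, sized to the row count.
        Object[][] columns = new Object[width][rows.size()];
        int r = 0;
        for (Object[] row : rows) {
            for (int c = 0; c < row.length; c++) {
                columns[c][r] = row[c];
            }
            r++;
        }
        return columns;
    }
}
```

This trades incremental column access for a single bulk transposition, so it fits best when columns are read only after a batch of rows has arrived.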