Deleting rows of a numpy array based on uniqueness of a value

Posted on 2024-07-12 11:13:13

Let's say I have a two-dimensional array like this:

numpy.array(
    [[0,1,1.2,3],
    [1,5,3.2,4],
    [3,4,2.8,4], 
    [2,6,2.3,5]])

I want to build an array by eliminating whole rows so that the values in the last column are unique, choosing which row to keep based on the value in the third column.
For example, in this case I would like to keep only one of the rows with 4 in the last column, namely the one with the smaller value in the third column, giving a result like this:

array([[0,1,1.2,3],
       [3,4,2.8,4],
       [2,6,2.3,5]])

thus eliminating the row [1,5,3.2,4].

What would be the best way to do this?

Comments (2)

浮光之海 2024-07-19 11:13:13

My numpy is rusty, but this should work:

import numpy

# n is the input 2-D array from the question.
# keepers maps each last-column value to (row index, third-column value)
# for the row currently kept for that value.
keepers = {}
deletions = []
for i, row in enumerate(n):
    key = row[3]
    if key not in keepers:
        keepers[key] = (i, row[2])
    elif row[2] > keepers[key][1]:
        # The kept row has a smaller third-column value; drop this one.
        deletions.append(i)
    else:
        # This row is smaller (or tied); drop the previously kept row.
        deletions.append(keepers[key][0])
        keepers[key] = (i, row[2])
o = numpy.delete(n, deletions, axis=0)

I've greatly simplified it from my declarative solution, which was getting quite unwieldy. Hopefully this is easier to follow; all we do is maintain a dictionary of values that we want to keep and a list of indexes we want to delete.
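
A variation on the same dictionary idea, offered only as a sketch: instead of collecting indexes to delete, track the index of the best row per key and select those rows at the end. The helper name `keep_smallest_per_key` and its `key_col` / `value_col` parameters are hypothetical, not part of numpy. Note one assumed difference from the loop above: on a tie in the third column this version keeps the first row seen rather than the last.

```python
import numpy as np

def keep_smallest_per_key(arr, key_col=-1, value_col=2):
    """For each distinct value in key_col, keep the row with the smallest value_col."""
    best = {}  # key value -> index of the row currently kept for that key
    for i, row in enumerate(arr):
        key = row[key_col]
        if key not in best or row[value_col] < arr[best[key], value_col]:
            best[key] = i
    # Sorting the surviving indexes preserves the original row order.
    return arr[sorted(best.values())]

n = np.array([[0, 1, 1.2, 3],
              [1, 5, 3.2, 4],
              [3, 4, 2.8, 4],
              [2, 6, 2.3, 5]])
o = keep_smallest_per_key(n)
print(o)
```

Because the rows are picked by index rather than deleted, the result stays in the original row order even when the key column is unsorted.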

慕烟庭风 2024-07-19 11:13:13

This can be achieved efficiently in NumPy by combining lexsort and unique, as follows:

import numpy as np

a = np.array([[0, 1, 1.2, 3], 
              [1, 5, 3.2, 4],
              [3, 4, 2.8, 4], 
              [2, 6, 2.3, 5]])

# Sort by the last column, breaking ties by the 3rd column
# (lexsort treats the last key, here the last column, as the primary one)
j = np.lexsort(a.T)

# Find first occurrence (=smallest 3rd column) of unique values in last column
k = np.unique(a[j, -1], return_index=True)[1]

print(a[j[k]])

This returns the desired result:

[[ 0.   1.   1.2  3. ]
 [ 3.   4.   2.8  4. ]
 [ 2.   6.   2.3  5. ]]
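
Passing `a.T` to lexsort works here because the tie-breaking column happens to sit right before the key column. When that is not the case, the sort keys can be passed explicitly. The following is a minimal sketch under that assumption; the function name `unique_rows_by_key` and its parameters are hypothetical. It also sorts the selected indexes so the rows come back in their original order rather than ordered by the key column.

```python
import numpy as np

def unique_rows_by_key(a, key_col=-1, value_col=2):
    """Keep one row per distinct key_col value: the one with the smallest value_col."""
    # lexsort uses the LAST key as the primary sort key, so key_col goes last.
    order = np.lexsort((a[:, value_col], a[:, key_col]))
    # In the sorted view, the first occurrence of each key has the smallest value_col.
    _, first = np.unique(a[order, key_col], return_index=True)
    # Map back to original row indexes and restore the original row order.
    kept = np.sort(order[first])
    return a[kept]

a = np.array([[0, 1, 1.2, 3],
              [1, 5, 3.2, 4],
              [3, 4, 2.8, 4],
              [2, 6, 2.3, 5]])
result = unique_rows_by_key(a)
print(result)
```

Dropping the `np.sort` call would return the rows ordered by the key column instead, which matches the behaviour of the snippet above.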