numpy recarray indexing based on intersection with an external array

Posted 2024-10-21 01:33:23

I'm trying to subset the records in a numpy.recarray based on the values that one of the recarray's fields has in common with an external array. For example,

import numpy as np

a = np.array([(10, 'Bob', 145.7), (20, 'Sue', 112.3), (10, 'Jim', 130.5)],
        dtype=[('id', 'i4'), ('name', 'S10'), ('weight', 'f8')])
a = a.view(np.recarray)

b = np.array([10,30])

I want to take the intersection of a.id and b to determine which records to pull from the recarray, so that I get back:

(10, 'Bob', 145.7)
(10, 'Jim', 130.5)

Naively, I tried:

common = np.intersect1d(a.id, b)
subset = a[common]

but of course that doesn't work, because intersect1d returns the common values rather than row positions, and there is no a[10]. I also tried building a reverse dict from the id field to the row index and subsetting from there, e.g.

id_x_index = {}
ids = a.id
indexes = np.arange(a.size)
for (id, index) in zip(ids, indexes):
    id_x_index[id] = index

subset_indexes = np.sort([id_x_index[x] for x in ids if x in b])
print(a[subset_indexes])

but if a.id has duplicates, later entries overwrite earlier ones in id_x_index, so in this case I get

(10, 'Jim', 130.5)
(10, 'Jim', 130.5)

I know I'm overlooking some simple way to get the appropriate indices into the recarray. Thanks for any help.
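Editor's note: the reverse-dict attempt above loses rows because each id can hold only one index. A minimal sketch of one way to repair it, assuming the same a and b as in the question, is to map every id to a list of row positions. The names id_to_rows and wanted are illustrative, and the name field uses 'U10' instead of 'S10' here only so that names print as plain strings on Python 3.

```python
from collections import defaultdict

import numpy as np

a = np.array(
    [(10, 'Bob', 145.7), (20, 'Sue', 112.3), (10, 'Jim', 130.5)],
    dtype=[('id', 'i4'), ('name', 'U10'), ('weight', 'f8')],
).view(np.recarray)
b = np.array([10, 30])

# Map each id to *all* row positions that carry it, so duplicates survive.
id_to_rows = defaultdict(list)
for row, id_value in enumerate(a.id.tolist()):
    id_to_rows[id_value].append(row)

# Deduplicate b first so a repeated value in b does not repeat rows.
wanted = set(b.tolist())

# Gather the rows for every wanted value that actually occurs in a.id.
subset_indexes = sorted(
    row for value in wanted for row in id_to_rows.get(value, [])
)
subset = a[subset_indexes]
print(subset)
```

This keeps both records with id 10. It is still a Python-level loop, so for large arrays a vectorized membership test is likely the better fit.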


带上头具痛哭 2024-10-28 01:33:23

The most concise way to do this in NumPy is

subset = a[np.in1d(a.id, b)]
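Editor's note: a self-contained sketch of the same idea, under a few assumptions. It uses np.isin, which NumPy 1.13 and later provide as the recommended equivalent of np.in1d (np.in1d is deprecated as of NumPy 2.0). The name field uses 'U10' instead of 'S10' only so that names print as plain strings on Python 3. np.flatnonzero is added to show how to recover integer row positions, in case indices rather than a mask are needed.

```python
import numpy as np

a = np.array(
    [(10, 'Bob', 145.7), (20, 'Sue', 112.3), (10, 'Jim', 130.5)],
    dtype=[('id', 'i4'), ('name', 'U10'), ('weight', 'f8')],
).view(np.recarray)
b = np.array([10, 30])

# Boolean mask: True where the record's id appears anywhere in b.
mask = np.isin(a.id, b)        # [True, False, True]
subset = a[mask]

# Integer row positions, if indices are needed instead of a mask.
rows = np.flatnonzero(mask)    # [0, 2]

print(subset)
```

Because the mask is computed per record, duplicate ids in a.id are all kept, which is what the dict-based attempt in the question lost.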


一抹淡然 2024-10-28 01:33:23

For those with an older version of NumPy, you can also do it this way:

subset = a[np.array([i in b for i in a.id])]
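Editor's note: a runnable sketch of this list-comprehension variant, with one hedged tweak. Testing `i in b` against a NumPy array scans b for every record, so this version converts b to a Python set first. The name wanted is illustrative, and 'U10' replaces 'S10' only so that names print as plain strings on Python 3.

```python
import numpy as np

a = np.array(
    [(10, 'Bob', 145.7), (20, 'Sue', 112.3), (10, 'Jim', 130.5)],
    dtype=[('id', 'i4'), ('name', 'U10'), ('weight', 'f8')],
).view(np.recarray)
b = np.array([10, 30])

# Plain Python ints in a set give hash-based membership tests.
wanted = set(b.tolist())

# One boolean per record; dtype=bool keeps the mask valid even if a is empty.
mask = np.array([i in wanted for i in a.id.tolist()], dtype=bool)
subset = a[mask]
print(subset)
```

The result should match the vectorized membership test. The trade-off is a Python-level loop over a.id, which will be slower on large arrays.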


About the author: 无所谓啦
