Indexing a numpy recarray based on its intersection with an external array
I'm trying to subset the records in a numpy.recarray based on the values that one of the recarray's fields has in common with an external array. For example,
import numpy as np
a = np.array([(10, 'Bob', 145.7), (20, 'Sue', 112.3), (10, 'Jim', 130.5)],
             dtype=[('id', 'i4'), ('name', 'S10'), ('weight', 'f8')])
a = a.view(np.recarray)
b = np.array([10, 30])
I want to take the intersection of a.id and b to determine what records to pull from the recarray, so that I get back:
(10, 'Bob', 145.7)
(10, 'Jim', 130.5)
Naively, I tried:
common = np.intersect1d(a.id, b)
subset = a[common]
but of course that doesn't work because there is no a[10]. I also tried to do this by creating a reverse dict from the id field to the index and subsetting from there, e.g.
id_x_index = {}
ids = a.id
indexes = np.arange(a.size)
for (id, index) in zip(ids, indexes):
    id_x_index[id] = index
subset_indexes = np.sort([id_x_index[x] for x in ids if x in b])
print(a[subset_indexes])
but then the dict values in id_x_index get overwritten whenever a.id has duplicates, so in this case I get
(10, 'Jim', 130.5)
(10, 'Jim', 130.5)
I know I'm overlooking some simple way to get the appropriate indices into the recarray. Thanks for help.
Comments (2)
The most concise way to do this in Numpy is
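The code that followed this sentence did not survive on this page, so the snippet below is only a sketch of one concise approach, not necessarily what the commenter posted. It assumes a boolean membership mask is what is wanted and builds it with np.isin (available since NumPy 1.13; older releases offer np.in1d for the same purpose). The names mask and subset are illustrative.

```python
import numpy as np

a = np.array([(10, 'Bob', 145.7), (20, 'Sue', 112.3), (10, 'Jim', 130.5)],
             dtype=[('id', 'i4'), ('name', 'S10'), ('weight', 'f8')])
a = a.view(np.recarray)
b = np.array([10, 30])

# Boolean mask: True wherever the record's id occurs anywhere in b
mask = np.isin(a.id, b)

# Boolean indexing keeps every matching record, duplicates included,
# in their original order
subset = a[mask]
print(subset)
```

Unlike np.intersect1d, which returns the shared values themselves, the mask has one entry per record, so both records with id 10 are selected.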
And for those who have an older version of numpy, you can also do it this way:
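The commenter's code is missing here as well. As a hedged stand-in, the sketch below builds the same kind of boolean mask with a plain list comprehension, which does not depend on any set-membership helper in NumPy and so should work on old releases too. The names b_values, mask, and subset are hypothetical.

```python
import numpy as np

a = np.array([(10, 'Bob', 145.7), (20, 'Sue', 112.3), (10, 'Jim', 130.5)],
             dtype=[('id', 'i4'), ('name', 'S10'), ('weight', 'f8')])
a = a.view(np.recarray)
b = np.array([10, 30])

# Put the external values in a set so each membership test is O(1)
b_values = set(b.tolist())

# One boolean per record: is this record's id among the external values?
mask = np.array([x in b_values for x in a.id.tolist()], dtype=bool)

# Boolean indexing returns all matching records, including repeated ids
subset = a[mask]
print(subset)
```

This is slower than a vectorized membership test on large arrays, but it sidesteps the overwritten-key problem of the reverse dict because nothing is keyed by id.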