Equivalent of a named tuple in NumPy?

Posted on 2024-12-05 14:04:29

Is it possible to create a NumPy object that behaves very much like a collections.namedtuple, in the sense that elements can be accessed like so:

data[1] = 42
data['start date'] = '2011-09-20'  # Slight generalization of what is possible with a namedtuple

I tried to use a structured (compound) data type:

>>> data = numpy.empty(shape=tuple(), dtype=[('start date', 'S11'), ('n', int)])

This creates a 0-dimensional value with a kind of namedtuple type; it almost works:

>>> data['start date'] = '2011-09-20'
>>> data
array(('2011-09-20', -3241474627884561860), 
      dtype=[('start date', '|S11'), ('n', '<i8')])

However, element access does not work, because the "array" is 0-dimensional:

>>> data[0] = '2011-09-20'
Traceback (most recent call last):
  File "<ipython-input-19-ed41131430b9>", line 1, in <module>
    data[0] = '2011-09-20'
IndexError: 0-d arrays can't be indexed.

Is there a way of obtaining the desired behavior described above (item assignment through both a string and an index) with a NumPy object?

无声无音无过去 2024-12-12 14:04:29

You can do something like this using the numpy.rec module. What you need is the record class from this module, but I don't know how to create an instance of it directly. One indirect way is to first create a recarray with a single entry and take that entry:

>>> a = numpy.recarray(1, names=["start date", "n"], formats=["S11", "i4"])[0]
>>> a[0] = "2011-09-20"
>>> a[1] = 42
>>> a
('2011-09-20', 42)
>>> a["start date"]
'2011-09-20'
>>> a.n
42

If you figure out how to create an instance of record directly, please let me know.
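
A slightly shorter indirect route may be `numpy.rec.array`, which builds a record array from a list of tuples; taking element 0 then yields a `numpy.record`. This is only a minimal sketch, not a direct constructor. It assumes Python 3 and swaps the `S11` byte-string field for a Unicode field (`U10`) so that plain `str` values compare equal; the variable names are illustrative.

```python
import numpy as np

# Field layout mirrors the answer above. "U10" (Unicode) is an assumption
# made for this sketch so that str values round-trip on Python 3.
dt = np.dtype([("start date", "U10"), ("n", "i4")])

# Build a one-element record array from a list of tuples, then take the record.
rec = np.rec.array([("2011-09-20", 0)], dtype=dt)[0]

rec[1] = 42                        # assignment by position
rec["start date"] = "2011-09-21"   # assignment by field name

print(type(rec).__name__)          # name of the scalar type
print(rec[0], rec["n"], rec.n)     # access by position, by name, by attribute
```

Attribute access (`rec.n`) only works for field names that are valid Python identifiers, so "start date" still has to be reached by key or by position.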

嗼ふ静 2024-12-12 14:04:29

(Edited, as EOL recommended, to answer the question more specifically.)

Create a 0-dimensional array (I did not find a scalar constructor either):

>>> data0 = np.array(('2011-09-20', 0), dtype=[('start date', 'S11'), ('n', int)])
>>> data0.ndim
0

Access the element of the 0-dimensional array:

>>> type(data0[()])
<class 'numpy.void'>
>>> data0[()][0]
b'2011-09-20'
>>> data0[()]['start date']
b'2011-09-20'

>>> # There is also an item() method, but it returns the element as a plain Python tuple
>>> type(data0.item())
<class 'tuple'>

I think the easiest approach is to think of structured arrays (or recarrays) as lists or arrays of tuples: indexing by name selects a column, and indexing by integer selects a row.

>>> tupleli = [('2011-09-2%s' % i, i) for i in range(5)]
>>> tupleli
[('2011-09-20', 0), ('2011-09-21', 1), ('2011-09-22', 2), ('2011-09-23', 3), ('2011-09-24', 4)]
>>> dt = [('start date', '|S11'), ('n', np.int64)]
>>> dt
[('start date', '|S11'), ('n', <class 'numpy.int64'>)]

Zero-dimensional array whose single element is a tuple, i.e. one record. (Correction: the array itself is not a scalar element; see the end of this answer.)

>>> data1 = np.array(tupleli[0], dtype=dt)
>>> data1.shape
()
>>> data1['start date']
array(b'2011-09-20', 
      dtype='|S11')
>>> data1['n']
array(0, dtype=int64)

array with one element

>>> data2 = np.array([tupleli[0]], dtype=dt)
>>> data2.shape
(1,)
>>> data2[0]
(b'2011-09-20', 0)

1d array

>>> data3 = np.array(tupleli, dtype=dt)
>>> data3.shape
(5,)
>>> data3[2]
(b'2011-09-22', 2)
>>> data3['start date']
array([b'2011-09-20', b'2011-09-21', b'2011-09-22', b'2011-09-23',
       b'2011-09-24'], 
      dtype='|S11')
>>> data3['n']
array([0, 1, 2, 3, 4], dtype=int64)

Direct indexing into a single record, the same as in EOL's example; I did not know that this works:

>>> data3[2][1]
2
>>> data3[2][0]
b'2011-09-22'

>>> data3[2]['n']
2
>>> data3[2]['start date']
b'2011-09-22'

Trying to understand EOL's example: a scalar element and a zero-dimensional array are different things.

>>> type(data1)
<class 'numpy.ndarray'>
>>> type(data1[()])   #get element out of 0-dim array
<class 'numpy.void'>

>>> data1[0]
Traceback (most recent call last):
  File "<pyshell#98>", line 1, in <module>
    data1[0]
IndexError: 0-d arrays can't be indexed
>>> data1[()][0]
b'2011-09-20'

>>> data1.ndim
0
>>> data1[()].ndim
0

(Note: I accidentally typed the examples into an open Python 3.2 interpreter, which is why the strings are shown as b'...'.)
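
The distinction between the 0-dimensional array and the scalar it holds can be condensed into one short script. This is a sketch under the assumption of Python 3 with a Unicode field (`U10`) in place of `S11`, so the values come back as `str` rather than `bytes`; the helper variable names are made up for illustration.

```python
import numpy as np

dt = np.dtype([("start date", "U10"), ("n", np.int64)])
data0 = np.array(("2011-09-20", 0), dtype=dt)   # 0-dimensional structured array

# Integer indexing on the 0-d array itself is rejected.
try:
    data0[0]
    index_error_raised = False
except IndexError:
    index_error_raised = True

rec = data0[()]                  # extract the structured scalar (numpy.void)
by_position = rec[0]             # works on the scalar, not on the array
by_name = rec["start date"]
as_tuple = data0.item()          # plain Python tuple, no field names

print(index_error_raised, type(rec).__name__, by_position, by_name, as_tuple)
```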

苍白女子 2024-12-12 14:04:29

OK, I found a solution, but I would love to see a more elegant one:

data = numpy.empty(shape=1, dtype=[('start date', 'S11'), ('n', int)])[0]

creates a 1-dimensional array with a single element and gets the element. This makes accessing elements work with both strings and numerical indices:

>>> data['start date'] = '2011-09-20'  # Contains a space: more flexible than a namedtuple!
>>> data[1] = 123
>>> data
('2011-09-20', 123)

It would be nice if there were a way to construct data directly, without first creating an array with one element and extracting that element. Since

>>> type(data)
<type 'numpy.void'>

I'm not sure what NumPy constructor could be called… (there is no docstring for numpy.void).
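
Until a direct constructor turns up, the one-element trick can at least be hidden behind a small function. The sketch below is an illustration only: `make_record` is a hypothetical helper, not a NumPy function. It uses `numpy.zeros` rather than `numpy.empty` so that unassigned fields start at zero instead of holding arbitrary memory, and a Unicode field (`U10`) is assumed so that `str` values compare equal on Python 3.

```python
import numpy as np

dt = np.dtype([("start date", "U10"), ("n", np.int64)])

def make_record(dtype):
    """Hypothetical helper: return one zero-initialised structured scalar."""
    # The scalar keeps a reference to the temporary one-element array,
    # so it stays valid after this function returns.
    return np.zeros(1, dtype=dtype)[0]

data = make_record(dt)
data["start date"] = "2011-09-20"   # assignment by field name
data[1] = 123                       # assignment by position

print(type(data).__name__, data.item())
```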

她说她爱他 2024-12-12 14:04:29

This is nicely implemented by "Series" in the Pandas package.

For example from the tutorial:

>>> from pandas import Series
>>> import numpy as np
>>> s = Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])
>>> s
a    -0.125628696947
b    0.0942011098937
c    -0.71375003803
d    -0.590085433392
e    0.993157363933
>>> s[1]
0.094201109893723267
>>> s['b']
0.094201109893723267

I've just been playing around with this for a few days, but it looks like it has a lot to offer.
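
For item assignment through both a label and a position, a version of this idea might look like the sketch below. As far as I know, recent pandas releases deprecate treating an integer key as a position on a label-indexed Series, so this sketch uses `.iloc` for positional access instead of `s[1]`; the values are made up for illustration.

```python
import pandas as pd

# A small label-indexed Series; the numbers are arbitrary.
s = pd.Series([1.5, 2.5, 3.5], index=["a", "b", "c"])

s["b"] = 42.0        # assignment by label
s.iloc[2] = -1.0     # assignment by position

by_label = s["b"]
by_position = s.iloc[1]

print(by_label, by_position, s["c"])
```

Unlike a structured NumPy scalar, a Series has a single dtype for all of its values, so mixing a date string and an integer would fall back to the generic object dtype.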
