How do I filter a numpy.ndarray by date?

Posted on 2024-09-16 06:18:50

I have a 2d numpy.array, where the first column contains datetime.datetime objects, and the second column integers:

A = array([[2002-03-14 19:57:38, 197],
       [2002-03-17 16:31:33, 237],
       [2002-03-17 16:47:18, 238],
       [2002-03-17 18:29:31, 239],
       [2002-03-17 20:10:11, 240],
       [2002-03-18 16:18:08, 252],
       [2002-03-23 23:44:38, 327],
       [2002-03-24 09:52:26, 334],
       [2002-03-25 16:04:21, 352],
       [2002-03-25 18:53:48, 353]], dtype=object)

What I would like to do is select all rows for a specific date, something like

A[first_column.date()==datetime.date(2002,3,17)]
array([[2002-03-17 16:31:33, 237],
           [2002-03-17 16:47:18, 238],
           [2002-03-17 18:29:31, 239],
           [2002-03-17 20:10:11, 240]], dtype=object)

How can I achieve this?

Thanks for your insight :)

Comments (2)

萌无敌 2024-09-23 06:18:50

You could do this:

import datetime

# Half-open range [from_date, to_date): midnight on the 17th is included,
# midnight on the 18th is excluded.
from_date=datetime.datetime(2002,3,17,0,0,0)
to_date=from_date+datetime.timedelta(days=1)
idx=(A[:,0]>=from_date) & (A[:,0]<to_date)
print(A[idx])
# array([[2002-03-17 16:31:33, 237],
#        [2002-03-17 16:47:18, 238],
#        [2002-03-17 18:29:31, 239],
#        [2002-03-17 20:10:11, 240]], dtype=object)

A[:,0] is the first column of A.

Unfortunately, comparing A[:,0] with a datetime.date object raises a TypeError. However, comparison with a datetime.datetime object works:

In [63]: A[:,0]>datetime.datetime(2002,3,17,0,0,0)
Out[63]: array([False,  True,  True,  True,  True,  True,  True,  True,  True,  True], dtype=bool)
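If you really want the date-level equality test from the question, one option is to call .date() on each element yourself and build the mask in plain Python. This is only a sketch: the four-row array below is a shortened stand-in for the question's data, and the names target, mask and same_day are mine.

```python
import datetime
import numpy as np

# Shortened stand-in for the question's array (assumed sample data).
A = np.array([
    [datetime.datetime(2002, 3, 14, 19, 57, 38), 197],
    [datetime.datetime(2002, 3, 17, 16, 31, 33), 237],
    [datetime.datetime(2002, 3, 17, 20, 10, 11), 240],
    [datetime.datetime(2002, 3, 18, 16, 18, 8), 252],
], dtype=object)

target = datetime.date(2002, 3, 17)

# Compare date to date element by element, then wrap the result in a boolean array.
mask = np.array([d.date() == target for d in A[:, 0]], dtype=bool)
same_day = A[mask]
print(same_day)
```

The loop runs in Python rather than in NumPy, so it will be slower on large arrays, but it never compares a datetime.datetime with a datetime.date.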

Also, unfortunately,

datetime.datetime(2002,3,17,0,0,0)<A[:,0]<=datetime.datetime(2002,3,18,0,0,0)

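raises an error too. When this answer was written, the cause was that the expression calls datetime.datetime's __lt__ method instead of the numpy array's __lt__ method. Independently of that, Python evaluates a chained comparison a < x <= b as (a < x) and (x <= b), and NumPy raises a ValueError when an array with more than one element is used as a truth value, so a chained comparison cannot work elementwise on an array.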

Anyway, it's not hard to work around; you can say

In [69]: (A[:,0]>datetime.datetime(2002,3,17,0,0,0)) & (A[:,0]<=datetime.datetime(2002,3,18,0,0,0))
Out[69]: array([False,  True,  True,  True,  True, False, False, False, False, False], dtype=bool)

Since this gives you a boolean array, you can use it as a "fancy index" to A, which yields the desired result.
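Putting the pieces together, here is a self-contained sketch you can paste and run. The helper name rows_on_date is hypothetical (not part of NumPy), and the array is rebuilt from the values shown in the question.

```python
import datetime
import numpy as np

dt = datetime.datetime
A = np.array([
    [dt(2002, 3, 14, 19, 57, 38), 197],
    [dt(2002, 3, 17, 16, 31, 33), 237],
    [dt(2002, 3, 17, 16, 47, 18), 238],
    [dt(2002, 3, 17, 18, 29, 31), 239],
    [dt(2002, 3, 17, 20, 10, 11), 240],
    [dt(2002, 3, 18, 16, 18, 8), 252],
    [dt(2002, 3, 23, 23, 44, 38), 327],
    [dt(2002, 3, 24, 9, 52, 26), 334],
    [dt(2002, 3, 25, 16, 4, 21), 352],
    [dt(2002, 3, 25, 18, 53, 48), 353],
], dtype=object)

def rows_on_date(arr, day):
    """Return the rows of arr whose first-column datetime falls on `day`."""
    start = datetime.datetime.combine(day, datetime.time.min)  # midnight of `day`
    end = start + datetime.timedelta(days=1)                   # midnight of the next day
    col = arr[:, 0]
    mask = (col >= start) & (col < end)  # half-open range [start, end)
    return arr[mask]

selected = rows_on_date(A, datetime.date(2002, 3, 17))
print(selected)
```

Passing a datetime.date and converting it to a datetime.datetime inside the helper keeps every elementwise comparison between datetime.datetime objects, which is the case that works.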

败给现实 2024-09-23 06:18:50
from datetime import datetime as dt, timedelta as td
import numpy as np

# Create a 2-d numpy array: two rows for today, two for yesterday, one for tomorrow
d1 = dt.now()
d2 = dt.now()
d3 = dt.now() - td(1)
d4 = dt.now() - td(1)
d5 = d1 + td(1)
arr = np.array([[d1, 1], [d2, 2], [d3, 3], [d4, 4], [d5, 5]], dtype=object)

# Here we will extract all the data for today, so get the date range as datetimes
dtx = d1.replace(hour=0, minute=0, second=0, microsecond=0)  # today at midnight
dty = dtx + td(hours=24)                                     # tomorrow at midnight

# Condition: dtx <= value < dty
cond = np.logical_and(arr[:, 0] >= dtx, arr[:, 0] < dty)

# Full array
print(arr)
# Extracted array for the range
print(arr[cond, :])
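A variation on the same idea, sketched under the assumption that you are free to build a second, typed array: convert the first column to NumPy's datetime64 type, truncate it to whole days, and compare against a single date. The sample rows and the names stamps, days and picked are illustrative only.

```python
import datetime
import numpy as np

# Assumed sample data in the same shape as the question's array.
arr = np.array([
    [datetime.datetime(2002, 3, 14, 19, 57, 38), 197],
    [datetime.datetime(2002, 3, 17, 16, 31, 33), 237],
    [datetime.datetime(2002, 3, 17, 20, 10, 11), 240],
    [datetime.datetime(2002, 3, 18, 16, 18, 8), 252],
], dtype=object)

# Build a typed datetime64 column from the Python datetime objects.
stamps = np.array(list(arr[:, 0]), dtype="datetime64[us]")

# Truncate every timestamp to day resolution.
days = stamps.astype("datetime64[D]")

# A vectorized equality test against one calendar day.
mask = days == np.datetime64("2002-03-17")
picked = arr[mask]
print(picked)
```

This keeps the comparison inside NumPy instead of comparing Python objects one by one, at the cost of an extra converted copy of the column.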