How do I plot multiple daily time series, aligned at a specified trigger time?
The Problem:
I have a dataframe df that looks like this:
value msg_type
date
2022-03-15 08:15:10+00:00 122 None
2022-03-15 08:25:10+00:00 125 None
2022-03-15 08:30:10+00:00 126 None
2022-03-15 08:30:26.542134+00:00 127 ANNOUNCEMENT
2022-03-15 08:35:10+00:00 128 None
2022-03-15 08:40:10+00:00 122 None
2022-03-15 08:45:09+00:00 127 None
2022-03-15 08:50:09+00:00 133 None
2022-03-15 08:55:09+00:00 134 None
....
2022-03-16 09:30:09+00:00 132 None
2022-03-16 09:30:13.234425+00:00 135 ANNOUNCEMENT
2022-03-16 09:35:09+00:00 130 None
2022-03-16 09:40:09+00:00 134 None
2022-03-16 09:45:09+00:00 135 None
2022-03-16 09:50:09+00:00 134 None
The value data occurs in roughly 5-minute intervals, but messages can occur at any time. I am trying to plot one line of values per day, where the x-axis ranges from t=-2 hours to t=+8 hours, and the ANNOUNCEMENT occurs at t=0 (see image below).
So, for example, if an ANNOUNCEMENT occurs at 8:30AM on 3/15 and again at 9:30AM on 3/16, there should be two lines:
- one line for 3/15 that plots data from 6:30AM to 4:30PM, and
- one line for 3/16 that plots data from 7:30AM to 5:30PM,
both sharing the same x-axis ranging from -2 to +8, with ANNOUNCEMENT at t=0.
What I've Tried:
I am able to do this currently by finding the index position of an announcement (e.g. say it occurs at row 298 -> announcement_index = 298), generating an array of 120 numbers from -24 to 96 (representing 10 hours at 5 minutes per number -> x = np.arange(-24, 96, 1)), then plotting
sns.lineplot(x, y=df['value'].iloc[announcement_index-24:announcement_index+96])
While this does mostly work (see image below), I suspect it's not the correct way to go about it. Specifically, adding more info to the plot at specific times (like a different set of 'value' markers) is difficult, because I would need to convert each timestamp into this arbitrary -24 to 96 value range.
How can I make this same plot but by utilizing the datetime index instead? Thank you so much!
Comments (1)
Assuming the index has already been converted with to_datetime, create an IntervalArray that spans from -2H to +8H around each index timestamp. Then for each ANNOUNCEMENT, plot the window from interval.left to interval.right:
- set the x-axis to seconds since the ANNOUNCEMENT
- set the tick labels to hours since the ANNOUNCEMENT
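The two steps above can be sketched roughly as follows, without any plotting yet. This is a minimal sketch under assumptions: the sample data is synthetic (made-up values on a 5-minute grid plus the two announcement timestamps from the question), and the names dl, dr, ann_times, and windows are illustrative rather than taken from the answer.

```python
import numpy as np
import pandas as pd

# Synthetic data shaped like the question's df: values every 5 minutes,
# plus two ANNOUNCEMENT rows at the timestamps quoted in the question.
regular = pd.date_range("2022-03-15 06:00:10", "2022-03-16 18:00:10",
                        freq="5min", tz="UTC")
ann_times = pd.DatetimeIndex(
    ["2022-03-15 08:30:26.542134", "2022-03-16 09:30:13.234425"], tz="UTC"
)
index = regular.union(ann_times)  # sorted union of both sets of timestamps
df = pd.DataFrame({"value": 120 + np.arange(len(index)) % 15}, index=index)
df["msg_type"] = None
df.loc[ann_times, "msg_type"] = "ANNOUNCEMENT"

# Step 1: one interval per row, from -2H to +8H around the row's timestamp.
dl, dr = -2, 8
left = df.index + pd.Timedelta(hours=dl)
right = df.index + pd.Timedelta(hours=dr)
df["interval"] = pd.arrays.IntervalArray.from_arrays(left, right)

# Step 2: for each ANNOUNCEMENT, slice its window and re-index it
# as hours since the announcement, so t=0 is the announcement itself.
windows = {}
for ann in df.loc[df["msg_type"] == "ANNOUNCEMENT"].itertuples():
    window = df.loc[ann.interval.left:ann.interval.right, "value"].copy()
    window.index = (window.index - ann.Index).total_seconds() / 3600
    windows[ann.Index.date()] = window
```

Because every window is slice-selected by timestamp rather than by row position, extra markers at specific times can be placed on the same axis by subtracting the announcement time in the same way.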
Here is the output with a smaller window -1H to +2H just so we can see the small sample data more clearly (full code below):
Full code:
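A minimal sketch of what such full code might look like, using the sample rows shown in the question and the smaller -1H to +2H window. The names rows, dl, dr, and deltas, the axis labels, and the choice of the non-interactive Agg backend are assumptions made for illustration, not necessarily the answer's original code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove to use an interactive backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Sample rows copied from the question (the elided "...." rows are not filled in).
rows = [
    ("2022-03-15 08:15:10+00:00", 122, None),
    ("2022-03-15 08:25:10+00:00", 125, None),
    ("2022-03-15 08:30:10+00:00", 126, None),
    ("2022-03-15 08:30:26.542134+00:00", 127, "ANNOUNCEMENT"),
    ("2022-03-15 08:35:10+00:00", 128, None),
    ("2022-03-15 08:40:10+00:00", 122, None),
    ("2022-03-15 08:45:09+00:00", 127, None),
    ("2022-03-15 08:50:09+00:00", 133, None),
    ("2022-03-15 08:55:09+00:00", 134, None),
    ("2022-03-16 09:30:09+00:00", 132, None),
    ("2022-03-16 09:30:13.234425+00:00", 135, "ANNOUNCEMENT"),
    ("2022-03-16 09:35:09+00:00", 130, None),
    ("2022-03-16 09:40:09+00:00", 134, None),
    ("2022-03-16 09:45:09+00:00", 135, None),
    ("2022-03-16 09:50:09+00:00", 134, None),
]
times, values, msg_types = zip(*rows)
df = pd.DataFrame(
    {"value": list(values), "msg_type": list(msg_types)},
    # Parse each timestamp on its own, since some carry fractional seconds.
    index=pd.DatetimeIndex([pd.Timestamp(t) for t in times], name="date"),
)

# Interval from -1H to +2H around every row's timestamp.
dl, dr = -1, 2
df["interval"] = pd.arrays.IntervalArray.from_arrays(
    df.index + pd.Timedelta(hours=dl), df.index + pd.Timedelta(hours=dr)
)

fig, ax = plt.subplots()
for ann in df.loc[df["msg_type"] == "ANNOUNCEMENT"].itertuples():
    # Extract the rows between interval.left and interval.right.
    window = df.loc[ann.interval.left:ann.interval.right, ["value"]].copy()
    # Re-index as seconds since the announcement.
    window.index = (window.index - ann.Index).total_seconds()
    window.plot(ax=ax, y="value", label=str(ann.Index.date()))

# Ticks are positioned in seconds but labelled in hours since the announcement.
deltas = np.arange(dl, dr + 1)
ax.set_xticks(deltas * 3600)
ax.set_xticklabels(deltas)
ax.set_xlabel("hours since ANNOUNCEMENT")
ax.set_ylabel("value")
ax.legend()
# fig.savefig("aligned.png")  # hypothetical filename; or plt.show() interactively
```

Keeping the x-axis in seconds means any other timestamped series can be overlaid by subtracting the same announcement time, with no conversion to row offsets.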