Joining datasets that do not share timestamp values

Posted 2025-01-19 22:46:37 | 720 characters | 1 view | 0 comments

I have two different CSV files that correspond to a person's HRV (csv no. 1) and their emotions (csv no. 2). The first dataset uses UNIX timestamps to record the HRV values; the other records the person's emotions every 5 seconds while they were watching themselves.

Since the emotions are captured every five seconds and the HRV values every second, I want to iterate through the rows of the HRV dataset and create a new one (or just a new column, whatever works) that contains the mean of each set of 5 rows. For example, the mean of the first 5 rows corresponds to the first emotion, the mean of the next 5 rows to the next emotion, and so on.

I want to do this so that I can eventually link the two datasets with each other.

Any ideas on how to do that?

Unfortunately, I cannot provide an easily reproducible code snippet since the dataset is not mine to share. However, I can show with a few screenshots what my datasets look like:

This is the dataset with the HRV values:

[screenshot of the HRV dataset]

And this is the dataset with the emotion values:

[screenshot of the emotion dataset]


Comments (1)

谈下烟灰 2025-01-26 22:46:37

It would be good if you could provide data to test with, even if it is not real.
I created some sample data with the following code:

import numpy as np
import pandas as pd
# 50 one-second timestamps with a random-walk value column
dates = pd.date_range('2016-10-01', periods=50, freq='s')
df = pd.DataFrame({'value': 100 + np.random.randint(-5, 10, 50).cumsum()}, index=dates)
df.head()

[screenshot of the df.head() output]

I think resample from pandas could be useful. See the offset aliases in the pandas documentation for the available frequency strings.

df.resample('5s').mean().head()

[screenshot of the resampled output]
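In the question the HRV timestamps are UNIX epoch values rather than a DatetimeIndex, so they would need converting before resample can be used. A minimal sketch, assuming hypothetical column names `timestamp` (epoch seconds) and `hrv` and made-up values, since the real data was not shared:

```python
import pandas as pd

# Hypothetical HRV data: one reading per second, epoch-second timestamps.
hrv = pd.DataFrame({
    "timestamp": range(1_600_000_000, 1_600_000_010),
    "hrv": [60, 62, 61, 63, 64, 70, 71, 69, 72, 68],
})

# Convert epoch seconds to datetimes and use them as the index.
hrv["timestamp"] = pd.to_datetime(hrv["timestamp"], unit="s")
hrv = hrv.set_index("timestamp")

# Mean of each 5-second bin; each bin is labelled by its start time.
hrv_5s = hrv.resample("5s").mean()
print(hrv_5s)
```

If the timestamps are in milliseconds rather than seconds, `unit="ms"` would be the value to pass instead.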

Note that in my example the timestamp is the index. I used the mean as the aggregate, but I don't know which value you would like to use. After this, you can simply merge the data.
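One possible way to do that merge, if the emotion timestamps do not line up exactly with the 5-second bins, is `pd.merge_asof`. This is only a sketch: the column names (`timestamp`, `emotion`, `hrv`, `bin_start`) and all values are hypothetical, and it assumes each emotion row should be paired with the HRV bin that starts at or just before it.

```python
import pandas as pd

# Hypothetical resampled HRV means, one row per 5-second bin.
hrv_5s = pd.DataFrame(
    {"hrv": [62.0, 70.0, 66.5]},
    index=pd.date_range("2016-10-01 00:00:00", periods=3, freq="5s"),
)

# Hypothetical emotion log, every 5 seconds but offset by one second.
emotions = pd.DataFrame({
    "timestamp": pd.date_range("2016-10-01 00:00:01", periods=3, freq="5s"),
    "emotion": ["neutral", "happy", "sad"],
})

# merge_asof needs both sides sorted by their key. Each emotion row is
# matched to the latest HRV bin starting at or before it, within 5 seconds.
merged = pd.merge_asof(
    emotions.sort_values("timestamp"),
    hrv_5s.rename_axis("bin_start").reset_index(),
    left_on="timestamp",
    right_on="bin_start",
    direction="backward",
    tolerance=pd.Timedelta("5s"),
)
print(merged)
```

If the two sets of timestamps match exactly, a plain `hrv_5s.join(emotions.set_index("timestamp"))` would probably be enough.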
