How to add a new column and aggregate values in R
I am completely new to gnuplot and am only trying this because I need to learn it. I have values in three columns: the first is the filename (date and time, at one-hour intervals), and the remaining two represent two different entities, Prop1 and Prop2.
Datetime Prop1 Prop2
20110101_0000.txt 2 5
20110101_0100.txt 2 5
20110101_0200.txt 2 5
...
20110101_2300.txt 2 5
20110201_0000.txt 2 5
20110101_0100.txt 2 5
...
20110201_2300.txt 2 5
...
I need to aggregate the data by the hour of the day (the **_0100 part), which is encoded in the last four numeric digits. So I want to create another column called Hour that gives the hour of the day, meaning 0000 = 0h, 0100 = 1h, ... 2200 = 22h, etc.
I then want to get the sum of Prop1 and Prop2 for each hour, so that in the end I get something like:
Hour Prop1 Prop2
0h 120 104
1h 230 160
...
10h 90 110
...
23h 100 200
and then get a line plot of Prop1 and Prop2.
Comments (1)
A general solution is to use gsub to extract the time part of Datetime into a new Hour column.
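The answer's own snippet is not shown here, so the following is only a minimal sketch of what a gsub-based extraction could look like. The sample rows and the exact regular expression are assumptions made for illustration; only the column names come from the question.

```r
# Hypothetical sample rows in the same layout as the question's data.
Data <- data.frame(
  Datetime = c("20110101_0000.txt", "20110101_0100.txt", "20110201_2300.txt"),
  Prop1 = c(2, 2, 2),
  Prop2 = c(5, 5, 5),
  stringsAsFactors = FALSE
)

# Keep only the digits between the underscore and the ".txt" suffix.
# The pattern is an assumption; it relies on every name ending in _<digits>.txt.
Data$Hour <- gsub("^.*_([0-9]+)\\.txt$", "\\1", Data$Datetime)

print(Data)
```

With this pattern Hour holds the full four-digit time as a character string, which is why a further substr() step is needed to reduce it to the hour.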
EDIT: You can use Data$Hour <- substr(Data$Hour,1,2) to get just the hour. As said in the comments, if Datetime always has exactly the same structure, you could apply substr() to Datetime directly instead of going through gsub.
Then you can use aggregate, tapply, by, ... whatever suits what you want to do. To sum both Prop1 and Prop2 for each hour, you can use aggregate with the new Hour column as the grouping variable.
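As a rough sketch of the substr() and aggregate steps described above, under stated assumptions: the sample values are made up, the character positions 10 and 11 assume every name follows the YYYYMMDD_HHMM.txt layout from the question, and the result name Hourly is hypothetical.

```r
# Hypothetical sample data: two days, three hours each.
Data <- data.frame(
  Datetime = c("20110101_0000.txt", "20110101_0100.txt", "20110101_0200.txt",
               "20110201_0000.txt", "20110201_0100.txt", "20110201_0200.txt"),
  Prop1 = c(2, 3, 4, 10, 20, 30),
  Prop2 = c(5, 6, 7, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Characters 10-11 of "YYYYMMDD_HHMM.txt" are the hour of the day.
# Converting to integer makes the groups sort numerically (0, 1, 2, ...).
Data$Hour <- as.integer(substr(Data$Datetime, 10, 11))

# Sum Prop1 and Prop2 within each hour, across all days.
Hourly <- aggregate(cbind(Prop1, Prop2) ~ Hour, data = Data, FUN = sum)

print(Hourly)
```

The formula interface is just one way to call aggregate; passing the two columns and a grouping list works as well. From there, something like matplot(Hourly$Hour, Hourly[, c("Prop1", "Prop2")], type = "l") would draw both series as lines.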