Make every year in a large dataset contain 12 months
I have a large data.frame of variables regarding salmon farming (2005-2020). It contains data from hundreds of different farms (org_anonym) for all 15 years. However, many farms are missing some months or have duplicate months. How can I write this so that every year for every location has 12 months in the order 1-12?
Example:
In this example, farm 126 is missing the 12th month of the year for 2005, whereas 2006 has only the 11th and 12th month. Sometimes the same year has two consecutive rows with the same month.
My desired outcome is to have all locations have years 2005-2020 with months 1-12 without duplicates or missing months (the data in the filled rows can be 0 or NA).
I don't have an intuitive way of doing this since the errors are random.
Please help :)
If you want org_acronym to also be carried over, just change the above to distinct(year, acronym). If you want all the years from 2005:2020, just change the above to
Output:
Input:
Here is a similar example where I only work with a "6-month year" so it's more readable. It's easier to sort things out with small examples.
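The example code itself is not reproduced here, so the following is only a minimal sketch of the expand-then-join idea on a "6-month year". The toy data, the value column, and the names full_grid, deduped and result are hypothetical; only org_anonym, year and month come from the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one farm, a "6-month year", with gaps and one duplicated month.
df <- tibble(
  org_anonym = c(126L, 126L, 126L, 126L, 126L),
  year       = c(2005L, 2005L, 2005L, 2006L, 2006L),
  month      = c(1L, 2L, 2L, 4L, 5L),
  value      = c(10, 20, 20, 40, 50)  # hypothetical measurement column
)

# 1. Build every farm/year/month combination that should exist.
full_grid <- expand_grid(
  org_anonym = unique(df$org_anonym),
  year       = 2005:2006,
  month      = 1:6
)

# 2. Drop repeated farm/year/month rows, keeping the first of each.
deduped <- df %>%
  distinct(org_anonym, year, month, .keep_all = TRUE)

# 3. Left-join the data onto the grid; months with no data get NA.
result <- full_grid %>%
  left_join(deduped, by = c("org_anonym", "year", "month")) %>%
  arrange(org_anonym, year, month)
```

For the real data the same steps should apply with year = 2005:2020 and month = 1:12. Because the grid is built from explicit ranges, it does not matter that no row in the toy data has month 6.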
A shorter way is to use the function complete(), but when it is given bare column names it only expands the values that already occur in the data, so your data frame must contain at least one occurrence of each year and each month. In my example this won't quite work, since I don't have any year with the "sixth" month. Also, complete() is only a wrapper around expand() and a join (dplyr::full_join()), so it's better for you to understand what happens in the first solution.
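As a rough sketch of the complete() route: passing explicit ranges for year and month, instead of bare column names, sidesteps the need for every value to be present in the data. The toy data and the value column are hypothetical; org_anonym, year and month follow the question.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two farms, sparse months, one duplicated row.
df <- tibble(
  org_anonym = c(126L, 126L, 126L, 300L),
  year       = c(2005L, 2005L, 2006L, 2005L),
  month      = c(11L, 11L, 12L, 1L),
  value      = c(5, 5, 7, 9)  # hypothetical measurement column
)

result <- df %>%
  # Keep one row per farm/year/month.
  distinct(org_anonym, year, month, .keep_all = TRUE) %>%
  # Expand to every farm x year x month; new rows get NA in value.
  complete(org_anonym, year = 2005:2020, month = 1:12) %>%
  arrange(org_anonym, year, month)
```

If zeros are preferred over NA in the filled rows, complete() also accepts a fill argument, for example fill = list(value = 0).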