How do I select the first row of an R data frame that meets certain criteria?
Here is the context:
I have a data frame with five columns: `pixel`, `year`, `propvar`, `component`, and `cumsum`.
There are 1,225 combinations of `pixel` and `year`, because the data were computed from the annual time series of 49 geographic pixels for each of 25 study years. Within each pixel-year, I have computed `propvar`, the proportion of total variance explained by a given component of the fast Fourier transform for the time series of a given pixel-year. I then computed `cumsum`, which is the cumulative sum of `propvar` over the frequency components within a pixel-year. The `component` column just gives an index for the Fourier series component (plus 1) from which `propvar` was calculated.
I want to determine the number of components required to explain greater than 99% of the variance. I figure one way to do this is to find the first row within each pixel-year where `cumsum` > 0.99, and create a data frame from it with three columns, `pixel`, `year`, and `numbercomps`, where `numbercomps` is the number of components required within a given pixel-year to explain greater than 99% of the variance. I do not know how to do this in R. Does anyone have a solution?
Comments (2)
Sure. Something like this should do the trick:
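The answer's code did not survive in this copy, so what follows is not the answerer's code. It is a minimal base R sketch of one way to take the first row per pixel-year where `cumsum` exceeds 0.99. The toy data (2 pixels, 2 years, 4 components each) and the `propvar` values are invented for illustration, and the sketch assumes that `component` counts components starting from 1, so it can be reported directly as `numbercomps`.

```r
# Hypothetical toy data: 2 pixels x 2 years, 4 components per pixel-year.
# expand.grid varies the first argument fastest, so rows are grouped by
# pixel, then year, then component.
df <- expand.grid(component = 1:4, year = 2001:2002, pixel = 1:2)
df$propvar <- c(0.90,  0.07,  0.025, 0.005,   # pixel 1, 2001
                0.995, 0.003, 0.001, 0.001,   # pixel 1, 2002
                0.50,  0.30,  0.15,  0.05,    # pixel 2, 2001
                0.60,  0.395, 0.004, 0.001)   # pixel 2, 2002

# Cumulative proportion of variance within each pixel-year.
df$cumsum <- ave(df$propvar, df$pixel, df$year, FUN = cumsum)

# Keep rows past the threshold, order them, then keep the first per group.
hits  <- df[df$cumsum > 0.99, ]
hits  <- hits[order(hits$pixel, hits$year, hits$component), ]
first <- hits[!duplicated(hits[c("pixel", "year")]), ]

result <- data.frame(pixel       = first$pixel,
                     year        = first$year,
                     numbercomps = first$component)
print(result)
```

`duplicated()` flags every row after the first for each `pixel`/`year` pair, so negating it keeps exactly one row per pixel-year. A pixel-year that never exceeds 0.99 would simply be absent from `result`.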
EDIT Also, for those interested in `data.table`, there is this:
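The `data.table` code is also missing from this copy. As a hedged sketch only (not the answerer's version), the same per-group selection could look roughly like this. It assumes the `data.table` package is installed, and it uses the same invented toy values as any small example would; `numbercomps` is again taken to be the `component` index.

```r
library(data.table)  # assumption: the data.table package is installed

# Hypothetical toy data. CJ() builds a sorted cross join in which the
# last column (component) varies fastest.
dt <- CJ(pixel = 1:2, year = 2001:2002, component = 1:4)
dt[, propvar := c(0.90,  0.07,  0.025, 0.005,   # pixel 1, 2001
                  0.995, 0.003, 0.001, 0.001,   # pixel 1, 2002
                  0.50,  0.30,  0.15,  0.05,    # pixel 2, 2001
                  0.60,  0.395, 0.004, 0.001)]  # pixel 2, 2002

# Cumulative proportion of variance within each pixel-year.
dt[, cumsum := cumsum(propvar), by = .(pixel, year)]

# Filter to rows past the threshold, then take the first component per group.
result <- dt[cumsum > 0.99,
             .(numbercomps = component[1]),
             by = .(pixel, year)]
print(result)
```

Filtering in `i` before grouping means `component[1]` is the first qualifying component in each pixel-year, provided the rows are already ordered by `component` within each group, as they are here.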
Let's assume df is the dataset from which we have to select the first row that meets the criteria.
This two-line code will give you the required row.
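The two lines themselves are not preserved here. One plausible two-line form, offered as a sketch rather than as the answerer's code, uses `which()` to find the first matching index. The data frame below is a made-up single pixel-year, and this version looks at the whole data frame, not at each group separately.

```r
# Hypothetical single pixel-year; the values are invented for illustration.
df <- data.frame(component = 1:4,
                 cumsum    = c(0.90, 0.97, 0.995, 1.00))

idx       <- which(df$cumsum > 0.99)[1]  # index of the first match (NA if none)
first_row <- df[idx, ]                   # the first row meeting the condition
print(first_row)
```

If no row meets the condition, `idx` is `NA` and `df[idx, ]` returns a row of `NA` values, so it is worth checking `is.na(idx)` before using the result.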