Density values for each return
I have a dataframe "foo" looking like this
Date Return
1998-01-01 0.02
1998-01-02 0.04
1998-01-03 -0.02
1998-01-04 -0.01
1998-01-05 0.02
...
1998-02-01 0.1
1998-02-02 -0.2
1998-02-03 -0.1
etc.
I would like to add to this dataframe a new column showing me the density value of the corresponding return. I tried:
foo$density <- for(i in 1:length(foo$Return)) density(foo$Return,
from = foo$Return[i], to = foo$Return[i], n = 1)$y
But it didn't work. I really have difficulty applying a "function" to each row. But maybe there is also another way to do it, not using density()?
What I essentially would like to do is to extract the fitted density values from density() to the returns in foo. If I just do plot(density(foo$Return)) it gives me the curve, however I would like to have the density values attached to the returns.
@Joris:
foo$density <- density(foo$Return, n=nrow(foo$Return))$y
calculates something, however seems to return wrong density values.
Thank you for helping me out!
Dani
On second thought, forget about the density function; I suddenly realized what you want to do. Most density functions return values on a grid, so they don't give you the estimate at the exact data points. If you want that, you can, for example, use the sm package.
If the number of different values is not that big, you can use ave().
If the purpose is to plot a density function, there's no need to calculate it like you did. Just plot the density() result directly. Or, to add a histogram underneath, draw the histogram first and mind the option freq=F.
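The code that accompanied this answer did not survive, so what follows is a minimal sketch of the ideas rather than the original. It assumes dummy data in place of foo, reads the ave() hint as counting how often each value occurs, and only evaluates the kernel density at the exact data points if the optional sm package is installed. The column names count and density are illustrative.

```r
set.seed(1)
# Dummy stand-in for the asker's data frame (assumption: rounded returns)
foo <- data.frame(Return = round(rnorm(100, sd = 0.05), 2))

# density() evaluates on an equally spaced grid, not at the data points
d <- density(foo$Return)
length(d$x)  # 512 grid points by default

# Count how often each distinct value occurs (one reading of the ave() hint)
foo$count <- ave(foo$Return, foo$Return, FUN = length)

# Kernel density evaluated exactly at the observed returns, if sm is available
if (requireNamespace("sm", quietly = TRUE)) {
  foo$density <- sm::sm.density(foo$Return, eval.points = foo$Return,
                                display = "none")$estimate
}

# For plotting only: histogram on the density scale with the curve on top
hist(foo$Return, freq = FALSE, col = "grey")
lines(d, col = "red")
```

Because the grid returned by density() has 512 points by default, its y values cannot simply be assigned to a column with a different number of rows, which is probably why the earlier attempt appeared to give wrong values.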
An alternative to sm.density is to evaluate the density on a finer grid than the default, and use approx or approxfun to give the interpolated values of the density at the Returns you want. The example in this answer uses dummy data, with the density evaluated at 512*8 points.
At this point, we could use approx() to interpolate the x and y components of the returned density, but I prefer approxfun(), which does the same thing but returns a function that we can then use to do the interpolation. First, generate the interpolation function, here called BAR(). Now you can use BAR() to return the interpolated density at any point you wish, e.g. for the first of the Returns. To finish the example, add the density for each datum in Returns as a new column.
To see how well the interpolation is doing, we can plot the density and the interpolated version and compare. Note that we have to sort Returns because, to achieve the effect we want, lines needs to see the data in increasing order.
[Plot from the original post not preserved: the density curve in black with the interpolated values drawn over it.]
As long as the density is evaluated at a sufficiently fine set of points (512*8 in the above example) you shouldn't have any problems, and you will be hard pushed to tell the difference between the interpolated version and the real thing. If you have "gaps" in the values of your Returns, then you might find that, as lines() just joins the points you ask it to plot, straight line segments might not follow the black density at the locations of the gaps. This is just an artefact of the gaps and how lines() works, not a problem with the interpolation.
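The steps described in this answer can be sketched as follows. This is a reconstruction under stated assumptions, not the author's original code: the dummy data, the seed, and the names dens, first and Density are illustrative, while BAR and the 512*8 grid size come from the text.

```r
set.seed(1)
# Dummy data (assumption): one simulated return per day of 2010
foo <- data.frame(Date = seq(as.Date("2010-01-01"), as.Date("2010-12-31"),
                             by = "days"),
                  Returns = rnorm(365))

# Evaluate the kernel density on a finer grid than the default 512 points
dens <- with(foo, density(Returns, n = 512 * 8))

# Build an interpolation function from the x and y components of the density
BAR <- with(dens, approxfun(x = x, y = y))

# Interpolated density at a single point, here the first return
first <- BAR(foo$Returns[1])

# Attach the interpolated density to every row
foo$Density <- BAR(foo$Returns)

# Compare the density curve with the interpolated values;
# lines() needs the x values in increasing order, hence sort()
plot(dens)
with(foo, lines(sort(Returns), BAR(sort(Returns)), col = "red"))
```

By default density() extends its grid beyond the range of the data, so every observed return falls inside the interpolation range and BAR() returns no missing values here.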
If we ignore the density issue, which @Joris expertly answers, you don't seem to have grasped how to set up a loop. What your loop returns is the value NULL. That is the value being inserted into foo$density, and that won't work because NULL is an empty component, i.e. it doesn't exist as far as R is concerned. See ?'for' for further details.
If you want to store the result of each iteration of the loop, you must do the assignment inside the loop, and that means you should pre-allocate the storage space before you enter the loop. For example, if we wanted i + 1 for i in 1,...,10, we would create a numeric vector of length 10 and fill element i on each pass.
Of course, you would not do such a calculation via a loop, because R is vectorized and works with vectors of numbers, rather than you having to code each computation element by element as you might in C or other programming languages. Adding 1 to the vector 1:10 gives the same result in one step.
Notice that R turns 1 into a vector of 1s of sufficient length to allow the computation to proceed, something known as recycling in R-speak.
Sometimes you might need to iterate over an object with a loop or one of the s|l|t|apply() family, but most often you will find a function that works on an entire vector of data in one go. This is one of the advantages of R over other programming languages, but it does require you to get your head into vectorized mode.
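The points about loops can be illustrated with a short sketch. The variable names res, out, vec and app are illustrative assumptions, and the vapply() call is just one member of the apply family chosen for the example.

```r
# A for loop evaluates to NULL, so assigning the loop itself stores nothing
res <- for (i in 1:3) i + 1
is.null(res)

# Pre-allocate storage, then assign inside the loop
out <- numeric(length = 10)
for (i in 1:10) {
  out[i] <- i + 1
}

# Vectorized version: the scalar 1 is recycled along the vector 1:10
vec <- (1:10) + 1

# The same computation with one of the apply family
app <- vapply(1:10, function(i) i + 1, numeric(1))
```

All three approaches produce the same vector; the vectorized form is the idiomatic one for a calculation like this.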
Use this to obtain density values.