How to compute, for each element of a vector, the fraction of elements in another vector that are smaller?

Posted 2024-11-08 15:20:59

n<-100000   
aa<-rnorm(n)
bb<-rnorm(n)
system.time(lapply(aa, function(z){mean(bb<pnorm(z))}))

This small piece of code takes too long to run. Simply put, I have two vectors, aa and bb. For each element of aa, say aa[i], I want the proportion of elements of bb that are less than aa[i].

I found the post "Speed comparison of sapply with a composite function" and tried to use it to speed things up, but it did not help.

Any help will be appreciated!

夏日落 2024-11-15 15:20:59

You may be able to use the findInterval function:

n <- 25000
aa <- rnorm(n)
bb <- rnorm(n)
system.time(q1 <- lapply(aa, function(z){mean(bb<pnorm(z))}))
#   user  system elapsed
# 20.057   2.544  22.807
system.time(q2 <- findInterval(pnorm(aa), sort(bb))/n)
#   user  system elapsed
#  0.020   0.000   0.021
all.equal(as.vector(q1, "numeric"), q2)
# [1] TRUE

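Note that findInterval returns indices into the sorted vector (for each query value, the number of sorted elements that are less than or equal to it), so I've divided the result by n to get a proportion. If you can sort pnorm(aa) before giving it to findInterval, it will be even faster.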
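
To make the indexing concrete, here is a minimal sketch on a tiny hand-made vector. The values and the names bb_small and probe are assumptions made up for illustration, not part of the question. It checks that findInterval on the sorted vector agrees with the brute-force proportion when there are no ties.

```r
# Illustrative data (assumed for this sketch, not from the question)
bb_small <- c(0.9, 0.1, 0.5, 0.3)
probe    <- c(0.2, 0.6, 1.0, 0.0)

# Brute force: for each probe value, the share of bb_small strictly below it
brute <- sapply(probe, function(p) mean(bb_small < p))

# Vectorised: findInterval counts elements of the sorted vector <= each probe
fast <- findInterval(probe, sort(bb_small)) / length(bb_small)

print(brute)
print(fast)
stopifnot(isTRUE(all.equal(brute, fast)))
```

The two results coincide here only because no probe value equals an element of bb_small; with ties, the default findInterval counts elements less than or equal to the probe, whereas bb < p is strict.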

两个我 2024-11-15 15:20:59

I don't mean to be facetious, but these are the types of problems that R is designed to solve without having to do every single calculation, i.e., use statistics!

Assuming that the distributions are normal...

aa.new <- sample(aa, 1000)
bb.new <- sample(bb, 1000)

x <- lapply(aa.new, function(z){mean(bb.new<pnorm(z))})
x <- unlist(x)

mean(x)

You can be 99% certain that the proportion of bb < aa[i] falls within +/- 4% of mean(x).

For simple random sampling, the 99% margin of error = 1.29/sqrt(n), which is about 0.041 for n = 1000.
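
The 1.29 constant can be reproduced from the usual worst-case margin of error for a proportion, z * sqrt(p * (1 - p) / n) with p = 0.5. The sketch below assumes that formula and a two-sided 99% level; the variable names are illustrative, and it only reproduces the arithmetic rather than validating the sampling scheme itself.

```r
n_sample  <- 1000                     # sample size used in the answer above
z99       <- qnorm(0.995)             # two-sided 99% normal quantile, about 2.576
moe_const <- z99 * sqrt(0.5 * 0.5)    # worst case p = 0.5, about 1.29
moe       <- moe_const / sqrt(n_sample)  # about 0.041, i.e. roughly +/- 4%

print(c(constant = moe_const, margin = moe))
```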

挽手叙旧 2024-11-15 15:20:59

If you only want the proportion ' < aa[i]', then you should just determine the number of elements of bb that are less than each value of aa and then divide by the length:

bbs <- sort(bb)
zz <- findInterval(aa, bbs)
zz <- zz/length(bb)  # divide by the length of bb; same as length(aa) here, since both are n

This does what you say you want, whereas your code, I fear, does not: it compares bb with pnorm(aa[i]) rather than with aa[i].
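
A minimal sketch of that difference, assuming a smaller n than in the question so that the brute-force check stays quick (the seed and the names p_raw and p_pnorm are illustrative). It uses the left.open argument of findInterval, available in R 3.3.0 and later, so that the count is strictly "less than" even if ties occur.

```r
set.seed(1)                 # arbitrary seed, only for reproducibility
n   <- 1000                 # smaller than in the question (assumption)
aa  <- rnorm(n)
bb  <- rnorm(n)
bbs <- sort(bb)

# Proportion of bb strictly below each aa[i] (what the question describes)
p_raw <- findInterval(aa, bbs, left.open = TRUE) / length(bb)

# Proportion of bb strictly below pnorm(aa[i]) (what the question's code computes)
p_pnorm <- findInterval(pnorm(aa), bbs, left.open = TRUE) / length(bb)

# Each agrees with its own brute-force counterpart...
stopifnot(isTRUE(all.equal(p_raw,   sapply(aa, function(z) mean(bb < z)))))
stopifnot(isTRUE(all.equal(p_pnorm, sapply(aa, function(z) mean(bb < pnorm(z))))))

# ...but the two vectors answer different questions
print(isTRUE(all.equal(p_raw, p_pnorm)))
```

Which of the two is wanted depends on whether aa is meant to be compared on its own scale or after the pnorm transform; the question's text and code disagree on this point.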
