Fast Perl t-test function

Posted 2024-12-28 16:28:18

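I am using Perl and R to analyze a large dataset of samples. For every pair of samples, I compute a t-test p-value. Currently I use the Statistics::R module to export the values from Perl to R and then call the t.test function. However, this process is extremely slow. Does anyone know of a Perl function that performs the same procedure more efficiently?

Thanks!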

木緿 2025-01-04 16:28:18

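The size of the data, the number of pairs of data sets, and ideally the code you wrote would help us work out why it is slow. For instance, sending many small data sets to R one at a time is slow, but this can be sped up by sending all the data at once.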

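For a pure-Perl solution, you first need to compute the test statistic (this is easy, and already done in Statistics::TTest, for instance) and then convert it to a p-value (you need something like R's pt function, the CDF of the t distribution, but I am not sure it is readily available in Perl; you could send the t values to R in one block at the end and convert them to p-values there).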
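The two steps described above (compute the statistic, then convert it to a p-value) can be sketched in plain Perl with no CPAN dependencies. This is only an illustration under stated assumptions: it uses Welch's unequal-variance statistic, and it obtains the two-sided p-value from the regularized incomplete beta function, evaluated with a standard continued-fraction expansion and a Lanczos approximation of log-gamma. All function names here (`log_gamma`, `beta_cf`, `reg_inc_beta`, `two_sided_p`, `welch_t_test`) are hypothetical helpers, not part of any module mentioned in this thread.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Natural log of the gamma function (Lanczos approximation, x > 0).
sub log_gamma {
    my ($x) = @_;
    my @c = (76.18009172947146, -86.50532032941677, 24.01409824083091,
             -1.231739572450155, 0.1208650973866179e-2, -0.5395239384953e-5);
    my $y   = $x;
    my $tmp = $x + 5.5;
    $tmp -= ($x + 0.5) * log($tmp);
    my $ser = 1.000000000190015;
    $ser += $_ / ++$y for @c;
    return -$tmp + log(2.5066282746310005 * $ser / $x);
}

# Continued fraction for the incomplete beta function (modified Lentz method).
sub beta_cf {
    my ($p, $q, $x) = @_;
    my ($eps, $tiny) = (1e-12, 1e-300);
    my $fix = sub { abs($_[0]) < $tiny ? $tiny : $_[0] };
    my ($qab, $qap, $qam) = ($p + $q, $p + 1, $p - 1);
    my $c = 1;
    my $d = 1 / $fix->(1 - $qab * $x / $qap);
    my $h = $d;
    for my $m (1 .. 500) {
        my $m2 = 2 * $m;
        # Even step of the recurrence.
        my $aa = $m * ($q - $m) * $x / (($qam + $m2) * ($p + $m2));
        $d = 1 / $fix->(1 + $aa * $d);
        $c = $fix->(1 + $aa / $c);
        $h *= $d * $c;
        # Odd step of the recurrence.
        $aa = -($p + $m) * ($qab + $m) * $x / (($p + $m2) * ($qap + $m2));
        $d = 1 / $fix->(1 + $aa * $d);
        $c = $fix->(1 + $aa / $c);
        my $del = $d * $c;
        $h *= $del;
        last if abs($del - 1) < $eps;
    }
    return $h;
}

# Regularized incomplete beta function I_x(p, q).
sub reg_inc_beta {
    my ($p, $q, $x) = @_;
    return 0 if $x <= 0;
    return 1 if $x >= 1;
    my $front = exp(log_gamma($p + $q) - log_gamma($p) - log_gamma($q)
                    + $p * log($x) + $q * log(1 - $x));
    return $x < ($p + 1) / ($p + $q + 2)
        ? $front * beta_cf($p, $q, $x) / $p
        : 1 - $front * beta_cf($q, $p, 1 - $x) / $q;
}

# Two-sided p-value for a t statistic with (possibly fractional) df.
sub two_sided_p {
    my ($t, $df) = @_;
    return reg_inc_beta($df / 2, 0.5, $df / ($df + $t * $t));
}

# Mean, sample variance (n - 1 denominator) and size of a sample.
sub mean_var {
    my ($x)  = @_;
    my $n    = @$x;
    my $mean = sum(@$x) / $n;
    my $var  = sum(map { ($_ - $mean) ** 2 } @$x) / ($n - 1);
    return ($mean, $var, $n);
}

# Welch's t-test: returns (t, df, two-sided p-value).
sub welch_t_test {
    my ($x, $y) = @_;
    my ($m1, $v1, $n1) = mean_var($x);
    my ($m2, $v2, $n2) = mean_var($y);
    my ($s1, $s2) = ($v1 / $n1, $v2 / $n2);
    my $t  = ($m1 - $m2) / sqrt($s1 + $s2);
    my $df = ($s1 + $s2) ** 2 / ($s1 ** 2 / ($n1 - 1) + $s2 ** 2 / ($n2 - 1));
    return ($t, $df, two_sided_p($t, $df));
}

my ($t, $df, $p) = welch_t_test([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]);
printf "t = %.4f, df = %.3f, p = %.4g\n", $t, $df, $p;
```

Because everything stays inside one Perl process, calling `welch_t_test` in a loop over all pairs avoids the per-pair round trip to R. The log-gamma approximation limits accuracy to roughly ten significant digits, so results are worth spot-checking against R's t.test before relying on very small p-values.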

一指流沙 2025-01-04 16:28:18

You can also try PDL, in particular PDL::Stats.

汐鸠 2025-01-04 16:28:18

The Statistics::TTest module gives you a p-value.

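use strict;
use warnings;
use feature 'say';
use Statistics::TTest;

# Two random samples of 32 values; the second is shifted down by 2.
my @r1 = map { rand(10)   } 1..32;
my @r2 = map { rand(10)-2 } 1..32;

my $ttest = Statistics::TTest->new;
$ttest->load_data(\@r1, \@r2);
say "p-value = prob > |T| = ", $ttest->{t_prob};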

Playing around a bit, I find that the p-values that this gives you are slightly lower than what you get from R. R is apparently doing something that reduces the degrees of freedom, but my knowledge of statistics is insufficient to explain what it's doing or why. (In the above example, the difference is about 1%. If you use samples of 320 floats instead of 32, then the difference is 50% or even more, but it's a difference between 1e-12 and 1.5e-12.) If you need precise p-values, you will want to take care.
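One likely explanation for the reduced degrees of freedom, offered as an assumption rather than something verified against the runs above: R's t.test defaults to var.equal = FALSE, which is Welch's test. Welch's test uses the Welch-Satterthwaite approximation for the degrees of freedom, and that value never exceeds the pooled value n1 + n2 - 2 used by the classic equal-variance test. Fewer degrees of freedom give a heavier-tailed t distribution and therefore a slightly larger p-value. The sketch below compares the two degrees-of-freedom formulas on samples like those in the example; the helper names (`sample_var`, `pooled_df`, `welch_df`) are hypothetical.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Sample variance with the usual n - 1 denominator.
sub sample_var {
    my ($x)  = @_;
    my $n    = @$x;
    my $mean = sum(@$x) / $n;
    return sum(map { ($_ - $mean) ** 2 } @$x) / ($n - 1);
}

# Degrees of freedom of the pooled (equal-variance) two-sample t-test.
sub pooled_df {
    my ($x, $y) = @_;
    return @$x + @$y - 2;
}

# Welch-Satterthwaite degrees of freedom (unequal variances).
sub welch_df {
    my ($x, $y) = @_;
    my ($n1, $n2) = (scalar @$x, scalar @$y);
    my ($s1, $s2) = (sample_var($x) / $n1, sample_var($y) / $n2);
    return ($s1 + $s2) ** 2 / ($s1 ** 2 / ($n1 - 1) + $s2 ** 2 / ($n2 - 1));
}

# Random samples shaped like the ones in the answer above.
my @r1 = map { rand(10)     } 1 .. 32;
my @r2 = map { rand(10) - 2 } 1 .. 32;

printf "pooled df = %d, Welch df = %.3f\n",
    pooled_df(\@r1, \@r2), welch_df(\@r1, \@r2);
```

If exact agreement with a pooled test is wanted on the R side, passing var.equal = TRUE to t.test selects the equal-variance form.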
