Probability estimation in R

Posted on 2024-10-09 03:20:45

I am trying to analyze some probability data with R. The data I have gives the frequencies of certain outcomes (A and B) for a given probability p, and what I want is a model that will allow me to estimate p from the frequency data alone.

Right now I am just running a linear regression (something like lm(p ~ A + B)) which works more or less but I know that this is not the "right way" to do it. In particular, my current model will, for some values of A or B, return values that do not lie within the interval [0, 1], i.e. that are not valid for a probability.

I am pretty sure there is a way to do this, but I can't for the life of me figure out what the model was called or how to run it in R. Can anybody give me a hint?

Comments (2)

手心的温暖 2024-10-16 03:20:45

You cannot just run lm(p ~ A + B), as there is no model relating your count variables A and B to the probabilities: lm() fits a linear regression, which models an unbounded real-valued response as a linear combination of real-valued predictors (count variables can stand in for those). Nothing in that model keeps the fitted values inside [0, 1].

The easiest model for probabilities is a logistic regression, which uses the logistic function to map unbounded real values to the bounded interval [0,1]. You can fit a logistic regression in R using glm(), as well as a number of add-on packages for special cases; see e.g. this rseek.org search for logistic regression.
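As a rough sketch of what that could look like, the snippet below fits a logistic-link model with glm() and compares its predictions to lm(). The data frame, its values and the coefficients are simulated purely for illustration; only the column names p, A and B come from the question. Because p here is assumed to be a continuous proportion rather than a 0/1 outcome, the sketch uses family = quasibinomial, which is one reasonable choice and not necessarily the right one for the asker's data.

```r
# Simulated stand-in for the asker's data (all values are made up)
set.seed(1)
n <- 200
dat <- data.frame(A = rpois(n, 5), B = rpois(n, 3))
dat$p <- plogis(-1 + 0.3 * dat$A - 0.2 * dat$B + rnorm(n, sd = 0.3))

# Linear regression: predictions are not constrained to [0, 1]
lm_fit <- lm(p ~ A + B, data = dat)

# Logistic-link GLM: the linear predictor is passed through the
# logistic function, so predicted probabilities stay in [0, 1]
glm_fit <- glm(p ~ A + B, family = quasibinomial(link = "logit"), data = dat)

# Predict far outside the observed range of A and B
new <- data.frame(A = c(0, 50), B = c(50, 0))
lm_pred  <- predict(lm_fit, newdata = new)
glm_pred <- predict(glm_fit, newdata = new, type = "response")

print(lm_pred)
print(glm_pred)
```

If the raw counts of successes and failures are available, a plain family = binomial fit with a two-column response, cbind(successes, failures), would be the more standard setup.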

Also, CrossValidated is a good site for modeling questions such as this.

街角卖回忆 2024-10-16 03:20:45

Poisson regression, implemented in R with the glm function with family="poisson" (with a default log link), would estimate a log-linear model which can be used very directly to estimate probabilities. Depending on how you set up the input of the dataset, you can get either proportions or rates by exp(linear.predictor). It would be somewhat similar to your current use of lm() if set up as lm(log(p) ~ A + B), but the error distribution is more appropriate for counts. The piece that Zeileis et al. did for package pscl is particularly good at presenting the method in the context of other methods for analyzing count data (see section 3.2).
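A minimal sketch of that setup follows, assuming each row records a count of events out of a known number of trials. The columns events and trials, and all simulated values, are hypothetical and not part of the question. The offset(log(trials)) term is what turns the fitted counts into per-trial rates.

```r
# Simulated stand-in data (all values are made up)
set.seed(2)
n <- 200
dat <- data.frame(A = rpois(n, 5), B = rpois(n, 3), trials = 100)
true_rate <- exp(-3 + 0.1 * dat$A - 0.05 * dat$B)
dat$events <- rpois(n, dat$trials * true_rate)

# Poisson GLM with the default log link; the offset puts the
# model on the scale of events per trial
fit <- glm(events ~ A + B + offset(log(trials)),
           family = poisson, data = dat)

# With trials = 1 the offset is log(1) = 0, so exp(linear predictor)
# is the estimated per-trial rate for these values of A and B
new <- data.frame(A = 4, B = 2, trials = 1)
rate_hat <- exp(predict(fit, newdata = new, type = "link"))
print(rate_hat)
```

Note that the log link only guarantees a positive estimate; unlike the logistic link, it does not force the rate to stay below 1, so this reads most naturally as a probability when events are rare relative to trials.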
