Machine Learning - Classification Algorithms
I want to find the following probability:

P(y = 1 | n = k; theta)

Read as: the probability that the prediction is class 1, given number of words = k, parametrized by theta.

A traditional classifier doesn't have the conditional part (right?); it just gives

P(y = 1; theta)

How do I solve this?

EDIT:

For example, let's say I want to predict whether an email is spam or not based on the number of attachments. Let y = 1 indicate spam and y = 0 indicate non-spam. Then I want

P(y = 1 | num_attachments = 0; some attributes)

and so on. Does this make sense?
Comments (2)
Normally the number of attachments is just another attribute, so your probability is the same conditional probability as usual:

P(y = 1 | attributes; theta)

However, if you need some special treatment of the attachment attribute (say, the other attributes are numeric and the attachment attribute is boolean), you can compute them separately and then combine them with the Naive Bayes assumption:

P(C | A, B) = P(A | C) * P(B | C) * P(C) / P(A, B)

where C stands for the event y = 1, A for the attachment attribute, and B for the other attributes. See this paper for a description of several Naive Bayes classifiers.
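As a rough illustration of that combination (not part of the original answer), here is a minimal Python sketch that estimates P(A | C), P(B | C) and P(C) from a made-up toy training set and normalizes over the two classes:

# Minimal sketch of the combination described above. The training data is
# invented; it only illustrates estimating the factors from counts and
# combining them under the Naive Bayes assumption.
from collections import Counter

# (has_attachment, length_bucket, label) -- label 1 = spam, 0 = non-spam
emails = [
    (True,  "short", 1),
    (True,  "short", 1),
    (True,  "long",  1),
    (False, "short", 0),
    (False, "long",  0),
    (False, "long",  0),
]

label_counts = Counter(y for _, _, y in emails)
total = sum(label_counts.values())

def likelihood(index, value, c, num_values):
    """P(attribute = value | C = c) with add-one smoothing."""
    matching = sum(1 for row in emails if row[2] == c and row[index] == value)
    return (matching + 1) / (label_counts[c] + num_values)

def posterior(has_attachment, length_bucket):
    """P(C | A, B) for both classes, normalized."""
    scores = {}
    for c in (0, 1):
        scores[c] = (likelihood(0, has_attachment, c, 2) *
                     likelihood(1, length_bucket, c, 2) *
                     label_counts[c] / total)
    norm = sum(scores.values())
    return {c: s / norm for c, s in scores.items()}

print(posterior(True, "short"))   # the value under key 1 is P(y = 1 | A, B)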
Use a Naive Bayes classifier. You can code one yourself quite quickly, or use/look at the nltk library.
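For instance, a quick sketch of the nltk route (the feature names and the tiny training set here are made up for illustration):

import nltk

# Each training example is a (feature dict, label) pair; nltk treats
# feature values as discrete, so numeric attributes may need bucketing.
train_set = [
    ({"num_attachments": 2, "length": "short"}, "spam"),
    ({"num_attachments": 1, "length": "short"}, "spam"),
    ({"num_attachments": 0, "length": "long"},  "ham"),
    ({"num_attachments": 0, "length": "long"},  "ham"),
]

classifier = nltk.NaiveBayesClassifier.train(train_set)

email = {"num_attachments": 0, "length": "short"}
print(classifier.classify(email))                    # predicted label
print(classifier.prob_classify(email).prob("spam"))  # P(y = spam | attributes)

prob_classify gives you exactly the conditional probability you asked about, estimated under the Naive Bayes independence assumption.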