Using the apriori algorithm for recommendations
So a recent question made me aware of the rather cool apriori algorithm. I can see why it works, but what I'm not sure about is practical uses. Presumably the main reason to compute related sets of items is to be able to provide recommendations for someone based on their own purchases (or owned items, etcetera). But how do you go from a set of related sets of items to individual recommendations?
The Wikipedia article finishes:
The second problem is to generate association rules from those large itemsets with the constraint of minimal confidence. Suppose one of the large itemsets is Lk, Lk = {I1, I2, …, Ik}; association rules with this itemset are generated in the following way: the first rule is {I1, I2, …, Ik-1} ⇒ {Ik}. By checking the confidence, this rule can be determined as interesting or not. Then other rules are generated by deleting the last items in the antecedent and inserting them into the consequent, and the confidences of the new rules are checked to determine their interestingness. These processes are iterated until the antecedent becomes empty.
I'm not sure how the set of association rules helps in determining the best set of recommendations either, though. Perhaps I'm missing the point, and apriori is not intended for this use? In which case, what is it intended for?
So the apriori algorithm is no longer the state of the art for Market Basket Analysis (aka Association Rule Mining). The techniques have improved, though the Apriori principle (that the support of a subset upper bounds the support of the set) is still a driving force.
In any case, the way association rules are used to generate recommendations is that, given some history itemset, we can check each rule's antecedent to see if it is contained in the history. If so, then we can recommend the rule's consequent (eliminating cases where the consequent is already contained in the history, of course).
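That matching step can be sketched as follows. This is a minimal illustration, not a library API; the `recommend` helper and the (antecedent, consequent, confidence) rule shape are assumptions carried over from the rule-generation step:

```python
def recommend(history, rules):
    """Recommend items whose rule antecedent is fully contained in the
    user's history, skipping items the user already has. `rules` is a
    list of (antecedent, consequent, confidence) triples. Returns a dict
    mapping each candidate item to the best confidence supporting it."""
    history = set(history)
    recs = {}
    for antecedent, consequent, conf in rules:
        if set(antecedent) <= history:  # antecedent satisfied by history
            for item in set(consequent) - history:  # drop items already owned
                recs[item] = max(recs.get(item, 0.0), conf)
    return recs
```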
We can use various metrics to rank our recommendations, since with a multitude of rules we may get many hits when comparing them against a history, and we can only make a limited number of recommendations. Some useful metrics are the support of a rule (which is the same as the support of the union of the antecedent and the consequent), the confidence of a rule (the support of the rule over the support of the antecedent), and the lift of a rule (the support of the rule over the product of the supports of the antecedent and the consequent), among others.
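Those three metrics follow directly from the definitions above; here is a small sketch that computes them from a list of transactions (the `rule_metrics` name and the toy data are illustrative, not from any particular library):

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift for antecedent => consequent,
    computed from `transactions`, a list of sets of items."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    sup_a = sum(a <= t for t in transactions) / n
    sup_c = sum(c <= t for t in transactions) / n
    sup_rule = sum(a | c <= t for t in transactions) / n  # support of the union
    confidence = sup_rule / sup_a
    lift = sup_rule / (sup_a * sup_c)  # > 1 suggests a, c occur together more than chance
    return sup_rule, confidence, lift
```

Lift is often the more useful ranking signal, since confidence alone favors consequents that are simply popular everywhere.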
If you want some details about how Apriori can be used for classification, you could read the paper about the CBA algorithm:
Bing Liu, Wynne Hsu, Yiming Ma, "Integrating Classification and Association Rule Mining." Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98, Plenary Presentation), New York, USA, 1998