What is the difference between association rule mining and frequent itemset mining?

Posted on 2024-09-05 16:24:02 · 115 words · 3 views · 0 comments

I am new to data mining and am confused about association rules and frequent itemset mining. To me they seem to be the same thing, but I would like to hear the views of experts on this forum.

My question is:

What is the difference between association rule mining and frequent itemset mining?
Thanks

Comments (6)

淡墨 2024-09-12 16:24:02

An association rule is something like "A,B → C", meaning that C tends to occur when A and B occur. An itemset is just a collection such as "A,B,C", and it is frequent if its items tend to co-occur. The usual way to look for association rules is to find all frequent itemsets and then postprocess them into rules.

氛圍 2024-09-12 16:24:02

The input of frequent itemset mining is:

  • a transaction database
  • a minimum support threshold minsup

The output is:

  • the set of all itemsets that appear in at least minsup transactions. An itemset is simply an unordered set of items.
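The input/output contract above can be sketched as a brute-force enumeration. This is only an illustration, not an efficient algorithm such as Apriori or FPGrowth; the function name `frequent_itemsets` and the toy transaction database are hypothetical, and minsup is taken as an absolute transaction count, as in the definition above.

```python
from itertools import combinations

def frequent_itemsets(transactions, minsup):
    """Return {itemset: count} for every itemset contained in at least `minsup` transactions."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            # Count the transactions that contain every item of the candidate.
            count = sum(1 for t in transactions if candidate <= t)
            if count >= minsup:
                result[candidate] = count
                found_any = True
        if not found_any:
            # If no itemset of this size is frequent, no larger one can be.
            break
    return result

# Hypothetical toy transaction database.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]
frequent = frequent_itemsets(transactions, minsup=3)
for itemset, count in frequent.items():
    print(sorted(itemset), count)
```

The early exit relies on the fact that every subset of a frequent itemset is itself frequent, which is the same property that real algorithms exploit far more aggressively.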

The input of association rule mining is:

  • a transaction database
  • a minimum support threshold minsup
  • a minimum confidence threshold minconf

The output is:

  • the set of all valid association rules. An association rule X-->Y is a relationship between two itemsets X and Y such that X and Y are disjoint and non-empty. A valid rule is a rule whose support is greater than or equal to minsup and whose confidence is greater than or equal to minconf. The support is defined as sup(X-->Y) = sup(X U Y) / (number of transactions), where sup(X U Y) on the right-hand side is the number of transactions containing every item of X and Y. The confidence is defined as conf(X-->Y) = sup(X U Y) / sup(X).
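The two formulas above can be checked on a small example. The following is a minimal sketch; the helper names `count`, `support` and `confidence` and the toy database are hypothetical and chosen only for illustration.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def support(X, Y, transactions):
    """sup(X --> Y) = count(X U Y) / number of transactions."""
    return count(X | Y, transactions) / len(transactions)

def confidence(X, Y, transactions):
    """conf(X --> Y) = count(X U Y) / count(X)."""
    return count(X | Y, transactions) / count(X, transactions)

# Hypothetical toy transaction database.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]

X, Y = frozenset({"bread"}), frozenset({"butter"})
rule_support = support(X, Y, transactions)        # 3 of 4 transactions contain both
rule_confidence = confidence(X, Y, transactions)  # 3 of the 4 bread transactions contain butter
print(rule_support, rule_confidence)
```

With, say, minsup = 0.5 and minconf = 0.7, this rule would be valid, since both values are 0.75.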

The relationship between frequent itemset mining and association rule mining is that rules can be generated very efficiently from the frequent itemsets (see the paper by Agrawal 1993 for more details about this idea). So association rule mining is broken down into two steps:
- mining the frequent itemsets
- generating all valid association rules from the frequent itemsets.

比忠 2024-09-12 16:24:02

Frequent itemset mining is the first step of association rule mining.
Once you have generated all the frequent itemsets, you iterate over them one by one, enumerate all the possible association rules for each, and calculate their confidence. Finally, if a rule's confidence is > minConfidence, you output that rule.
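The second step described above might look like the following sketch. It assumes the frequent itemsets are already available as a dictionary mapping each itemset to its transaction count; the function name `generate_rules` and the hard-coded counts are hypothetical, and the strict `>` comparison follows the wording of this answer.

```python
from itertools import combinations

def generate_rules(frequent, min_confidence):
    """Enumerate rules antecedent --> consequent from each frequent itemset.

    `frequent` maps frozenset -> transaction count and must contain every
    subset of every itemset in it (true for the output of a frequent
    itemset miner, since all subsets of a frequent itemset are frequent).
    """
    rules = []
    for itemset, itemset_count in frequent.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                consequent = itemset - antecedent
                conf = itemset_count / frequent[antecedent]
                if conf > min_confidence:
                    rules.append((antecedent, consequent, conf))
    return rules

# Hypothetical frequent itemsets with their transaction counts.
frequent = {
    frozenset({"milk"}): 3,
    frozenset({"bread"}): 4,
    frozenset({"butter"}): 3,
    frozenset({"milk", "bread"}): 3,
    frozenset({"bread", "butter"}): 3,
}
rules = generate_rules(frequent, min_confidence=0.8)
for antecedent, consequent, conf in rules:
    print(sorted(antecedent), "-->", sorted(consequent), conf)
```

Note that no second pass over the transaction database is needed here: every confidence is computed from counts that the first step already produced.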

原谅我要高飞 2024-09-12 16:24:02

Frequent itemset mining is a step of association rule mining. After applying a frequent itemset mining algorithm such as Apriori or FPGrowth to the data, you get the frequent itemsets. From these discovered frequent itemsets, you then generate the association rules (usually done by subset generation).

陌上青苔 2024-09-12 16:24:02

Association rule mining gives us the frequent itemsets present in a given dataset. Different algorithms exist for mining the frequent itemsets, and they differ in the data format they work on: horizontal or vertical. The Apriori algorithm uses the horizontal format to mine frequent itemsets, while the Eclat algorithm uses the vertical format.
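To make the two formats concrete, here is a small sketch, assuming a hypothetical toy database keyed by transaction id. The horizontal format lists the items of each transaction; the vertical format lists, for each item, the ids of the transactions that contain it. This shows only the data layout and the intersection step that vertical-format algorithms build on, not a full Eclat implementation.

```python
from collections import defaultdict

# Horizontal format: transaction id -> items in that transaction.
horizontal = {
    1: {"milk", "bread", "butter"},
    2: {"milk", "bread"},
    3: {"bread", "butter"},
    4: {"milk", "bread", "butter"},
}

# Vertical format: item -> set of ids of the transactions containing it.
vertical = defaultdict(set)
for tid, items in horizontal.items():
    for item in items:
        vertical[item].add(tid)

# In the vertical format, the support count of an itemset is the size of
# the intersection of its items' transaction-id sets.
tids_milk_butter = vertical["milk"] & vertical["butter"]
support_count = len(tids_milk_butter)
print(sorted(tids_milk_butter), support_count)
```

In the horizontal format the same count requires scanning every transaction and testing whether it contains both items.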

对风讲故事 2024-09-12 16:24:02

Association rule mining:

Association rule mining is used to find patterns in data. It finds features that occur together and are correlated.

  • Example:

For example, people who buy diapers are likely to buy baby powder. We can rephrase this as: if (people buy diapers), then (they buy baby powder). Note the if-then form of the rule. It does not necessarily mean that people who buy baby powder also buy diapers. In general, if A tends to imply B, it does not necessarily follow that B tends to imply A.
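The asymmetry can be seen numerically by computing the confidence of a rule in both directions. The purchase data below is invented purely for illustration, and `confidence` is a hypothetical helper, not part of any library.

```python
def confidence(antecedent, consequent, transactions):
    """Fraction of the transactions containing `antecedent` that also contain `consequent`."""
    both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    base = sum(1 for t in transactions if antecedent <= t)
    return both / base

# Invented purchases: everyone who buys diapers also buys baby powder,
# but most baby powder buyers do not buy diapers.
transactions = [
    {"diapers", "baby powder"},
    {"diapers", "baby powder"},
    {"baby powder"},
    {"baby powder", "soap"},
    {"baby powder"},
]

diapers = frozenset({"diapers"})
powder = frozenset({"baby powder"})

conf_diapers_to_powder = confidence(diapers, powder, transactions)  # 2 / 2
conf_powder_to_diapers = confidence(powder, diapers, transactions)  # 2 / 5
print(conf_diapers_to_powder, conf_powder_to_diapers)
```

Both rules come from the same itemset {diapers, baby powder} and therefore have the same support; only the confidence depends on the direction.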

Frequent itemset mining:

Frequent itemset mining is used to find the common itemsets in data. Association rules can then be generated from these itemsets for the given transactional dataset.

  • Example:

If two items X and Y are frequently purchased together, it is a good idea to place them together in the store, or to offer a discount on one item when the other is purchased. This can really increase sales. For example, we may find that a customer who buys milk and bread also tends to buy butter.
So the association rule is ['milk']^['bread']=>['butter']. The seller can therefore suggest butter to a customer who buys milk and bread.
