Sequence mining for time and product prediction
I am facing a tricky sequence mining problem. Say I have 10 products and millions of records, each containing a user, a product, and a purchase timestamp. Each user may have as few as 1 record or as many as 100, for example:
user 1, p1, t1
user 1, p1, t2
user 1, p2, t3
user 1, p3, t4
user 1, p1, t5
user 2, p2, t6.....
Now I need to predict the best time to promote a product to a user.
So far, my solution is to cluster the timestamps into a few categories and then apply Apriori to the data, e.g. the records would look like
user 1, p1T1
user 1, p2T2
user 1, p3T2
user 1, p2T1...
Then I will get rules like p1T1->p2T2, etc. Because T3>T2>T1..., any rule that does not fit this time ordering will be discarded.
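To make the transformation concrete, here is a minimal sketch of the bucketing and rule-filtering steps. It assumes numeric timestamps and hand-picked bucket boundaries; the function names (`time_bucket`, `to_items`, `is_time_ordered`) and the boundaries are illustrative, not part of any library, and the Apriori step itself is left out.

```python
from bisect import bisect_right

def time_bucket(ts, boundaries):
    """Return a 1-based bucket index for ts, given sorted bucket boundaries (assumed)."""
    return bisect_right(boundaries, ts) + 1

def to_items(records, boundaries):
    """Turn (user, product, timestamp) rows into per-user sets of 'pXTk' items."""
    baskets = {}
    for user, product, ts in records:
        item = f"{product}T{time_bucket(ts, boundaries)}"
        baskets.setdefault(user, set()).add(item)
    return baskets

def bucket_of(item):
    """Extract the bucket number k from an item such as 'p1T2'."""
    return int(item.rsplit("T", 1)[1])

def is_time_ordered(antecedent, consequent):
    """Keep a rule only if every antecedent bucket precedes every consequent bucket."""
    return max(map(bucket_of, antecedent)) < min(map(bucket_of, consequent))
```

With these helpers, a rule such as p1T1->p2T2 passes the filter, while p2T2->p1T1 is discarded.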
However, I am not very satisfied with this solution. Any suggestions?
Comments (2)
Instead of applying Apriori, you could apply a sequential pattern mining algorithm (e.g. PrefixSpan, SPAM, GSP) or a sequential rule mining algorithm.
You can check my website for open-source Java source code for these algorithms and some examples:
http://www.philippe-fournier-viger.com/spmf/
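To give a rough idea of what such an algorithm does, here is a minimal PrefixSpan-style sketch in Python. It is not the SPMF implementation: it assumes one product per purchase, sortable timestamps, and support counted as the number of users whose purchase history contains the pattern in order. The names `build_sequences` and `prefixspan` are illustrative.

```python
from collections import defaultdict

def build_sequences(records):
    """Group (user, product, timestamp) rows into per-user product lists ordered by time."""
    by_user = defaultdict(list)
    for user, product, ts in records:
        by_user[user].append((ts, product))
    return [[p for _, p in sorted(rows)] for rows in by_user.values()]

def prefixspan(sequences, min_support):
    """Return {pattern_tuple: support} for patterns found in at least min_support sequences."""
    results = {}

    def mine(prefix, projected):
        # projected holds one (sequence index, start position) pair per
        # sequence that still contains the current prefix.
        counts = defaultdict(set)
        for sid, start in projected:
            for item in set(sequences[sid][start:]):
                counts[item].add(sid)
        for item, sids in counts.items():
            if len(sids) < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = len(sids)
            # Project each sequence past the first occurrence of item.
            new_projected = []
            for sid, start in projected:
                seq = sequences[sid]
                if item in seq[start:]:
                    new_projected.append((sid, seq.index(item, start) + 1))
            mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results
```

A pattern such as ("p2", "p3") then means "p2 was bought, and p3 was bought some time later", so the ordering comes from the data and no rules have to be discarded afterwards. The time gaps between the matched purchases are not tracked in this sketch.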
Hope this helps,
Your problem is an application of recommender systems, and you can learn something from KDD Cup 2011. Although the items recommended there are music, the models can also meet your requirements.
Most of those models take time into account. If you are still not satisfied, you should learn some time series analysis and machine learning to make the predictions.