Resources on data mining and gaming on social networks
I'm interested in the problem of pattern mining among players of social networking games, for example detecting cheaters in a game given a company's user database. So far I have been following the usual recipe for a data mining project:
- construct a data warehouse that aggregates significant information
- select a classifier, and train it with a subset of records from the warehouse
- validate classifier with another test set
- lather, rinse, repeat
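As a minimal sketch of the recipe above (not my actual pipeline), the train/validate loop might look like the following. Everything here is an assumption for illustration: the records are synthetic, the single feature `rate` (prizes transferred per hour) is hypothetical, and the "classifier" is just a one-feature threshold standing in for whatever model is really chosen.

```python
import random


def make_players(n, seed=0):
    """Generate synthetic (rate, is_cheater) rows; purely illustrative data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        is_cheater = rng.random() < 0.2
        # Hypothetical feature: prizes transferred per hour of play.
        rate = rng.gauss(8.0, 1.0) if is_cheater else rng.gauss(2.0, 1.0)
        rows.append((rate, is_cheater))
    return rows


def accuracy(rows, cut):
    """Fraction of rows where 'rate >= cut' matches the cheater label."""
    correct = sum((rate >= cut) == label for rate, label in rows)
    return correct / len(rows)


def train_threshold(train_rows):
    """Pick the cutoff that maximizes accuracy on the training rows."""
    best_cut, best_acc = None, -1.0
    for cut in sorted(rate for rate, _ in train_rows):
        acc = accuracy(train_rows, cut)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut


# Stand-in for records pulled from the warehouse.
rows = make_players(1000)
random.Random(1).shuffle(rows)

# Train on one subset, validate on a disjoint held-out subset.
split = int(0.7 * len(rows))
train_rows, test_rows = rows[:split], rows[split:]

cut = train_threshold(train_rows)
test_acc = accuracy(test_rows, cut)
print(f"cutoff={cut:.2f}, held-out accuracy={test_acc:.3f}")
```

The point of the sketch is only the separation between training and held-out data; the synthetic classes are far easier to separate than real player behavior would be.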
Surprisingly, I've found very little literature, best practices, etc. in this area. I am hoping to crowdsource the information gathering problem here. Specifically, I'm looking for:
- What classifiers have worked well for this type of pattern mining? (It seems highly temporal: users playing games, users receiving rewards, users transferring prizes, etc.)
- Are there any widely agreed-upon attributes specific to social networking / gaming data?
- What is a practical amount of information that should be considered? One problem I've run into is data overload, where queries and data cleansing may take days to complete.
- Related to the point above, what hardware resources are required to produce results? I've found it difficult to estimate the amount of computing power I will require for production use. It has become apparent that a white box in the corner does not have enough horsepower for such a project. Are companies generally resorting to cloud solutions? Are they buying clusters?
Basically, any resources (theoretical, academic, or practical) about implementing a social networking / gaming pattern-mining program would be very much appreciated.
Thanks.
Comments (1)
I am looking for the same kind of resources. Here are some things I found that I consider pretty interesting; I hope you can make use of them. If you discover more resources, please let me know.
Here they are:
http://techcrunch.com/2010/04/06/turiya-media-games/
http://www.kdnuggets.com/2010/08/video-tutorial-christian-thurau-data-mining-in-games.html?k10n21
http://www.gamasutra.com/view/feature/2816/better_game_design_through_data_.php
This one is in Portuguese but is excellent: http://thiagofalcao.info/