Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 10 years ago.
On Kaggle you can find some competitions and download the associated datasets.
There is a system that scores your solutions in real time and you'll see your place on the "live leaderboard".
It's a good way to study machine learning techniques: in a "for knowledge" competition you can compare your solution with those of other participants and discuss the strengths and weaknesses of various approaches.
Try my blog, Vellum Information, where I've got several annotated bibliographies curating data sets and data sources:
http://velluminformation.com/2014/03/05/big-data-public-databases-an-annotated-bibliography/
I've also got an annotated bibliography for health data here:
http://velluminformation.com/2012/05/19/free-online-public-data-sources-an-annotated-bibliography/
Obvious disclosure: this is my blog, so there are other technical things on there as well.
The UCI Machine Learning Archive and the past datasets of the KDD Cup are probably the best known such archives for general data mining. An example of a more specific kind of source is the UCR Time Series Classification/Clustering Page.
Here's an article from DataWrangling.com that lists hundreds of datasets.