How do I get data to train my ML model?
I am building a machine learning model that suggests attractions in a specific location.
I have most of the details worked out. However, I still need to collect data on attractions to train the model.
Is there a dataset for this somewhere (I already checked Kaggle)? If not, which websites should I scrape?
Comments (1)
If you want to scrape data, Twitter is probably the easiest place to start. You can use the Twitter API to get tweets that contain a specific keyword or hashtag: use your desired location as the keyword and fetch the results with Tweepy. I would suggest scraping from specific accounts, such as influencers or travel blogs, to get data about attractions.
Applying for Twitter API access might take several days, and you can only retrieve tweets from the past week. For anything older than that, you need to sign up for their premium subscription.
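As a rough illustration of the approach above, here is a minimal sketch that searches recent tweets for a location keyword, optionally restricted to a few accounts. It assumes Tweepy v4 with a Twitter API v2 bearer token. The helper names (`build_query`, `fetch_recent_tweets`, `main`), the environment variable `TWITTER_BEARER_TOKEN`, and the account handles are made up for the example, so adjust them to your setup.

```python
def build_query(location, accounts=()):
    """Build a search query for one location keyword.

    `accounts` is an optional list of handles to restrict the search to
    (the handles used in main() below are hypothetical placeholders).
    """
    query = f'"{location}"'
    if accounts:
        sources = " OR ".join(f"from:{name}" for name in accounts)
        query += f" ({sources})"
    # Exclude retweets so the same text is not collected many times.
    return query + " -is:retweet"


def fetch_recent_tweets(client, query, max_results=50):
    """Return the text of recent tweets matching `query`.

    `client` is assumed to be a tweepy.Client. Recent search only covers
    roughly the past week, and max_results must be between 10 and 100.
    """
    response = client.search_recent_tweets(query=query, max_results=max_results)
    # response.data is None when nothing matched.
    return [tweet.text for tweet in (response.data or [])]


def main():
    # Imported here so the helpers above can be used without Tweepy installed.
    import os
    import tweepy

    # Assumption: the bearer token is stored in this environment variable.
    client = tweepy.Client(bearer_token=os.environ["TWITTER_BEARER_TOKEN"])
    query = build_query("Kyoto", accounts=["some_travel_blog", "some_influencer"])
    for text in fetch_recent_tweets(client, query):
        print(text)
```

Calling `main()` needs valid credentials and network access. The returned texts are raw tweets, so you would still need to extract attraction names from them before using them as training data.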