Website data retrieval
A recent article has prompted me to pick up a project I have been working on for a while. I want to create a web service front end for a number of sites to allow automated completion of forms and retrieval of data from the results and from other areas of each site. I have achieved a degree of success using Selenium and custom code; however, I am looking to extend this to a stage where adding additional sites is a trivial task (maybe even one that doesn't require a developer).
The Kapow web data server looks to achieve a lot of this; however, I am told it is quite expensive (I am currently awaiting a quote). Has anyone had experience with it, or can anyone suggest alternatives (ideally open source)?
Disclaimer: I realise there are potential legal issues around automating data retrieval from third-party websites. This tool is designed to be used in a price comparison system, and all of the websites integrated with it will be included with the express permission of their owners. Where a site provides an API, that will clearly be the favoured approach.
Thanks
Comments (1)
I realise it's been a while since I posted this, but should anyone come across it: I have had a lot of success using the WSO2 framework (particularly the Mashup Server) for this. For data mining tasks I have also used a Java library that it wraps, webharvest, which has achieved everything I needed.
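For anyone wondering what "adding a site without a developer" can look like in principle, here is a minimal Python sketch of a configuration-driven extractor. It is not webharvest's or WSO2's actual configuration format or API. The site name, the paths, and the sample markup are all invented for illustration, and it assumes the page is well-formed XHTML so that the standard library's XML parser can read it (real-world HTML usually needs a more forgiving parser).

```python
import xml.etree.ElementTree as ET

# Hypothetical per-site definitions: supporting a new site means adding an
# entry here rather than writing new code. Names and paths are invented.
SITE_CONFIGS = {
    "example-shop": {
        "row": ".//tr[@class='offer']",
        "fields": {
            "product": "td[@class='name']",
            "price": "td[@class='price']",
        },
    },
}


def extract(site, markup):
    """Return one dict per result row, using the paths configured for `site`."""
    config = SITE_CONFIGS[site]
    root = ET.fromstring(markup)  # assumes well-formed XHTML
    results = []
    for row in root.findall(config["row"]):
        record = {}
        for field, path in config["fields"].items():
            node = row.find(path)
            has_text = node is not None and node.text
            record[field] = node.text.strip() if has_text else None
        results.append(record)
    return results


# Invented sample standing in for a results page returned by a site.
SAMPLE = """
<html><body><table>
  <tr class="offer"><td class="name">Widget</td><td class="price">9.99</td></tr>
  <tr class="offer"><td class="name">Gadget</td><td class="price">24.50</td></tr>
</table></body></html>
"""

if __name__ == "__main__":
    for record in extract("example-shop", SAMPLE):
        print(record)
```

The point of the design is that the extraction rules live in data, so they could equally be loaded from a file that a non-developer maintains.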