Scraping web page content
I've just started looking into this. I want to scrape the statistics page of my Netgear router (http://192.168.0.1/setup.cgi?next_file=stattbl.htm) into a CSV file.
I run Windows and Linux, but mainly know C++. Any links or solutions?
Comments (2)
As MYYN suggested, something like BeautifulSoup (Python) or Hpricot (Ruby) really excels at this sort of thing. If you aren't absolutely convinced that it has to be in C++, you really should look into those: the basics of both Python and Ruby can be picked up pretty quickly and are certainly much simpler than C++. Alternatively, if you want to stay in C++, check out Qt's QDomDocument and TinyXML++.
I know C++ and have written it, but for screen scraping I'd rather use a scripting language like Python with some handy libraries, e.g. http://www.crummy.com/software/BeautifulSoup/
Especially on Linux, Python should be installed already (or at least be easily installable via the package manager).
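
To make the scripting-language suggestion concrete, here is a minimal sketch that uses only the Python standard library, so it runs on a stock Python install without extra packages. It swaps BeautifulSoup for the standard `html.parser` module; BeautifulSoup would likely make the parsing part shorter and more forgiving. The names `TableParser`, `table_rows`, `rows_to_csv`, `fetch` and the contents of `SAMPLE_HTML` are all hypothetical: the real markup of the router's stats page is not known here, and the sketch assumes the stats sit in a plain, non-nested HTML table and that the page can be fetched without a login (a real router will probably ask for credentials, which this does not handle).

```python
import csv
import io
import urllib.request
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the open cell, None outside one

    def _close_cell(self):
        # Store the open cell (if any) with its whitespace collapsed.
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _close_row(self):
        # Store the open row (if any), skipping rows that had no cells.
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th"):
            self._close_cell()  # tolerate a missing </td> or </th>
            if self._row is not None:
                self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()


def table_rows(html):
    """Return all table rows found in an HTML string as lists of strings."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_csv(rows):
    """Serialize rows to CSV text, one row per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def fetch(url):
    """Download a page as text. Assumes no authentication is required."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


# Hypothetical stand-in for the router page; the real columns will differ.
SAMPLE_HTML = """
<table>
  <tr><th>Port</th><th>Status</th><th>TxPkts</th><th>RxPkts</th></tr>
  <tr><td>WAN</td><td>100M/Full</td><td>1234</td><td>5678</td></tr>
  <tr><td>LAN</td><td>Link down</td><td>0</td><td>0</td></tr>
</table>
"""

if __name__ == "__main__":
    # Against the real router this would be something like:
    #   html = fetch("http://192.168.0.1/setup.cgi?next_file=stattbl.htm")
    print(rows_to_csv(table_rows(SAMPLE_HTML)), end="")
```

Redirecting the script's output to a file (for example `python scrape.py > stats.csv`, where `scrape.py` is whatever name the script is saved under) gives the CSV. If the page contains several tables or nested ones, the rows from all of them end up in one list, so some filtering would be needed.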