Writing a screen-scraping crawler
I want to write a crawler for screen scraping.
Specifically, I want to get the price of a particular hotel from a website, for example this
website
The URL above shows a list of hotels and their prices. I want to get the price of The Beaufort.
Please advise how to accomplish this.
Comments (2)
Use an HTML parsing library such as the Html Agility Pack to parse the HTML into a more usable model, then navigate that model to find the parts of the HTML you are interested in.
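As a rough illustration of the parse-then-navigate idea, here is a minimal sketch in Python using the standard library's html.parser instead of the Html Agility Pack (which is a .NET library). The markup, the class names "hotel", "name" and "price", and the helper names HotelPriceParser and get_price are all assumptions made up for this example; the real page will have its own structure.

```python
from html.parser import HTMLParser

# Hypothetical markup: the real page's tags and class names will differ.
SAMPLE_HTML = """
<ul>
  <li class="hotel"><span class="name">The Beaufort</span><span class="price">180 GBP</span></li>
  <li class="hotel"><span class="name">Another Hotel</span><span class="price">95 GBP</span></li>
</ul>
"""


class HotelPriceParser(HTMLParser):
    """Collects a {hotel name: price text} mapping from the assumed markup."""

    def __init__(self):
        super().__init__()
        self.prices = {}
        self._field = None  # "name" or "price" while inside such a span
        self._name = None   # most recent hotel name seen

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "span" and css_class in ("name", "price"):
            self._field = css_class

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._field == "name":
            self._name = text
        elif self._field == "price" and self._name is not None:
            self.prices[self._name] = text


def get_price(html, hotel_name):
    """Return the price text for hotel_name, or None if it is not listed."""
    parser = HotelPriceParser()
    parser.feed(html)
    parser.close()
    return parser.prices.get(hotel_name)


print(get_price(SAMPLE_HTML, "The Beaufort"))
```

A dedicated parsing library gives you a full document tree to navigate; this event-based sketch only tracks the two fields it cares about, which is enough to show the approach.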
Download the HTML with a tool like cURL, then use XPath to select the tags you are interested in. Firebug can help you determine the XPath.
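The download-then-XPath approach might look roughly like the sketch below, written in Python with only the standard library. Here urllib.request stands in for cURL, and xml.etree.ElementTree provides a limited XPath subset that only works on well-formed markup; for real-world HTML a third-party parser with full XPath support would likely be needed. The sample markup, the XPath expressions, and the names fetch_html and find_price are assumptions for illustration; the actual XPath would come from inspecting the real page.

```python
import urllib.request
import xml.etree.ElementTree as ET


def fetch_html(url):
    """Download a page body as text (the role cURL plays in the answer)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Hypothetical, well-formed stand-in for the downloaded page.
SAMPLE_XHTML = """<html><body>
<div class="hotel"><h2>The Beaufort</h2><span class="price">180 GBP</span></div>
<div class="hotel"><h2>Another Hotel</h2><span class="price">95 GBP</span></div>
</body></html>"""


def find_price(xhtml, hotel_name):
    """Return the price text of the hotel block whose heading matches."""
    root = ET.fromstring(xhtml)
    # Assumed XPath: every <div class="hotel"> anywhere in the document.
    for hotel in root.findall(".//div[@class='hotel']"):
        if hotel.findtext("h2", default="").strip() == hotel_name:
            return hotel.findtext("span[@class='price']")
    return None


print(find_price(SAMPLE_XHTML, "The Beaufort"))
```

In practice you would pass the result of fetch_html(url) to the lookup instead of the sample string, and replace the assumed XPath with the one you worked out in the browser's inspector.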