You should have a look at YQL - it's a general-purpose service from Yahoo! that can do this kind of scraping really easily. Try this:
select * from html where url="google.com" and xpath='//title'
Test it here.
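To make it concrete what that query asks for, here is a minimal local sketch of the same idea: take a page's HTML and pull out every `title` element, which is what `xpath='//title'` selects. It is not YQL and not how Yahoo! implements it. The names `TitleExtractor`, `extract_titles` and the `sample` string are made up for illustration, and it uses only Python's standard `html.parser`, so the HTML is passed in as a string rather than fetched from a URL.

```python
from html.parser import HTMLParser


class TitleExtractor(HTMLParser):
    """Collects the text of every <title> element (roughly xpath '//title')."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # Start a new, empty entry each time a <title> opens.
        if tag == "title":
            self._in_title = True
            self.titles.append("")

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text may arrive in several chunks, so append to the current entry.
        if self._in_title:
            self.titles[-1] += data


def extract_titles(html_text):
    """Hypothetical helper: return the stripped text of all <title> elements."""
    parser = TitleExtractor()
    parser.feed(html_text)
    parser.close()
    return [t.strip() for t in parser.titles]


# Hypothetical sample page; in practice html_text would come from an HTTP fetch.
sample = "<html><head><title>Google</title></head><body><p>hi</p></body></html>"
print(extract_titles(sample))
```

For anything beyond a single tag name you would probably want a real XPath engine rather than a hand-written parser like this one.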
I suspect that Yahoo!'s YQL is probably pretty close to what you're looking for.
(In fact, I think a concise description of YQL would be "a web API to extract information from a website" :-)
You can use Rapture Parser. It lets you retrieve the content and a lot of other metadata from a web page.