Transitioning from Scrubyt to Nokogiri - write to XML or a hash?
I'm trying to transition this bit of code from scrubyt to nokogiri, and am stuck trying to write my results to either a hash or xml. In scrubyt it looks like the following:
require 'rubygems'
require 'scrubyt'
result_data = Scrubyt::Extractor.define do
  fetch "http://www.amazon.com/gp/offer-listing/0061673730"
  results "//div[@class='resultsset']" do
    item "//tbody/tr" do
      condition "//div[@class = 'Condition']"
      price "//span[@class = 'price']"
      shipping "//span[@class = 'price_shipping']"
    end
  end
end
@description = result_data.to_xml
return @description
end
With nokogiri I can parse out the information I want, but there doesn't seem to be a quick way to return items in a hash or xml document. Here's all I have in nokogiri.
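require 'rubygems'
require 'nokogiri'
require 'open-uri'
# URI.open is needed on Ruby 3.0+, where Kernel#open no longer accepts URLs
doc = Nokogiri::HTML(URI.open('http://www.amazon.com/gp/offer-listing/0061673730'))
# print the text of every matching node, in document order
doc.css('div.condition, span.price, span.price_shipping').each do |item|
  puts item.content
end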
How would one return item information to either xml or a hash?
Comments (4)
You may want to omit the "=" in
xml.price = p.content
so that it reads
xml.price p.content
You can use the Builder to build XML.
Figured it out...
Results:
Thanks! That's exactly what I need. I'm having trouble looping correctly, though.
That returns this:
How would I rewrite my code to return something like this: