XML to hash in Ruby: parsing a list of historical inventions
I'd like to slurp the following data about historical inventions into a convenient Ruby data structure:
http://yootles.com/outbox/inventions.xml
Note that all the data is in the XML attributes.
It seems like there should be a quick solution with a couple lines of code.
With Rails there'd be Hash.from_xml though I'm not sure that would handle the attributes properly.
In any case, I need this as a standalone Ruby script.
Nokogiri seems overly complicated for this simple task based on this code that someone posted for a similar problem: http://gist.github.com/335286.
I found a purportedly simple solution using hpricot but it doesn't seem to handle the XML attributes.
Maybe that's a simple extension?
Finally there's ROXML but that looks even more heavyweight than nokogiri.
To make the question concrete (and with obvious ulterior motives), let's say that an answer should be a complete Ruby script that slurps the XML from the above URL and spits out CSV like this:
id, invention, year, inventor, country
RslCn, "aerosol can", 1926, "Erik Rotheim", "Norway"
RCndtnng, "air conditioning", 1902, "Willis Haviland Carrier", "US"
RbgTmtv, "airbag, automotive", 1952, "John Hetrick", "US"
RplnNgnpwrd, "airplane, engine-powered", 1903, "Wilbur and Orville Wright", "US"
I'll work on my own answer and post it too unless someone beats me to the punch with something clearly superior. Thanks!
Comments (2)
Using REXML and open-uri:
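A minimal sketch of what such a script might look like, not the answerer's original code. It assumes each invention is an element whose attributes are named after the CSV columns in the question (`id`, `invention`, `year`, `inventor`, `country`); the element names, the `SAMPLE` document, and the helper names `fetch_xml`, `parse_inventions`, and `to_csv_line` are all hypothetical, since the real layout of inventions.xml is not shown here.

```ruby
require 'rexml/document'
require 'open-uri'

# Assumption: attribute names match the CSV header given in the question.
FIELDS = %w[id invention year inventor country].freeze
# Columns that the question's sample output wraps in double quotes.
QUOTED = %w[invention inventor country].freeze

# Hypothetical stand-in for the real file; its element names are guesses.
SAMPLE = <<~XML
  <inventions>
    <invention id="RslCn" invention="aerosol can" year="1926" inventor="Erik Rotheim" country="Norway"/>
    <invention id="RbgTmtv" invention="airbag, automotive" year="1952" inventor="John Hetrick" country="US"/>
  </inventions>
XML

# Download the XML as a string (not called below, to keep the sketch offline).
def fetch_xml(url)
  URI.open(url, &:read)
end

# Build one hash per element that carries an "invention" attribute,
# wherever that element sits in the tree.
def parse_inventions(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//*[@invention]').map do |el|
    FIELDS.to_h { |field| [field, el.attributes[field]] }
  end
end

# Format one row in the question's style: comma plus space, text fields quoted.
def to_csv_line(row)
  FIELDS.map { |f| QUOTED.include?(f) ? %("#{row[f]}") : row[f].to_s }.join(', ')
end

puts FIELDS.join(', ')
parse_inventions(SAMPLE).each { |row| puts to_csv_line(row) }
```

To run it against the live data, replace `SAMPLE` in the last line with `fetch_xml('http://yootles.com/outbox/inventions.xml')`. The lines are formatted by hand rather than with Ruby's `csv` library because the requested output uses a comma followed by a space and quotes only the text columns, which is not what the library emits by default.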
It turned out to be simpler than I thought with Nokogiri: