Converting XML to YAML using Ruby and Hpricot - what's gone wrong here?

Posted on 2024-08-02 06:44:58

I'm trying to output an XML file, blog.xml, as YAML, for dropping into Vision.app, a tool for designing Shopify e-commerce sites locally.

Shopify's YAML looks like this:

- id: 2
  handle: bigcheese-blog
  title: Bigcheese blog
  url: /blogs/bigcheese-blog
  articles:
    - id: 1
      title: 'One thing you probably did not know yet...'
      author: Justin
      content: Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
      created_at: 2005-04-04 16:00
      comments:
        - 
          id: 1
          author: John Smith
          email: [email protected]
          content: Wow...great article man.
          status: published
          created_at: 2009-01-01 12:00
          updated_at: 2009-02-01 12:00
          url: ""
        - 
          id: 2
          author: John Jones
          email: [email protected]
          content: I really enjoyed this article. And I love your shop! It's awesome. Shopify rocks!
          status: published
          created_at: 2009-03-01 12:00
          updated_at: 2009-02-01 12:00
          url: "http://somesite.com/"
    - id: 2
      title: Fascinating
      author: Tobi
      content: Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
      created_at: 2005-04-06 12:00
      comments:
  articles_count: 2
  comments_enabled?: true 
  comment_post_url: ""
  comments_count: 2
  moderated?: true

However, a sample of my XML looks like this:

       <article>
          <author>Rouska Mellor</author>
          <blog-id type="integer">273932</blog-id>
          <body>Worn Again are hiring for a new Sales Director.

      To view the full job description and details of how to apply click "here":http://antiapathy.org/?page_id=83</body>
          <body-html>&lt;p&gt;Worn Again are hiring for a new Sales Director.&lt;/p&gt;
      &lt;p&gt;To view the full job description and details of how to apply click &lt;a href=&quot;http://antiapathy.org/?page_id=83&quot;&gt;here&lt;/a&gt;&lt;/p&gt;</body-html>
          <created-at type="datetime">2009-07-29T13:58:59+01:00</created-at>
          <id type="integer">1179072</id>
          <published-at type="datetime">2009-07-29T13:58:59+01:00</published-at>
          <title>Worn Again are hiring!</title>
          <updated-at type="datetime">2009-07-29T13:59:40+01:00</updated-at>
        </article>
        <article>

I naively assumed converting from one serialised data format to another was fairly straightforward, and I could simply do this:

>> require 'hpricot'
=> true
>> b = Hpricot.XML(open('blogs.xml'))
>> puts b.to_yaml

But I'm getting this error.

NoMethodError: undefined method `yaml_tag_subclasses?' for Hpricot::Doc:Class
    from /usr/local/lib/ruby/1.8/yaml/tag.rb:69:in `taguri'
    from /usr/local/lib/ruby/1.8/yaml/rubytypes.rb:16:in `to_yaml'
    from /usr/local/lib/ruby/1.8/yaml.rb:391:in `call'
    from /usr/local/lib/ruby/1.8/yaml.rb:391:in `emit'
    from /usr/local/lib/ruby/1.8/yaml.rb:391:in `quick_emit'
    from /usr/local/lib/ruby/1.8/yaml/rubytypes.rb:15:in `to_yaml'
    from /usr/local/lib/ruby/1.8/yaml.rb:117:in `dump'
    from /usr/local/lib/ruby/1.8/yaml.rb:432:in `y'
    from (irb):6
    from :0
>>

How can I get the data output in the form outlined at the top of this question? I've tried requiring 'yaml' as well, thinking that I was missing some of those methods, but that hasn't helped either.

Comments (2)

任性一次 2024-08-09 06:44:59

I've found this. Maybe it could help.
http://brains.parslow.net/node/1623

胡渣熟男 2024-08-09 06:44:58

Sorry, Josh, I think what you've found here is a limitation in the Hpricot and/or the YAML libraries, pure and simple.

I'm not sure Hpricot's ever supported YAML in this way. The method in question is dynamically added by the YAML library to the Object class, as well as other fundamental Ruby types, but doesn't show up in Hpricot::Doc's definition for some reason, even though Hpricot::Doc does seem to inherit indirectly from Object.

I can say that I've reproduced it as well, so it's not just you.

You can very easily add the missing method:

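require 'hpricot'
require 'yaml'

# Supply the class-level predicate that YAML's taguri method expects to find.
class Hpricot::Doc
  def self.yaml_tag_subclasses?
    true
  end
end
b = Hpricot.XML(open('blogs.xml'))
puts b.to_yaml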

but you'll find that doesn't get you much further. Here's what I get:

--- !ruby/object:Hpricot::Doc 
options: 
  :xml: true

So we're not iterating over the container like we should.

At this point, to get YAML support using the YAML library, the brute-force way (maybe the only way) would be to add to_yaml methods to Hpricot's classes, to teach them how to output YAML correctly. Take a look at "/usr/lib/ruby/1.8/yaml/rubytypes.rb" (on a Mac, that'd be something like "/System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/yaml/rubytypes.rb") for examples of how that's done for each of the fundamental Ruby types. The classes you might need to add this to are defined on the C side: see "hpricot/ext/hpricot_scan/hpricot_scan.rl", in the method Init_hpricot_scan.
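
A different route, if you only need the article fields shown in the question, is to skip dumping the parser's objects altogether: pull the values out into plain hashes and arrays, which the YAML library already knows how to emit. What follows is only a rough sketch under several assumptions. It uses REXML from Ruby's standard distribution in place of Hpricot so that it runs without the gem, it is written for a current Ruby rather than 1.8, and the helper names `article_to_hash` and `blog_yaml`, the `title` argument, and the mapping of `body-html` to `content` are all made up for illustration. It also assumes every article has each of the fields it reads.

```ruby
require 'rexml/document'
require 'yaml'

# Hypothetical helper: map one <article> element onto a plain Hash whose keys
# follow the Shopify YAML in the question. Assumes every child element exists.
def article_to_hash(article)
  {
    'id'         => article.elements['id'].text.to_i,
    'title'      => article.elements['title'].text,
    'author'     => article.elements['author'].text,
    'content'    => article.elements['body-html'].text, # entities are decoded
    'created_at' => article.elements['created-at'].text,
    'comments'   => []                                  # the export has none
  }
end

# Hypothetical helper: wrap all articles in a single blog entry and dump it.
# The blog title is passed in because the XML sample does not show one.
def blog_yaml(xml, title)
  doc = REXML::Document.new(xml)
  articles = REXML::XPath.match(doc, '//article').map { |a| article_to_hash(a) }
  blog = {
    'title'          => title,
    'articles'       => articles,
    'articles_count' => articles.length
  }
  [blog].to_yaml
end
```

Something like `puts blog_yaml(File.read('blogs.xml'), 'My blog')` would then print the result. The remaining blog-level keys (handle, url, and so on) would have to be filled in by hand or taken from wherever the export stores them.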
