使用treetop解析ruby中的tcl数组

发布于 2024-08-24 23:45:53 字数 1646 浏览 9 评论 0原文

我在(我认为是)tcl 数组中有一堆数据。基本上它的形式是{a {bc} d {ef} g}。它只嵌套一层,但并不总是嵌套,也就是说,a可能只是a,也可能是{aa bb} 或可能 {},但绝不会 {aa {bb cc}}。我想提取这个数组,以便我可以在 ruby​​ 中使用它。

我的第一个想法是,“没问题,我会写一些语法来解析它。”我安装了treetop gem,并编写了一个解析器,它看起来工作得很好。当我尝试从解析树中提取数组时,我开始遇到问题。我想更好地了解问题的原因以及我做错了什么。

这是到目前为止我的解析器代码: (tcl_array.treetop)

grammar TCLArray
  rule array
    "{" [\s]* "}" {
      def content
        []
      end
    }
    /
    "{" [\s]* array_element_list [\s]* "}" {
      def content
        array_element_list.content
      end
    }
  end

  rule array_element_list
    array_element {
      def content
        [array_element.content]
      end
    }
    /
    array_element [\s]+ array_element_list {
      def content
        [array_element.content] + array_element_list.content
      end
    }
  end

  rule array_element
    [^{}\s]+ {
      def content
        return text_value
      end
    }
    /
    array {
      def content
        array.content
      end
    }
  end
end

调用 p.parse("{a}").content 产生 tcl_array.rb:99:in 'content': undefined local变量或方法 'array_element'

array_element_list (array_element) 中的第一项表示 array_element 是一个未定义的局部变量,但访问器方法应该根据树顶文档自动定义。

早些时候,我尝试了一种基于规则较少但稍微复杂的语法的解决方案:

grammar TCLArray
  rule array
    "{" ([\s]* array_element ([\s]+ array_element)* )? [\s]* "}"
  end

  rule array_element
    [^{}\s]+ / array
  end
end

但是使用这种语法时,我遇到了问题,解析器似乎为数组规则创建了几个不同的表达式,即使它没有使用任何替代方案表达式 (/)。结果是我无法弄清楚如何访问数组规则的各个位以将它们作为 ruby​​ 数组返回。

I have a bunch of data in (what i think is) a tcl array. Basically it's in the form of {a {b c} d {e f} g}. It's only nested one deep, but isn't always nested, that is to say, a may just be a or it may be {aa bb} or possibly {}, but never {aa {bb cc}}. I want to extract this array so I can use it in ruby.

My first thought was, "No problem, I'll write a little grammar to parse this." I installed the treetop gem, and wrote up a parser, which seemed to work just fine. I started having problems when I tried to extract an array from the parsed tree. I would like to better understand the cause of the problems and what I am doing wrong.

Here is my parser code so far: (tcl_array.treetop)

grammar TCLArray
  rule array
    "{" [\s]* "}" {
      def content
        []
      end
    }
    /
    "{" [\s]* array_element_list [\s]* "}" {
      def content
        array_element_list.content
      end
    }
  end

  rule array_element_list
    array_element {
      def content
        [array_element.content]
      end
    }
    /
    array_element [\s]+ array_element_list {
      def content
        [array_element.content] + array_element_list.content
      end
    }
  end

  rule array_element
    [^{}\s]+ {
      def content
        return text_value
      end
    }
    /
    array {
      def content
        array.content
      end
    }
  end
end

Invoking p.parse("{a}").content yields tcl_array.rb:99:in 'content': undefined local variable or method 'array_element'

The first term in array_element_list (array_element) says that array_element is an undefined local variable, but accessor methods are supposed to be automatically defined according to the treetop documentation.

Earlier, I tried a solution that was based off of a grammar with fewer but slightly more complicated rules:

grammar TCLArray
  rule array
    "{" ([\s]* array_element ([\s]+ array_element)* )? [\s]* "}"
  end

  rule array_element
    [^{}\s]+ / array
  end
end

But with this grammar I had issues where the parser seemed to be creating several different expressions for the array rule even though it did not use any alternative expressions (/). The result was that I couldn't figure out how to access the various bits of the array rule to return them as a ruby array.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

新人笑 2024-08-31 23:45:53

在这种情况下,解析器生成器可能有点过分了。 是一个简单的手动递归下降解析器,基于 James Edward Gray II 的 JSON 解析器

#!/usr/bin/env ruby
# based on James Edward Gray II's solution to the Parsing JSON
#   Ruby Quiz #155: <http://RubyQuiz.Com/quiz155.html>

require 'strscan'

class TclArrayParser < StringScanner
  def parse
    parse_value
  ensure
    eos? or error "Unexpected data: '#{rest}'"
  end

  private

  def parse_value
    trim_space
    parse_string or parse_array
  ensure
    trim_space
  end

  def parse_array
    return nil unless scan(/\{\s*/)
    array = []
    while contents = parse_value
      array << contents
    end
    scan(/\}/) or error('Unclosed array')
    array
  end

  def parse_string
    scan(/[^{}[:space:]]+/)
  end

  def trim_space
    skip(/\s*/)
  end

  def error(message)
    pos = if eos? then 'end of input' else "position #{self.pos}" end
    raise ParseError, "#{message} at #{pos}"
  end

  class ParseError < StandardError; end
end

这 测试套件:

require 'test/unit'
class TestTclArrayParser < Test::Unit::TestCase
  def test_that_an_empty_string_parses_to_nil
    assert_nil TclArrayParser.new('').parse
  end
  def test_that_a_whitespace_string_parses_to_nil
    assert_nil TclArrayParser.new("  \t  \n  ").parse
  end
  def test_that_an_empty_array_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new('{}').parse
  end
  def test_that_an_empty_array_with_whitespace_at_the_front_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new(' {}').parse
  end
  def test_that_an_empty_array_with_whitespace_at_the_end_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new('{} ').parse
  end
  def test_that_an_empty_array_with_whitespace_inside_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new('{ }').parse
  end
  def test_that_an_empty_array_surrounded_by_whitespace_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new(' {} ').parse
  end
  def test_that_an_empty_array_with_whitespace_at_the_front_and_inside_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new(' { }').parse
  end
  def test_that_an_empty_array_with_whitespace_at_the_end_and_inside_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new('{ } ').parse
  end
  def test_that_an_empty_array_surrounded_by_whitespace_with_whitespace_inside_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new(' { } ').parse
  end
  def test_that_a_sole_element_parses
    assert_equal 'a', TclArrayParser.new('a').parse
  end
  def test_that_an_array_with_one_element_parses
    assert_equal ['a'], TclArrayParser.new('{a}').parse
  end
  def test_that_a_nested_array_parses
    assert_equal [[]], TclArrayParser.new('{{}}').parse
  end
  def test_that_a_nested_array_with_one_element_parses
    assert_equal [['a']], TclArrayParser.new('{{a}}').parse
  end
  def test_that_whitespace_is_ignored
    assert_equal [], TclArrayParser.new('     {     }     ').parse
  end
  def test_that_complex_arrays_parse_correctly
    assert_equal ['a', %w[b c], 'd', %w[e f], 'g'], TclArrayParser.new('{a {b c} d {e f} g}').parse
    assert_equal [%w[aa bb], %w[b c], 'd', %w[e f], 'g'], TclArrayParser.new('{{aa bb} {b c} d {e f} g}').parse
    assert_equal [[], %w[b c], 'd', %w[e f], 'g'], TclArrayParser.new('{{} {b c} d {e f} g}').parse
    assert_equal [[], ['b', 'c'], 'd', ['e', 'f'], 'g'], TclArrayParser.new("\n{\n{\n}\n{\nb\nc\n}\nd\n{\ne\nf\n}\ng\n}\n").parse
  end
end

Maybe a parser generator is overkill in this case. Here's a simple hand-rolled recursive-descent parser based on this JSON parser by James Edward Gray II:

#!/usr/bin/env ruby
# based on James Edward Gray II's solution to the Parsing JSON
#   Ruby Quiz #155: <http://RubyQuiz.Com/quiz155.html>

require 'strscan'

class TclArrayParser < StringScanner
  def parse
    parse_value
  ensure
    eos? or error "Unexpected data: '#{rest}'"
  end

  private

  def parse_value
    trim_space
    parse_string or parse_array
  ensure
    trim_space
  end

  def parse_array
    return nil unless scan(/\{\s*/)
    array = []
    while contents = parse_value
      array << contents
    end
    scan(/\}/) or error('Unclosed array')
    array
  end

  def parse_string
    scan(/[^{}[:space:]]+/)
  end

  def trim_space
    skip(/\s*/)
  end

  def error(message)
    pos = if eos? then 'end of input' else "position #{self.pos}" end
    raise ParseError, "#{message} at #{pos}"
  end

  class ParseError < StandardError; end
end

Here's a testsuite:

require 'test/unit'
class TestTclArrayParser < Test::Unit::TestCase
  def test_that_an_empty_string_parses_to_nil
    assert_nil TclArrayParser.new('').parse
  end
  def test_that_a_whitespace_string_parses_to_nil
    assert_nil TclArrayParser.new("  \t  \n  ").parse
  end
  def test_that_an_empty_array_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new('{}').parse
  end
  def test_that_an_empty_array_with_whitespace_at_the_front_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new(' {}').parse
  end
  def test_that_an_empty_array_with_whitespace_at_the_end_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new('{} ').parse
  end
  def test_that_an_empty_array_with_whitespace_inside_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new('{ }').parse
  end
  def test_that_an_empty_array_surrounded_by_whitespace_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new(' {} ').parse
  end
  def test_that_an_empty_array_with_whitespace_at_the_front_and_inside_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new(' { }').parse
  end
  def test_that_an_empty_array_with_whitespace_at_the_end_and_inside_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new('{ } ').parse
  end
  def test_that_an_empty_array_surrounded_by_whitespace_with_whitespace_inside_parses_to_an_empty_array
    assert_equal [], TclArrayParser.new(' { } ').parse
  end
  def test_that_a_sole_element_parses
    assert_equal 'a', TclArrayParser.new('a').parse
  end
  def test_that_an_array_with_one_element_parses
    assert_equal ['a'], TclArrayParser.new('{a}').parse
  end
  def test_that_a_nested_array_parses
    assert_equal [[]], TclArrayParser.new('{{}}').parse
  end
  def test_that_a_nested_array_with_one_element_parses
    assert_equal [['a']], TclArrayParser.new('{{a}}').parse
  end
  def test_that_whitespace_is_ignored
    assert_equal [], TclArrayParser.new('     {     }     ').parse
  end
  def test_that_complex_arrays_parse_correctly
    assert_equal ['a', %w[b c], 'd', %w[e f], 'g'], TclArrayParser.new('{a {b c} d {e f} g}').parse
    assert_equal [%w[aa bb], %w[b c], 'd', %w[e f], 'g'], TclArrayParser.new('{{aa bb} {b c} d {e f} g}').parse
    assert_equal [[], %w[b c], 'd', %w[e f], 'g'], TclArrayParser.new('{{} {b c} d {e f} g}').parse
    assert_equal [[], ['b', 'c'], 'd', ['e', 'f'], 'g'], TclArrayParser.new("\n{\n{\n}\n{\nb\nc\n}\nd\n{\ne\nf\n}\ng\n}\n").parse
  end
end
窗影残 2024-08-31 23:45:53

注意到这一点仅供参考,但我刚刚发布了一个用于解析简单 TCL 的 gem。

https://github.com/julik/tickly

Noting this for reference, but I've just released a gem for parsing simple TCL.

https://github.com/julik/tickly

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文