Finding words in Treetop - some matches aren't matched

Posted 2024-12-01 06:04:26

I've run into a bit of a strange situation.

I'm trying to parse measurements using Treetop.

For instance: 6' of 1/2" Copper Pipe. Of course, the units can also be written as feet, Feet, inch, inches, Inch, Inches, and so on.

So I have a rule:

rule measurement
  ('\'' / 'Foot' / 'foot' / 'Feet' / 'feet' /
   '"' / 'Inches' / 'inches' / 'Inch' / 'inch' /
   'cm' / 'cms' / 'Centimeters' / 'centimeters' / 'Centimeter' / 'centimeter' /
   'm' / 'ms' / 'Meters' / 'meters' / 'Meter' / 'meter' /
   'lb' / 'lbs' / 'Pounds' / 'pounds' / 'Pound' / 'pound')
  (s? ')' / s) {
    def value
      [:measurement, text_value]
    end
  }
end

rule space
  [\s]+
end

When I enter '6 inches', '6 pounds', '6 Meters', everything works great, and I get my number and measurement returned.

When I enter '6 meters', meters isn't parsed properly.

Most of the measurements work fine; of the ones listed here, only 'meters' and 'pound' are missed (and I'm sure I'll be adding more measurements in the future).

Any ideas as to why I would be experiencing this?

As requested, here is a pared-down version of the full grammar:

grammar FullMeasurements
  rule full_product
    measures s? alternate_measure product_name {
      def value
        [:full_product, text_value]
      end
    }
  end

  rule measures
    single_measure / dual_measure / quantity {
      def measures
        [:measures, text_value] unless text_value.blank?
      end
    }
  end

  rule dual_measure
    quantity s? single_measure {
      def value
        [:dual_measure, text_value] unless text_value.blank?
      end
    }
  end

  rule alternate_measure
    '(' s? single_measure {
      def value
        [:alternate_measure, text_value] unless text_value.blank?
      end
    }
  end

  rule single_measure
    (range_number / number) s? measurement optional_secondary_measurements {
      def value
        [:single_measure, text_value]
      end
    }
  end

  rule optional_secondary_measurements
    measurement? {
      def value
        [:optional_secondary_measurements, text_value]
      end
    }
  end

  rule quantity
    (range_number / number) s? divisor? {
      def value
        [:quantity, text_value]
      end
    }
  end

  rule measurement
    ('\'' / 'Foot' / 'foot' / 'Feet' / 'feet' /
     '"' / 'Inches' / 'inches' / 'Inch' / 'inch' /
     'cm' / 'cms' / 'Centimeters' / 'centimeters' / 'Centimeter' / 'centimeter' /
     'm' / 'ms' / 'Meters' / 'meters' / 'Meter' / 'meter' /
     'lb' / 'lbs' / 'Pounds' / 'pounds' / 'Pound' / 'pound')
    (s? ')' / s) {
      def value
        [:measurement, text_value]
      end
    }
  end

  rule divisor
    "x"
  end

  rule product_name
    !measures words+ {
      def value
        [:product_name, text_value]
      end
    }
  end

  rule number
    frac_number / regular_number optional_frac {
      def value
        [:number, text_value]
      end
    }
  end

  rule optional_frac
    frac_number? {
      def value
        [:optional_frac, text_value]
      end
    }
  end

  rule frac_number
    (s? regular_number '/' regular_number) {
      def value
        [:frac_number, text_value]
      end
    }
  end

  rule words
    [0-9a-zA-Z\-()&.%'*\s]+ {
      def value
        text_value
      end
    }
  end

  rule regular_number
    [0-9\.]+ {
      def value
        text_value
      end
    }
  end

  rule space
    [\s]+
  end
end

Comments (2)

傲世九天 2024-12-08 06:04:26

Since PEGs are greedy and "/" is an ordered alternation, your measurement rule matches the literal text "meter", and then your grammar fails because it cannot find a following rule that matches the leftover "s". Unlike regular expressions, PEGs will not backtrack through previous successful matches when a later one fails.

Switch the order of items in your rule to have the plurals first, and you should be good to go.
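
The ordered-choice behaviour described above can be reproduced without Treetop. The following is only a rough model of the semantics, not Treetop's implementation: `ordered_choice` and `match_measurement` are hypothetical helper names, and the terminator regex is an approximation of the question's `(s? ')' / s)`.

```ruby
# A small model of a PEG sequence: an ordered choice of literals followed by
# a required terminator. This is NOT Treetop, only an illustration.

# The meter-related alternatives, in the same order as in the question.
UNITS_ORIGINAL_ORDER = %w[m ms Meters meters Meter meter].freeze

# Ordered choice: return the first literal that is a prefix of the input
# at position pos. Later alternatives are ignored once one succeeds.
def ordered_choice(literals, input, pos)
  literals.find { |literal| input[pos, literal.length] == literal }
end

# measurement <- (unit alternatives) (s? ')' / s)
# If the terminator fails after the choice has succeeded, the whole sequence
# fails; the choice is not re-entered to try a longer literal.
def match_measurement(literals, input)
  unit = ordered_choice(literals, input, 0)
  return nil unless unit

  rest = input[unit.length..-1]
  rest.match?(/\A(?:\s*\)|\s+)/) ? unit : nil
end

p match_measurement(UNITS_ORIGINAL_ORDER, "Meters of pipe") # => "Meters"
p match_measurement(UNITS_ORIGINAL_ORDER, "meters of pipe") # => nil
```

In this model "Meters" succeeds only because no earlier alternative is a prefix of it, while "meters" is cut off by whichever earlier alternative is a prefix of it.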

听不够的曲调 2024-12-08 06:04:26

Phrogz was on the right track, but it is not "meter" that matches first. The alternative 'm' matches first, and nothing in the grammar can then match the leftover "eter" or "eters".
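
One way to keep a short literal such as 'm' from shadowing a longer one is to order the alternatives longest first, since a literal can only shadow another when it is a prefix of it and appears earlier. Below is a minimal sketch of that idea in plain Ruby. All helper names (`longest_first`, `treetop_literal`, `treetop_alternation`, `first_alternative`) are hypothetical and not part of Treetop; the generated string is meant to be pasted into a grammar by hand.

```ruby
# Sketch: order unit literals so that no alternative is a prefix of a later one.

# A subset of the unit words from the question, in their original order.
UNIT_WORDS = %w[cm cms m ms Meters meters Meter meter
                lb lbs Pounds pounds Pound pound].freeze

# Longest literals first; ties are broken alphabetically for stable output.
def longest_first(units)
  units.uniq.sort_by { |unit| [-unit.length, unit] }
end

# Quote a literal the way the question's grammar does, e.g. ' becomes '\''.
def treetop_literal(text)
  "'" + text.gsub("'") { "\\'" } + "'"
end

# Build the text of an ordered choice, ready to paste into a rule.
def treetop_alternation(units)
  longest_first(units).map { |unit| treetop_literal(unit) }.join(' / ')
end

# Model of ordered choice: the first literal that prefixes the input wins.
def first_alternative(units, input)
  units.find { |unit| input.start_with?(unit) }
end

puts treetop_alternation(%w[m ms meter meters])
# prints: 'meters' / 'meter' / 'ms' / 'm'

p first_alternative(UNIT_WORDS, "meters of pipe")                # => "m"
p first_alternative(longest_first(UNIT_WORDS), "meters of pipe") # => "meters"
```

Generating the alternation from a sorted list also means that units added later do not have to be placed by hand.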
