Matching words in Treetop - some alternatives aren't matched
I've run into a bit of a strange situation.
I'm trying to parse measurements using Treetop.
For instance: 6' of 1/2" Copper Pipe
Of course, the units can also be written as feet, Feet, inch, inches, Inch, etc.
So I have a rule:
    rule measurement
      ('\'' / 'Foot' / 'foot' / 'Feet' / 'feet' /
       '"' / 'Inches' / 'inches' / 'Inch' / 'inch' /
       'cm' / 'cms' / 'Centimeters' / 'centimeters' / 'Centimeter' / 'centimeter' /
       'm' / 'ms' / 'Meters' / 'meters' / 'Meter' / 'meter' /
       'lb' / 'lbs' / 'Pounds' / 'pounds' / 'Pound' / 'pound')
      (s? ')' / s)
      {
        def value
          [:measurement, text_value]
        end
      }
    end

    rule space
      [\s]+
    end
When I enter '6 inches', '6 pounds', or '6 Meters', everything works great, and I get my number and measurement returned.
When I enter '6 meters', 'meters' isn't parsed properly.
Most of the measurements work fine; only 'meters' and 'pound' are missed among the measurements I've provided here (but I'm sure I'll be adding more measurements in the future).
Any ideas as to why I would be experiencing this?
As requested, here is a more pared-down version of the full grammar:
    grammar FullMeasurements
      rule full_product
        measures s? alternate_measure product_name
        {
          def value
            [:full_product, text_value]
          end
        }
      end

      rule measures
        single_measure / dual_measure / quantity
        {
          def measures
            [:measures, text_value] unless text_value.blank?
          end
        }
      end

      rule dual_measure
        quantity s? single_measure
        {
          def value
            [:dual_measure, text_value] unless text_value.blank?
          end
        }
      end

      rule alternate_measure
        '(' s? single_measure
        {
          def value
            [:alternate_measure, text_value] unless text_value.blank?
          end
        }
      end

      rule single_measure
        (range_number / number) s? measurement optional_secondary_measurements
        {
          def value
            [:single_measure, text_value]
          end
        }
      end

      rule optional_secondary_measurements
        measurement?
        {
          def value
            [:optional_secondary_measurements, text_value]
          end
        }
      end

      rule quantity
        (range_number / number) s? divisor?
        {
          def value
            [:quantity, text_value]
          end
        }
      end

      rule measurement
        ('\'' / 'Foot' / 'foot' / 'Feet' / 'feet' /
         '"' / 'Inches' / 'inches' / 'Inch' / 'inch' /
         'cm' / 'cms' / 'Centimeters' / 'centimeters' / 'Centimeter' / 'centimeter' /
         'm' / 'ms' / 'Meters' / 'meters' / 'Meter' / 'meter' /
         'lb' / 'lbs' / 'Pounds' / 'pounds' / 'Pound' / 'pound')
        (s? ')' / s)
        {
          def value
            [:measurement, text_value]
          end
        }
      end

      rule divisor
        "x"
      end

      rule product_name
        !measures words+
        {
          def value
            [:product_name, text_value]
          end
        }
      end

      rule number
        frac_number / regular_number optional_frac
        {
          def value
            [:number, text_value]
          end
        }
      end

      rule optional_frac
        frac_number?
        {
          def value
            [:optional_frac, text_value]
          end
        }
      end

      rule frac_number
        (s? regular_number '/' regular_number)
        {
          def value
            [:frac_number, text_value]
          end
        }
      end

      rule words
        [0-9a-zA-Z\-()&.%'*\s]+
        {
          def value
            text_value
          end
        }
      end

      rule regular_number
        [0-9\.]+
        {
          def value
            text_value
          end
        }
      end

      rule space
        [\s]+
      end
    end
Comments (2)
Since PEGs are greedy and "/" is an ordered alternation, your measurement rule matches the literal text "meter" and then your grammar fails because it cannot find a following rule that matches the leftover "s". Unlike regular expressions, PEGs will not backtrack through previous successful matches when a later one fails.
Switch the order of items in your rule to have the plurals first, and you should be good to go.
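To see the no-backtracking behaviour in isolation, here is a minimal plain-Ruby sketch. It is not Treetop and not the asker's grammar: match_unit, PLURAL_LAST and PLURAL_FIRST are hypothetical names made up for this illustration, and the two-item lists are a simplification of the real rule. The helper commits to the first alternative that fits and then requires the same suffix as the question's rule (whitespace, or optional whitespace followed by a closing parenthesis).

```ruby
# Plain-Ruby imitation of a PEG ordered choice followed by a required suffix.
# NOT Treetop: match_unit is a hypothetical helper used only for illustration.
def match_unit(input, alternatives)
  # Ordered choice: commit to the first alternative that is a prefix of the input.
  unit = alternatives.find { |alt| input.start_with?(alt) }
  return nil unless unit

  rest = input[unit.length..-1]
  # The sequence then needs `s? ')'` or `s`. If that fails, the whole rule
  # fails; the choice is never revisited to try a later alternative.
  rest.match?(/\A(\s*\)|\s+)/) ? unit : nil
end

PLURAL_LAST  = %w[meter meters]  # singular listed first (simplified, assumed order)
PLURAL_FIRST = %w[meters meter]  # plural listed first

p match_unit('meters ', PLURAL_LAST)   # "meter" wins the choice, then "s " fails the suffix
p match_unit('meters ', PLURAL_FIRST)  # the plural is tried before the singular
p match_unit('meter ', PLURAL_FIRST)   # the singular still matches as a fallback
```

Putting the longer literal first costs nothing for the shorter input: when 'meters' does not fit, the choice simply moves on to 'meter'.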
Phrogz was on the right track, but it is not "meter" that gets matched first. It is 'm', which leaves nothing to match the "eter" or "eters" that's left over.
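One way to avoid this class of problem as more units get added is to generate the alternation with the longest literals first, since a literal can only be shadowed by a shorter one that is its prefix. The following is a rough Ruby sketch under that assumption; first_prefix, longest_first and treetop_alternation are hypothetical helper names, and the quote-mark units (' and ") are left out because they would need escaping in the generated text.

```ruby
# Word units in the order they appear in the question's rule
# (the ' and " symbols are omitted here; they need escaping).
WORD_UNITS = %w[
  Foot foot Feet feet Inches inches Inch inch
  cm cms Centimeters centimeters Centimeter centimeter
  m ms Meters meters Meter meter
  lb lbs Pounds pounds Pound pound
]

# Hypothetical helper: which literal would an ordered choice commit to?
def first_prefix(input, units)
  units.find { |u| input.start_with?(u) }
end

# Hypothetical helper: longest literals first, ties broken alphabetically
# so the output is deterministic.
def longest_first(units)
  units.sort_by { |u| [-u.length, u] }
end

# Hypothetical helper: render the units as a Treetop-style alternation.
def treetop_alternation(units)
  longest_first(units).map { |u| "'#{u}'" }.join(' / ')
end

p first_prefix('meters', WORD_UNITS)                 # the literal chosen in the original order
p first_prefix('meters', longest_first(WORD_UNITS))  # the literal chosen after reordering
puts treetop_alternation(WORD_UNITS)
```

The printed alternation could be pasted into the measurement rule in place of the hand-ordered list; whether that also explains the 'pound' case is not something this sketch establishes.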