Grammar ambiguity: why? (the problem is: "(a)" vs "(a-z)")
So I am trying to implement a pretty simple grammar for one-line statements:
# Grammar
c : Character c [a-z0-9-]
(v) : Vowel (= [a,e,u,i,o])
(c) : Consonant
(?) : Any character (incl. number)
(l) : Any alpha char (= [a-z])
(n) : Any integer (= [0-9])
(c1-c2) : Range from char c1 to char c2
(c1,c2,c3) : List including chars c1, c2 and c3
Examples:
h(v)(c)no(l)(l)jj-k(n)
h(v)(c)no(l)(l)(a)(a)(n)
h(e-g)allo
h(e,f,g)allo
h(x,y,z)uul
h(x,y,z)(x,y,z)(x,y,z)(x,y,z)uul
I am using the Happy parser generator (http://www.haskell.org/happy/) but for some reason there seems to be some ambiguity problem.
The error message is: "shift/reduce conflicts: 1"
I think the ambiguity is with these two lines:
      | lBracket char rBracket               { (\c -> case c of
                                                   'v' -> TVowel
                                                   'c' -> TConsonant
                                                   'l' -> TLetter
                                                   'n' -> TNumber) $2 }
      | lBracket char hyphen char rBracket   { TRange $2 $4 }
An example case is: "(a)" vs "(a-z)"
The lexer would give the following for the two cases:
(a) : [CLBracket, CChar 'a', CRBracket]
(a-z) : [CLBracket, CChar 'a', CHyphen, CChar 'z', CRBracket]
What I don't understand is how this can be ambiguous with an LL[2] parser.
In case it helps here is the entire Happy grammar definition:
{
module XHappyParser where
import Data.Char
import Prelude hiding (lex)
import XLexer
import XString
}
%name parse
%tokentype { Character }
%error { parseError }
%token
lBracket { CLBracket }
rBracket { CRBracket }
hyphen { CHyphen }
question { CQuestion }
comma { CComma }
char { CChar $$ }
%%
xstring : tokens                             { XString (reverse $1) }

tokens : token                               { [$1] }
       | tokens token                        { $2 : $1 }

token : char                                 { TLiteral $1 }
      | hyphen                               { TLiteral '-' }
      | lBracket char rBracket               { (\c -> case c of
                                                   'v' -> TVowel
                                                   'c' -> TConsonant
                                                   'l' -> TLetter
                                                   'n' -> TNumber) $2 }
      | lBracket question rBracket           { TAny }
      | lBracket char hyphen char rBracket   { TRange $2 $4 }
      | lBracket listitems rBracket          { TList $2 }

listitems : char                             { [$1] }
          | listitems comma char             { $1 ++ [$3] }
{
parseError :: [Character] -> a
parseError _ = error "parse error"
}
Thank you!
Here's the ambiguity:
Your parser could accept (v) as both XString [TVowel] and XString [TList ['v']], not to mention the missing characters in that case expression.
One possible way of solving it would be to modify your grammar so that lists have at least two items, or to use a different notation for vowels, consonants, etc.
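A minimal sketch of the first suggestion, assuming lists must contain at least two items. The helper classToken and the type name Token are hypothetical (neither appears in the original grammar; the real token type lives in XString). The helper also makes the case expression total, so an unknown class letter produces a readable error instead of a pattern-match failure.

```happy
token : char                                   { TLiteral $1 }
      | hyphen                                 { TLiteral '-' }
      | lBracket char rBracket                 { classToken $2 }
      | lBracket question rBracket             { TAny }
      | lBracket char hyphen char rBracket     { TRange $2 $4 }
      -- A list now needs at least one comma, so "(v)" can only
      -- match the character-class rule above.
      | lBracket char comma listitems rBracket { TList ($2 : $4) }

listitems : char                               { [$1] }
          | listitems comma char               { $1 ++ [$3] }

{
-- Hypothetical helper; it would sit next to parseError in the
-- trailing code block. Token is an assumed type name.
classToken :: Char -> Token
classToken 'v' = TVowel
classToken 'c' = TConsonant
classToken 'l' = TLetter
classToken 'n' = TNumber
classToken c   = error ("unknown character class: (" ++ [c] ++ ")")
}
```

After lBracket char the parser now only ever shifts (rBracket, hyphen or comma), so the conflict should disappear. The trade-off is that a one-element list such as (a) can no longer be written.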
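The problem seems to be that a single character in brackets matches two productions: lBracket char rBracket, and lBracket listitems rBracket with listitems reduced from a single char.
Put more simply, an input such as (v) can be a TVowel, TConsonant, TLetter or TNumber (as you know), or a singleton TList.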
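To make the overlap concrete, here is a small standalone Haskell sketch that lists which bracketed productions of the grammar derive a given token sequence. The Character type is a local stand-in for the one exported by XLexer (its exact shape is an assumption), and matchingRules and listItems are illustrative names, not part of the original code. The first argument is the minimum list length: 1 models the original grammar, 2 models a grammar where lists need at least two items.

```haskell
-- Stand-in for the lexer's token type (assumed shape).
data Character
  = CLBracket | CRBracket | CHyphen | CQuestion | CComma | CChar Char
  deriving (Eq, Show)

-- Mirrors the listitems rule: one char followed by zero or more
-- "comma char" pairs. Returns the items and the unconsumed input.
listItems :: [Character] -> Maybe ([Char], [Character])
listItems (CChar c : rest) = Just (go [c] rest)
  where
    go acc (CComma : CChar d : more) = go (d : acc) more
    go acc more                      = (reverse acc, more)
listItems _ = Nothing

-- Names of all bracketed productions that derive the whole input,
-- given the minimum number of items a list must contain.
matchingRules :: Int -> [Character] -> [String]
matchingRules minLen ts = [name | (name, ok) <- rules, ok]
  where
    rules =
      [ ("lBracket char rBracket", isClass ts)
      , ("lBracket question rBracket", ts == [CLBracket, CQuestion, CRBracket])
      , ("lBracket char hyphen char rBracket", isRange ts)
      , ("lBracket listitems rBracket", isList ts)
      ]

    isClass [CLBracket, CChar _, CRBracket] = True
    isClass _                               = False

    isRange [CLBracket, CChar _, CHyphen, CChar _, CRBracket] = True
    isRange _                                                 = False

    isList (CLBracket : rest) =
      case listItems rest of
        Just (items, [CRBracket]) -> length items >= minLen
        _                         -> False
    isList _ = False
```

With a minimum length of 1, the tokens for (v) match two productions, which is the ambiguity described above; with a minimum length of 2 only the character-class production remains. The range (a-z) matches exactly one production either way, so it is not the source of the conflict.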
As the Happy manual says, shift/reduce conflicts usually aren't a problem. You can use precedence declarations to force the behavior and remove the warning if you'd like.
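A sketch of the precedence approach, assuming Happy's documented resolution rules: a rule takes the precedence of its last terminal, and a shift/reduce conflict is resolved as a shift when the lookahead token has the higher precedence. The two declarations below are additions, not part of the original grammar, and would go in the directive section before %%.

```happy
-- Later declarations have higher precedence, so rBracket outranks char.
%nonassoc char
%nonassoc rBracket
```

After lBracket char with rBracket as the lookahead, the rule listitems : char carries the precedence of char and loses to rBracket, so the parser shifts. That is the same choice Happy makes by default, only without the warning, which means (v) is still never parsed as a singleton TList.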