区分逻辑运算符和其他中缀运算符
我正在尝试解析 SQL 搜索条件,但无法让解析器区分逻辑(AND
、OR
)与其他中缀运算符。我将它们解析为不同的节点(也许这很难做到),但简化了评估阶段。这是相关的代码片段(如果需要,我可以添加更多代码)。
let opp = OperatorPrecedenceParser<_,_,_>()
let scalarExpr = opp.ExpressionParser
opp.TermParser <- constant <|> id <|> between lparen rparen scalarExpr <|> scalarExpr
//infix operators added here
let comparison = //(e.g., 1 < 2)
let compareExpr = pipe3 scalarExpr compareOp scalarExpr (fun l op r -> Comparison(op, l, r))
between lparen rparen compareExpr <|> compareExpr
let andTerm = pstringCI "and" .>> ws
let orTerm = pstringCI "or" .>> ws
let searchCondition, searchConditionRef = createParserForwardedToRef()
searchConditionRef :=
[ comparison
pipe3 searchCondition andTerm searchCondition (fun l _ r -> And(l, r))
pipe3 searchCondition orTerm searchCondition (fun l _ r -> Or(l, r))
between lparen rparen searchCondition ]
|> choice
let filter : Parser<_,unit> = ws >>. searchCondition .>> eof
“1 = 1”
正确解析为 Comparison (Eq,Constant (Int32 1),Constant (Int32 1))
但一旦我尝试使用逻辑运算符连接两个比较,例如,“1 = 1 or 2 = 2”
,它无法解析
Ln 错误:1 Col:7
1 = 1 或 2 = 2
^
预期:输入结束或中缀运算符
: 7
我希望它将错误之前的 1
解析为标量表达式,并在点击 or
回溯时,意识到它不是中缀运算符,返回 1 作为完整标量,并识别它正在解析由逻辑运算符
or
连接的条件的左侧。
相反,它似乎继续假设 1
开始一个更复杂的标量表达式,可能涉及中缀运算符。
代码是否存在问题,或者是将 AND
/OR
解析为中缀运算符的解决方案(使用相同的 OperatorPrecedenceParser
)?我不想走那条路,所以我希望我在某个地方犯了一个简单的错误。
完整代码是要点。
I'm trying to parse SQL search conditions and having trouble getting the parser to differentiate logical (AND
, OR
) from other infix operators. I'm parsing them as different nodes (perhaps that's difficult to do), but simplifies the evaluation phase. Here's the relevant code snippet (I can include more if necessary).
let opp = OperatorPrecedenceParser<_,_,_>()
let scalarExpr = opp.ExpressionParser
opp.TermParser <- constant <|> id <|> between lparen rparen scalarExpr <|> scalarExpr
//infix operators added here
let comparison = //(e.g., 1 < 2)
let compareExpr = pipe3 scalarExpr compareOp scalarExpr (fun l op r -> Comparison(op, l, r))
between lparen rparen compareExpr <|> compareExpr
let andTerm = pstringCI "and" .>> ws
let orTerm = pstringCI "or" .>> ws
let searchCondition, searchConditionRef = createParserForwardedToRef()
searchConditionRef :=
[ comparison
pipe3 searchCondition andTerm searchCondition (fun l _ r -> And(l, r))
pipe3 searchCondition orTerm searchCondition (fun l _ r -> Or(l, r))
between lparen rparen searchCondition ]
|> choice
let filter : Parser<_,unit> = ws >>. searchCondition .>> eof
"1 = 1"
correctly parses to Comparison (Eq,Constant (Int32 1),Constant (Int32 1))
but once I try to join two comparisons with a logical operator, e.g., "1 = 1 or 2 = 2"
, it fails to parse with
Error in Ln: 1 Col: 7
1 = 1 or 2 = 2
^
Expecting: end of input or infix operator
: 7
I expected it to parse the 1
before the error as a scalar expression and upon hitting or
backtrack, realizing it's not an infix operator, return 1
as the complete scalar, and recognize it's parsing the left-hand side of a condition joined by logical operator or
.
Instead, it seems to continue assuming 1
begins a more complex scalar expression, possibly involving an infix operator.
Is there a problem with the code, or is the solution to parse AND
/OR
as infix operators (using the same OperatorPrecedenceParser
)? I'd rather not go that route, so I'm hoping I've made a simple mistake somewhere.
The complete code is on gist.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
data:image/s3,"s3://crabby-images/d5906/d59060df4059a6cc364216c4d63ceec29ef7fe66" alt="扫码二维码加入Web技术交流群"
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
我认为最终您会发现您需要将
and
和or
视为具有优先级规则的中缀运算符,因为这正是它们的本质,也是大多数解析器(包括 fparsec)的原因和 fsyacc 有处理它们的特殊功能(即通过优先级和关联性规则解决歧义)。您发现了一个突出显示这一点的案例,但请考虑另一种情况:
应该解析为
(1 = 1 or 2 = 2) and 3 = 3
或1 = 1 or (2 = 2 and 3 = 3)?
I think ultimately you'll find you need to treat
and
andor
as infix operators with precedence rules because that is exactly what they are and is the reason why most parsers including fparsec and fsyacc have special features to handle them (i.e. resolve ambiguity through precedence and associativity rules).You've found one case highlighting this, but consider another:
should that parse as
(1 = 1 or 2 = 2) and 3 = 3
or1 = 1 or (2 = 2 and 3 = 3)
?您的解析器在第一个方程之后停止,因为
searchCondition
的choice
组合器将第一个参数解析器comparison
应用于输入,并且成功后简单地返回参数解析器的结果。然后您会收到错误,因为filter
无法解析searchCondition
之后的eof
。choice
和<|>
组合器不实现最长匹配规则,并且它们不在之后回溯错误,如教程中所述。所以你的searchCondition
解析器无法工作。另一个问题是您的
searchCondition
解析器是左递归的,因为第二个和第三个choice
参数将尝试再次应用searchCondition
而不之前消耗任何内容输入。左递归会导致堆栈溢出。类似地,有
<|> opp.TermParser 定义末尾的 scalarExpr
是不必要的,可能会导致无限递归。当您将左递归解析器语法转换为 FParsec 时,您需要消除左递归。
修复
searchCondition
解析器的一种方法是分解左手递归-side 表达式:或者更简单:
更新:
语法中也存在括号解析的问题,请参阅下面的评论。
Your parser stops after the first equation, because the
choice
combinator of thesearchCondition
applies the first argument parsercomparison
to the input and upon success simply returns the result of the argument parser. You then get an error becausefilter
fails to parse theeof
after thesearchCondition
.The
choice
and<|>
combinators do not implement a longest match rule and they do not backtrack after an error, as explained in the tutorial. So yoursearchCondition
parser can't work.Another problem is that your
searchCondition
parser is left-recursive, since the second and thirdchoice
arguments will try to applysearchCondition
again without previously consuming any input. Left-recursion will lead to a stack overflow.Similary, having
<|> scalarExpr
at the end of theopp.TermParser
definition is unnecessary and can lead to infinite recursions.When you translate a left-recursive parser grammar to FParsec, you need to eliminate the left-recursion.
One way to fix the
searchCondition
parser is to factor out the left-hand-side expression:Or even simpler:
Update:
In the grammar there's also a problem with the parens parsing, see the comments below.
为逻辑运算符创建单独的
OperatorPrecedenceParser
似乎已经解决了这个问题。替换,并且
1 = 1 or (2 = 2 and 3 = 3)
解析正确。我用
(1 = 1 or 2 = 2) and 3 = 3
Creating a separate
OperatorPrecedenceParser
for logical operators seems to have fixed it.I replaced
with
(1 = 1 or 2 = 2) and 3 = 3
and1 = 1 or (2 = 2 and 3 = 3)
parse correctly.