区分逻辑运算符和其他中缀运算符

发布于 2025-01-03 22:06:37 字数 1866 浏览 6 评论 0原文

我正在尝试解析 SQL 搜索条件,但无法让解析器区分逻辑(ANDOR)与其他中缀运算符。我将它们解析为不同的节点(也许这很难做到),但简化了评估阶段。这是相关的代码片段(如果需要,我可以添加更多代码)。

let opp = OperatorPrecedenceParser<_,_,_>()
let scalarExpr = opp.ExpressionParser
opp.TermParser <- constant <|> id <|> between lparen rparen scalarExpr <|> scalarExpr

//infix operators added here

let comparison = //(e.g., 1 < 2)
  let compareExpr = pipe3 scalarExpr compareOp scalarExpr (fun l op r -> Comparison(op, l, r))
  between lparen rparen compareExpr <|> compareExpr

let andTerm = pstringCI "and" .>> ws
let orTerm = pstringCI "or" .>> ws

let searchCondition, searchConditionRef = createParserForwardedToRef()
searchConditionRef := 
  [ comparison 
    pipe3 searchCondition andTerm searchCondition (fun l _ r -> And(l, r))
    pipe3 searchCondition orTerm searchCondition (fun l _ r -> Or(l, r))
    between lparen rparen searchCondition ]
  |> choice

let filter : Parser<_,unit> = ws >>. searchCondition .>> eof

“1 = 1” 正确解析为 Comparison (Eq,Constant (Int32 1),Constant (Int32 1))

但一旦我尝试使用逻辑运算符连接两个比较,例如,“1 = 1 or 2 = 2”,它无法解析

Ln 错误:1 Col:7
1 = 1 或 2 = 2
       ^
预期:输入结束或中缀运算符
: 7

我希望它将错误之前的 1 解析为标量表达式,并在点击 or 回溯时,意识到它不是中缀运算符,返回 1 作为完整标量,并识别它正在解析由逻辑运算符 or 连接的条件的左侧。

相反,它似乎继续假设 1 开始一个更复杂的标量表达式,可能涉及中缀运算符。

代码是否存在问题,或者是将 AND/OR 解析为中缀运算符的解决方案(使用相同的 OperatorPrecedenceParser)?我不想走那条路,所以我希望我在某个地方犯了一个简单的错误。

完整代码是要点。

I'm trying to parse SQL search conditions and having trouble getting the parser to differentiate logical (AND, OR) from other infix operators. I'm parsing them as different nodes (perhaps that's difficult to do), but simplifies the evaluation phase. Here's the relevant code snippet (I can include more if necessary).

let opp = OperatorPrecedenceParser<_,_,_>()
let scalarExpr = opp.ExpressionParser
opp.TermParser <- constant <|> id <|> between lparen rparen scalarExpr <|> scalarExpr

//infix operators added here

let comparison = //(e.g., 1 < 2)
  let compareExpr = pipe3 scalarExpr compareOp scalarExpr (fun l op r -> Comparison(op, l, r))
  between lparen rparen compareExpr <|> compareExpr

let andTerm = pstringCI "and" .>> ws
let orTerm = pstringCI "or" .>> ws

let searchCondition, searchConditionRef = createParserForwardedToRef()
searchConditionRef := 
  [ comparison 
    pipe3 searchCondition andTerm searchCondition (fun l _ r -> And(l, r))
    pipe3 searchCondition orTerm searchCondition (fun l _ r -> Or(l, r))
    between lparen rparen searchCondition ]
  |> choice

let filter : Parser<_,unit> = ws >>. searchCondition .>> eof

"1 = 1" correctly parses to Comparison (Eq,Constant (Int32 1),Constant (Int32 1))

but once I try to join two comparisons with a logical operator, e.g., "1 = 1 or 2 = 2", it fails to parse with

Error in Ln: 1 Col: 7
1 = 1 or 2 = 2
         ^
Expecting: end of input or infix operator
: 7

I expected it to parse the 1 before the error as a scalar expression and upon hitting or backtrack, realizing it's not an infix operator, return 1 as the complete scalar, and recognize it's parsing the left-hand side of a condition joined by logical operator or.

Instead, it seems to continue assuming 1 begins a more complex scalar expression, possibly involving an infix operator.

Is there a problem with the code, or is the solution to parse AND/OR as infix operators (using the same OperatorPrecedenceParser)? I'd rather not go that route, so I'm hoping I've made a simple mistake somewhere.

The complete code is on gist.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

﹉夏雨初晴づ 2025-01-10 22:06:37

我认为最终您会发现您需要将 andor 视为具有优先级规则的中缀运算符,因为这正是它们的本质,也是大多数解析器(包括 fparsec)的原因和 fsyacc 有处理它们的特殊功能(即通过优先级和关联性规则解决歧义)。

您发现了一个突出显示这一点的案例,但请考虑另一种情况:

1 = 1 or 2 = 2 and 3 =3

应该解析为 (1 = 1 or 2 = 2) and 3 = 31 = 1 or (2 = 2 and 3 = 3)?

I think ultimately you'll find you need to treat and and or as infix operators with precedence rules because that is exactly what they are and is the reason why most parsers including fparsec and fsyacc have special features to handle them (i.e. resolve ambiguity through precedence and associativity rules).

You've found one case highlighting this, but consider another:

1 = 1 or 2 = 2 and 3 =3

should that parse as (1 = 1 or 2 = 2) and 3 = 3 or 1 = 1 or (2 = 2 and 3 = 3)?

衣神在巴黎 2025-01-10 22:06:37

您的解析器在第一个方程之后停止,因为 searchConditionchoice 组合器将第一个参数解析器 comparison 应用于输入,并且成功后简单地返回参数解析器的结果。然后您会收到错误,因为 filter 无法解析 searchCondition 之后的 eof

choice<|> 组合器实现最长匹配规则,并且它们在之后回溯错误,如教程中所述。所以你的 searchCondition 解析器无法工作。

另一个问题是您的 searchCondition 解析器是左递归的,因为第二个和第三个 choice 参数将尝试再次应用 searchCondition 而不之前消耗任何内容输入。左递归会导致堆栈溢出。

类似地,有 <|> opp.TermParser 定义末尾的 scalarExpr 是不必要的,可能会导致无限递归。

当您将左递归解析器语法转换为 FParsec 时,您需要消除左递归。

修复 searchCondition 解析器的一种方法是分解左手递归-side 表达式:

let andTerm = stringCIReturn "and" (fun l r -> And(l, r)) .>> ws
let orTerm = stringCIReturn "or" (fun l r -> Or(l, r)) .>> ws

let searchCondition, searchConditionRef = createParserForwardedToRef()

do searchConditionRef:=
    let comparisonTerm =
        comparison <|> between lparen rparen searchCondition

    pipe2 comparisonTerm (many ((andTerm <|> orTerm) .>>. comparisonTerm)) 
          (fun l opRList -> 
                List.fold (fun l (op, r) -> op l r) l opRList)

或者更简单:

do searchConditionRef:= 
    chainl1 (comparison <|> between lparen rparen searchCondition)
            (andTerm <|> orTerm)        

更新:
语法中也存在括号解析的问题,请参阅下面的评论。

Your parser stops after the first equation, because the choice combinator of the searchCondition applies the first argument parser comparison to the input and upon success simply returns the result of the argument parser. You then get an error because filter fails to parse the eof after the searchCondition.

The choice and <|> combinators do not implement a longest match rule and they do not backtrack after an error, as explained in the tutorial. So your searchCondition parser can't work.

Another problem is that your searchCondition parser is left-recursive, since the second and third choice arguments will try to apply searchCondition again without previously consuming any input. Left-recursion will lead to a stack overflow.

Similary, having <|> scalarExpr at the end of the opp.TermParser definition is unnecessary and can lead to infinite recursions.

When you translate a left-recursive parser grammar to FParsec, you need to eliminate the left-recursion.

One way to fix the searchCondition parser is to factor out the left-hand-side expression:

let andTerm = stringCIReturn "and" (fun l r -> And(l, r)) .>> ws
let orTerm = stringCIReturn "or" (fun l r -> Or(l, r)) .>> ws

let searchCondition, searchConditionRef = createParserForwardedToRef()

do searchConditionRef:=
    let comparisonTerm =
        comparison <|> between lparen rparen searchCondition

    pipe2 comparisonTerm (many ((andTerm <|> orTerm) .>>. comparisonTerm)) 
          (fun l opRList -> 
                List.fold (fun l (op, r) -> op l r) l opRList)

Or even simpler:

do searchConditionRef:= 
    chainl1 (comparison <|> between lparen rparen searchCondition)
            (andTerm <|> orTerm)        

Update:
In the grammar there's also a problem with the parens parsing, see the comments below.

奶气 2025-01-10 22:06:37

为逻辑运算符创建单独的 OperatorPrecedenceParser 似乎已经解决了这个问题。

替换,并且 1 = 1 or (2 = 2 and 3 = 3) 解析正确。

let andTerm = pstringCI "and" .>> ws
let orTerm = pstringCI "or" .>> ws

let searchCondition, searchConditionRef = createParserForwardedToRef()
searchConditionRef := 
  [ comparison 
    pipe3 searchCondition andTerm searchCondition (fun l _ r -> And(l, r))
    pipe3 searchCondition orTerm searchCondition (fun l _ r -> Or(l, r))
    between lparen rparen searchCondition ]
  |> choice

我用

let condOpp = OperatorPrecedenceParser()
let searchCondition = condOpp.ExpressionParser
condOpp.TermParser <- (attempt comparison) <|> between lparen rparen searchCondition <|> searchCondition
condOpp.AddOperator(InfixOperator("or", ws, 1, Assoc.Left, fun l r -> Or(l, r)))    
condOpp.AddOperator(InfixOperator("and", ws, 2, Assoc.Left, fun l r -> And(l, r)))    

(1 = 1 or 2 = 2) and 3 = 3

Creating a separate OperatorPrecedenceParser for logical operators seems to have fixed it.

I replaced

let andTerm = pstringCI "and" .>> ws
let orTerm = pstringCI "or" .>> ws

let searchCondition, searchConditionRef = createParserForwardedToRef()
searchConditionRef := 
  [ comparison 
    pipe3 searchCondition andTerm searchCondition (fun l _ r -> And(l, r))
    pipe3 searchCondition orTerm searchCondition (fun l _ r -> Or(l, r))
    between lparen rparen searchCondition ]
  |> choice

with

let condOpp = OperatorPrecedenceParser()
let searchCondition = condOpp.ExpressionParser
condOpp.TermParser <- (attempt comparison) <|> between lparen rparen searchCondition <|> searchCondition
condOpp.AddOperator(InfixOperator("or", ws, 1, Assoc.Left, fun l r -> Or(l, r)))    
condOpp.AddOperator(InfixOperator("and", ws, 2, Assoc.Left, fun l r -> And(l, r)))    

(1 = 1 or 2 = 2) and 3 = 3 and 1 = 1 or (2 = 2 and 3 = 3) parse correctly.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文