Detecting parentheses in a BinaryExpression
I am building an expression analyser from which I would like to generate database query code. I've gotten quite far but am stuck parsing BinaryExpressions accurately. It's quite easy to break them up into Left and Right, but I need to detect parentheses and generate my code accordingly, and I cannot see how to do this.
An example [please ignore the flawed logic :)]:
a => a.Line2 != "1" && (a.Line2 == "a" || a.Line2 != "b") && !a.Line1.EndsWith("a")
I need to detect the 'set' in the middle and preserve its grouping, but during parsing I cannot see any difference between that expression and a normal BinaryExpression. (I would hate to check the string representation for parentheses.)
Any help would be appreciated.
(I should probably mention that I'm using C#)
--Edit--
I failed to mention that I'm using the standard .NET Expression classes to build the expressions (System.Linq.Expressions namespace).
--Edit2--
OK, I'm not parsing text into code, I'm parsing code into text. So my Parser class has a method like this:
void FilterWith<T>(Expression<Func<T, bool>> filterExpression);
which allows you to write code like this:
FilterWith<Customer>(c => c.Name == "asd" && c.Surname == "qwe");
which is quite easy to parse using the standard .NET classes. My challenge is parsing this expression:
FilterWith<Customer>(c => c.Name == "asd" && (c.Surname == "qwe" && c.Status == 1) && !c.Disabled)
I need to keep the expressions between the parentheses as a single set. The .NET classes correctly split the parenthesized part from the others, but give no indication that it is a set because of the parentheses.
Comments (2)
I haven't used Expression myself, but if it works anything like any other AST, then the problem is easier to solve than you make it out to be. As another commenter pointed out, just put parentheses around all of your binary expressions and then you won't have to worry about order-of-operations issues.
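A minimal sketch of the "parenthesize every binary expression" idea against System.Linq.Expressions. The `Customer` class, the `FullyParenthesizedWriter` name, and the SQL-like output format are assumptions made for illustration, not part of the original post. Only `==`, `!=`, `&&`, `||` and `!` over plain members and constants are handled.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity mirroring the Customer used in the question.
public class Customer
{
    public string Name { get; set; }
    public string Surname { get; set; }
    public int Status { get; set; }
    public bool Disabled { get; set; }
}

// Hypothetical writer: wraps every binary node in parentheses,
// so operator precedence in the output never matters.
public static class FullyParenthesizedWriter
{
    public static string Write(Expression e)
    {
        switch (e)
        {
            case LambdaExpression lambda:
                return Write(lambda.Body);
            case BinaryExpression b:
                return "(" + Write(b.Left) + " " + Symbol(b.NodeType) + " " + Write(b.Right) + ")";
            case UnaryExpression u when u.NodeType == ExpressionType.Not:
                return "NOT " + Write(u.Operand);
            case MemberExpression m:
                // Simplification: emits only the member name (captured variables are not handled).
                return m.Member.Name;
            case ConstantExpression c:
                return c.Value is string s ? "'" + s + "'" : Convert.ToString(c.Value);
            default:
                throw new NotSupportedException(e.NodeType.ToString());
        }
    }

    // Assumed mapping from node type to a SQL-like operator symbol.
    static string Symbol(ExpressionType t)
    {
        switch (t)
        {
            case ExpressionType.AndAlso: return "AND";
            case ExpressionType.OrElse: return "OR";
            case ExpressionType.Equal: return "=";
            case ExpressionType.NotEqual: return "<>";
            default: throw new NotSupportedException(t.ToString());
        }
    }
}
```

For the filter `c => c.Name == "asd" && (c.Surname == "qwe" && c.Status == 1) && !c.Disabled`, this sketch should yield `(((Name = 'asd') AND ((Surname = 'qwe') AND (Status = 1))) AND NOT Disabled)`. The output is noisy, but the grouping from the source code is kept because it is already encoded in the shape of the tree.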
Alternatively, you could check whether the expression you are generating has a lower precedence than the containing expression and, if so, put parentheses around it. So if you have a tree like this
[* 4 [+ 5 6]]
(where tree nodes are represented recursively as [node left-subtree right-subtree]), you would know when writing out the [+ 5 6] subtree that it is contained inside a * operation, which has higher precedence than a + operation, so any immediate subtree of lower precedence must be placed in parentheses. You'll need a table of precedence for the various operators, and a way to get at the operator itself to find out what it is and thence what its precedence is. However, I imagine you can figure that part out.
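The precedence check described above could be sketched against System.Linq.Expressions roughly as follows. This is an illustration under stated assumptions, not a definitive implementation: the `Customer` class, the `PrecedenceWriter` name, and the SQL-like output are hypothetical, and the precedence table follows the assumed target syntax (comparison binds tighter than NOT, then AND, then OR) rather than C#. A right-hand child of equal precedence is also parenthesized, so that a group such as `a && (b && c)` keeps its shape.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity mirroring the Customer used in the question.
public class Customer
{
    public string Name { get; set; }
    public string Surname { get; set; }
    public int Status { get; set; }
    public bool Disabled { get; set; }
}

public static class PrecedenceWriter
{
    // Assumed precedence table for the SQL-like output (higher binds tighter).
    static int Precedence(ExpressionType t)
    {
        switch (t)
        {
            case ExpressionType.OrElse: return 1;
            case ExpressionType.AndAlso: return 2;
            case ExpressionType.Not: return 3;
            case ExpressionType.Equal:
            case ExpressionType.NotEqual: return 4;
            default: return 5; // members and constants never need parentheses
        }
    }

    static string Symbol(ExpressionType t)
    {
        switch (t)
        {
            case ExpressionType.AndAlso: return "AND";
            case ExpressionType.OrElse: return "OR";
            case ExpressionType.Equal: return "=";
            case ExpressionType.NotEqual: return "<>";
            default: throw new NotSupportedException(t.ToString());
        }
    }

    // parentPrecedence is the minimum precedence this node needs to stay unparenthesized.
    public static string Write(Expression e, int parentPrecedence = 0)
    {
        if (e is LambdaExpression lambda) return Write(lambda.Body);

        int own = Precedence(e.NodeType);
        string text;
        switch (e)
        {
            case BinaryExpression b:
                // own + 1 on the right: an equal-precedence right child keeps its grouping.
                text = Write(b.Left, own) + " " + Symbol(b.NodeType) + " " + Write(b.Right, own + 1);
                break;
            case UnaryExpression u when u.NodeType == ExpressionType.Not:
                text = "NOT " + Write(u.Operand, own);
                break;
            case MemberExpression m:
                text = m.Member.Name; // simplification: member name only
                break;
            case ConstantExpression c:
                text = c.Value is string s ? "'" + s + "'" : Convert.ToString(c.Value);
                break;
            default:
                throw new NotSupportedException(e.NodeType.ToString());
        }
        // Parenthesize only when this node binds more loosely than its context requires.
        return own < parentPrecedence ? "(" + text + ")" : text;
    }
}
```

With this sketch, `c => c.Name == "asd" && (c.Surname == "qwe" || c.Status == 1) && !c.Disabled` should come out as `Name = 'asd' AND (Surname = 'qwe' OR Status = 1) AND NOT Disabled`. No parenthesis node is ever consulted: the OR node sitting under an AND node is what triggers the parentheses.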
When building an expression analyzer, you first need a parser, and for that you need a tokenizer.
A tokenizer is a piece of code that reads an expression and generates tokens (which can be valid or invalid) for a given syntax.
So your parser, using the tokenizer, reads the expression in the established order (left-to-right, right-to-left, top-to-bottom, whatever you choose) and creates a tree that maps the expression.
Then the analyzer interprets the tree, giving the expression its definitive meaning.
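To make the tokenizer and parser stages concrete, here is a small sketch for a toy grammar with identifiers, `&&`, `||` and parentheses. The `TinyParser` name and the grammar are assumptions chosen for illustration. The tree is printed in a [node left right] notation.

```csharp
using System;
using System.Collections.Generic;

public static class TinyParser
{
    // Tokenizer: splits text into identifiers, "&&", "||", "(" and ")".
    public static List<string> Tokenize(string text)
    {
        var tokens = new List<string>();
        int i = 0;
        while (i < text.Length)
        {
            char ch = text[i];
            if (char.IsWhiteSpace(ch)) { i++; }
            else if (ch == '(' || ch == ')') { tokens.Add(ch.ToString()); i++; }
            else if ((ch == '&' || ch == '|') && i + 1 < text.Length && text[i + 1] == ch)
            {
                tokens.Add(text.Substring(i, 2));
                i += 2;
            }
            else if (char.IsLetter(ch))
            {
                int start = i;
                while (i < text.Length && char.IsLetterOrDigit(text[i])) i++;
                tokens.Add(text.Substring(start, i - start));
            }
            else throw new FormatException("Unexpected character '" + ch + "' at " + i);
        }
        return tokens;
    }

    // Parser: recursive descent, reading tokens left to right.
    //   or      := and ("||" and)*
    //   and     := primary ("&&" primary)*
    //   primary := identifier | "(" or ")"
    public static string Parse(string text)
    {
        var tokens = Tokenize(text);
        int pos = 0;
        string tree = ParseOr(tokens, ref pos);
        if (pos != tokens.Count) throw new FormatException("Unexpected token " + tokens[pos]);
        return tree;
    }

    static string ParseOr(List<string> t, ref int pos)
    {
        string left = ParseAnd(t, ref pos);
        while (pos < t.Count && t[pos] == "||")
        {
            pos++;
            left = "[|| " + left + " " + ParseAnd(t, ref pos) + "]";
        }
        return left;
    }

    static string ParseAnd(List<string> t, ref int pos)
    {
        string left = ParsePrimary(t, ref pos);
        while (pos < t.Count && t[pos] == "&&")
        {
            pos++;
            left = "[&& " + left + " " + ParsePrimary(t, ref pos) + "]";
        }
        return left;
    }

    static string ParsePrimary(List<string> t, ref int pos)
    {
        if (pos >= t.Count) throw new FormatException("Unexpected end of input");
        string tok = t[pos++];
        if (tok == "(")
        {
            // Parentheses are consumed here; only the resulting subtree survives.
            string inner = ParseOr(t, ref pos);
            if (pos >= t.Count || t[pos] != ")") throw new FormatException("Expected ')'");
            pos++;
            return inner;
        }
        if (tok == ")" || tok == "&&" || tok == "||")
            throw new FormatException("Unexpected token " + tok);
        return tok;
    }
}
```

Calling `TinyParser.Parse("a && (b || c) && d")` should give `[&& [&& a [|| b c]] d]`. The parentheses do not appear as nodes; they only decide where the `||` subtree hangs, which is the same situation the question describes for the .NET expression classes.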