C# algorithm parsing expression-evaluation

Building a boolean function from a string description

Posted 2024-12-02 00:25:50

I have a large database of boolean values and want to build a framework for easily running queries over all of the values. To do this, I'd like to write a function that, given a string representation of a boolean expression, would evaluate that expression over all of the elements of the database. For example, given input

(a && b) || c

The function would construct another function that would evaluate

return (funcA() && funcB()) || funcC();

where funcA, funcB, and funcC are functions returning booleans.

黑白记忆 2024-12-09 00:25:50

This seems like it is best done in three steps.

First, you need to figure out what exactly you're supposed to evaluate. This is usually done in two steps called scanning and parsing. The job of scanning is to break the input string into a sequence of tokens, smaller logical units that make up the text. For example, given the string

(a && b)

you would break it into the tokens

(
a
&&
b
)

Typically, this is done using regular expressions, though you can do it by hand as well. The main idea is to separate the task of determining the pieces of the string from the task of seeing how those pieces relate.
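As a rough illustration of the scanning step, here is a minimal regex-based tokenizer sketch. The class name `Scanner`, the method `Tokenize`, and the rule that operands are plain identifiers are assumptions made for this example; they do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class Scanner
{
    // \G anchors the match at the current position. After optional
    // whitespace, group 1 captures one token: "(", ")", "&&", "||",
    // or an identifier such as "a" or "funcA" (assumed operand syntax).
    private static readonly Regex TokenPattern =
        new Regex(@"\G\s*(\(|\)|&&|\|\||[A-Za-z_][A-Za-z0-9_]*)");

    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        string text = input.TrimEnd(); // trailing whitespace is not a token
        int pos = 0;
        while (pos < text.Length)
        {
            Match m = TokenPattern.Match(text, pos);
            if (!m.Success)
                throw new FormatException($"Unexpected input near position {pos}");
            tokens.Add(m.Groups[1].Value);
            pos = m.Index + m.Length; // continue right after this token
        }
        return tokens;
    }
}
```

Calling `Scanner.Tokenize("(a && b) || c")` should yield the seven tokens `(`, `a`, `&&`, `b`, `)`, `||`, `c`, while a stray character such as a single `&` raises a `FormatException`.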

Once you've scanned the input, you need to parse it to determine what is being said. That is, you will reassemble the tokens into a complete expression that encodes operator precedence, which operands are being used, and so on. There are many algorithms to do this, but perhaps the easiest to implement is Dijkstra's shunting-yard algorithm. You would likely store the output of this parsing step as an abstract syntax tree (AST), a tree that encodes the structure of the input.
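To make the parsing step more concrete, here is a hedged sketch of a shunting-yard parser that builds a small AST from an already tokenized expression. The node types `Var` and `BinOp`, the class `Parser`, and the choice to give `&&` higher precedence than `||` (as in C#) are assumptions for this example, and the error checking is deliberately minimal.

```csharp
using System;
using System.Collections.Generic;

public abstract class Node { }

public sealed class Var : Node
{
    public string Name { get; }
    public Var(string name) { Name = name; }
    public override string ToString() => Name;
}

public sealed class BinOp : Node
{
    public string Op { get; }
    public Node Left { get; }
    public Node Right { get; }
    public BinOp(string op, Node left, Node right) { Op = op; Left = left; Right = right; }
    public override string ToString() => $"({Left} {Op} {Right})";
}

public static class Parser
{
    // Higher number binds tighter: && before ||, as in C#.
    private static readonly Dictionary<string, int> Precedence =
        new Dictionary<string, int> { ["||"] = 1, ["&&"] = 2 };

    public static Node Parse(IEnumerable<string> tokens)
    {
        var operands = new Stack<Node>();
        var operators = new Stack<string>();

        foreach (string tok in tokens)
        {
            if (tok == "(")
            {
                operators.Push(tok);
            }
            else if (tok == ")")
            {
                // Finish everything back to the matching "(".
                while (operators.Count > 0 && operators.Peek() != "(")
                    Reduce(operands, operators);
                if (operators.Count == 0)
                    throw new FormatException("Unbalanced ')'");
                operators.Pop(); // discard the "("
            }
            else if (Precedence.TryGetValue(tok, out int prec))
            {
                // Left-associative: first reduce pending operators of
                // equal or higher precedence.
                while (operators.Count > 0 && operators.Peek() != "("
                       && Precedence[operators.Peek()] >= prec)
                    Reduce(operands, operators);
                operators.Push(tok);
            }
            else
            {
                operands.Push(new Var(tok)); // anything else is an operand name
            }
        }

        while (operators.Count > 0)
        {
            if (operators.Peek() == "(")
                throw new FormatException("Unbalanced '('");
            Reduce(operands, operators);
        }
        if (operands.Count != 1)
            throw new FormatException("Malformed expression");
        return operands.Pop();
    }

    // Pops one operator and its two operands, then pushes the combined subtree.
    private static void Reduce(Stack<Node> operands, Stack<string> operators)
    {
        if (operands.Count < 2)
            throw new FormatException("Operator is missing an operand");
        string op = operators.Pop();
        Node right = operands.Pop();
        Node left = operands.Pop();
        operands.Push(new BinOp(op, left, right));
    }
}
```

Because the operand stack holds subtrees rather than values, the result is a tree: parsing the tokens of `a || b && c` gives a `BinOp` whose `ToString()` is `(a || (b && c))`.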

At this point, you have an unambiguous interpretation of the expression, and you need to actually evaluate it! To do this, you would probably define, for each kind of AST node, a function that produces a value from that node. For operators like &&, you would evaluate the left and right subexpressions and then compute their AND (or perhaps use short-circuiting to avoid computing the rhs if the lhs is false). For individual letters, you could use reflection to invoke the corresponding method, or keep a table mapping names to functions (depending on how much security you want; with a table, only the names you register can be called).
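A minimal sketch of such per-node evaluation, using the table-of-functions approach rather than reflection, might look as follows. The class names `Expr`, `Name`, `And`, and `Or` and the dictionary-based lookup are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

public abstract class Expr
{
    // Evaluates this node, resolving operand names through the table.
    public abstract bool Eval(IReadOnlyDictionary<string, Func<bool>> funcs);
}

public sealed class Name : Expr
{
    private readonly string _name;
    public Name(string name) { _name = name; }

    public override bool Eval(IReadOnlyDictionary<string, Func<bool>> funcs)
    {
        // Only names registered in the table can be called.
        if (!funcs.TryGetValue(_name, out Func<bool> f))
            throw new KeyNotFoundException($"Unknown name '{_name}'");
        return f();
    }
}

public sealed class And : Expr
{
    private readonly Expr _left, _right;
    public And(Expr left, Expr right) { _left = left; _right = right; }

    // C#'s && short-circuits, so _right is not evaluated when _left is false.
    public override bool Eval(IReadOnlyDictionary<string, Func<bool>> funcs)
        => _left.Eval(funcs) && _right.Eval(funcs);
}

public sealed class Or : Expr
{
    private readonly Expr _left, _right;
    public Or(Expr left, Expr right) { _left = left; _right = right; }

    // C#'s || short-circuits, so _right is not evaluated when _left is true.
    public override bool Eval(IReadOnlyDictionary<string, Func<bool>> funcs)
        => _left.Eval(funcs) || _right.Eval(funcs);
}
```

The tree for `(a && b) || c` would be `new Or(new And(new Name("a"), new Name("b")), new Name("c"))`. Since the tree is built once, the same object can be evaluated repeatedly against different function tables or different underlying rows.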

As a potential simplification of the code, you may want to consider omitting construction of the AST and just computing the values you want as you go. The shunting-yard algorithm (and many other parsers, such as a top-down LL(1) or bottom-up LR(1) parser) usually lets you compute an overall value for an expression in terms of its constituent expressions, and it may be easier to code up this way. However, if you're planning on using the described function over a huge data set like a database, building the AST gives you an object that you can invoke on each value in the database without parsing the string again.
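For comparison, here is a hedged sketch of the AST-free variant: a small top-down (recursive-descent) parser that computes the boolean result while it parses. The class name `DirectEvaluator`, the grammar in the comments, and the name-to-function table are assumptions for this example. Note that because every operand must still be parsed, this simple version calls every function and does not short-circuit.

```csharp
using System;
using System.Collections.Generic;

public sealed class DirectEvaluator
{
    private readonly IReadOnlyList<string> _tokens;
    private readonly IReadOnlyDictionary<string, Func<bool>> _funcs;
    private int _pos;

    private DirectEvaluator(IReadOnlyList<string> tokens,
                            IReadOnlyDictionary<string, Func<bool>> funcs)
    {
        _tokens = tokens;
        _funcs = funcs;
    }

    public static bool Evaluate(IReadOnlyList<string> tokens,
                                IReadOnlyDictionary<string, Func<bool>> funcs)
    {
        var p = new DirectEvaluator(tokens, funcs);
        bool result = p.ParseOr();
        if (p._pos != tokens.Count)
            throw new FormatException($"Unexpected token '{tokens[p._pos]}'");
        return result;
    }

    private string Peek() => _pos < _tokens.Count ? _tokens[_pos] : null;

    // or := and ("||" and)*
    private bool ParseOr()
    {
        bool value = ParseAnd();
        while (Peek() == "||")
        {
            _pos++;
            bool rhs = ParseAnd(); // must be parsed, so it is always evaluated
            value = value || rhs;
        }
        return value;
    }

    // and := primary ("&&" primary)*
    private bool ParseAnd()
    {
        bool value = ParsePrimary();
        while (Peek() == "&&")
        {
            _pos++;
            bool rhs = ParsePrimary(); // must be parsed, so it is always evaluated
            value = value && rhs;
        }
        return value;
    }

    // primary := NAME | "(" or ")"
    private bool ParsePrimary()
    {
        string tok = Peek();
        if (tok == null)
            throw new FormatException("Unexpected end of input");
        _pos++;
        if (tok == "(")
        {
            bool inner = ParseOr();
            if (Peek() != ")")
                throw new FormatException("Expected ')'");
            _pos++;
            return inner;
        }
        if (!_funcs.TryGetValue(tok, out Func<bool> f))
            throw new FormatException($"Unknown name '{tok}'");
        return f();
    }
}
```

This is shorter than building and walking a tree, but the token list has to be re-parsed for every evaluation, which is the trade-off the paragraph above describes.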

If you are planning on running massively complex queries over a huge set of data, you may even want to go one step further and actually convert the generated expression down to C# code that you would then compile and load into the running program. I've seen examples in Java where this was used to great effect, but that was for a very high-performance application, and it is probably overkill unless you've exhausted all other options.
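Generating and compiling C# source text is fairly involved. A related technique that is swapped in here for brevity is to build a .NET expression tree with `System.Linq.Expressions` and compile it to a delegate at run time. The sketch below assumes the input is already tokenized and that operands come from a name-to-function table; the class name `QueryCompiler` and its helper methods are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class QueryCompiler
{
    // Parses the tokens once and returns a compiled delegate that can
    // be invoked many times without re-parsing.
    public static Func<bool> Compile(IReadOnlyList<string> tokens,
                                     IReadOnlyDictionary<string, Func<bool>> funcs)
    {
        int pos = 0;
        Expression body = ParseOr(tokens, funcs, ref pos);
        if (pos != tokens.Count)
            throw new FormatException($"Unexpected token '{tokens[pos]}'");
        return Expression.Lambda<Func<bool>>(body).Compile();
    }

    // or := and ("||" and)*
    private static Expression ParseOr(IReadOnlyList<string> tokens,
        IReadOnlyDictionary<string, Func<bool>> funcs, ref int pos)
    {
        Expression left = ParseAnd(tokens, funcs, ref pos);
        while (pos < tokens.Count && tokens[pos] == "||")
        {
            pos++;
            // OrElse is the short-circuiting || node.
            left = Expression.OrElse(left, ParseAnd(tokens, funcs, ref pos));
        }
        return left;
    }

    // and := primary ("&&" primary)*
    private static Expression ParseAnd(IReadOnlyList<string> tokens,
        IReadOnlyDictionary<string, Func<bool>> funcs, ref int pos)
    {
        Expression left = ParsePrimary(tokens, funcs, ref pos);
        while (pos < tokens.Count && tokens[pos] == "&&")
        {
            pos++;
            // AndAlso is the short-circuiting && node.
            left = Expression.AndAlso(left, ParsePrimary(tokens, funcs, ref pos));
        }
        return left;
    }

    // primary := NAME | "(" or ")"
    private static Expression ParsePrimary(IReadOnlyList<string> tokens,
        IReadOnlyDictionary<string, Func<bool>> funcs, ref int pos)
    {
        if (pos >= tokens.Count)
            throw new FormatException("Unexpected end of input");
        string tok = tokens[pos++];
        if (tok == "(")
        {
            Expression inner = ParseOr(tokens, funcs, ref pos);
            if (pos >= tokens.Count || tokens[pos] != ")")
                throw new FormatException("Expected ')'");
            pos++;
            return inner;
        }
        if (!funcs.TryGetValue(tok, out Func<bool> f))
            throw new FormatException($"Unknown name '{tok}'");
        // Embed the delegate as a constant and emit a call to it.
        return Expression.Invoke(Expression.Constant(f));
    }
}
```

The returned `Func<bool>` captures the delegates that were in the table at compile time, so it suits the case where the expression is fixed and the underlying functions read from whatever row is current. Whether this is actually faster than walking an AST for a given workload would need to be measured.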

Hope this helps!

沙沙粒小 2024-12-09 00:25:50

OK, here is the solution I chose.

I use the following CodeProject article:

http://www.codeproject.com/KB/dotnet/Expr.aspx

I get a list of signs and rule IDs,
for example: ArgsList = List<string> = {"0", "&&", "5"} // (0 && 5)

    int id;
    var tmp = new List<string>();
    foreach (string arg in ArgsList)
    {
        if (arg != "&&" && arg != "||" && arg != ")" && arg != "(")
        {
            // Anything that is not an operator or a parenthesis must be a rule ID.
            if (!int.TryParse(arg, out id))
            {
                return false;
            }
            // Run the rule and substitute its result ("True" or "False") for the ID.
            tmp.Add(GetRuleById(id, ref errorString).Check(wwObject, ref errorString).ToString());
        }
        else
        {
            // Operators and parentheses are passed through unchanged.
            tmp.Add(arg);
        }
    }

    // tmp is now, for example, {"True", "&&", "False"}.
    // Lowercase it so that "True"/"False" become the C# literals true/false.
    string stringtoeval = string.Join(string.Empty, tmp.ToArray()).ToLower(); // "true&&false"
    return (bool)EvalCSCode.EvalCSCode.Eval(stringtoeval); // returns false
