用于数学解析的基于堆栈的表达式评估的效率

发布于 2024-08-21 04:49:46 字数 318 浏览 8 评论 0原文

出于学术目的,我必须编写一个应用程序来绘制用户输入表达式,例如: f(x) = 1 - exp(3^(5*ln(cosx)) + x)

我选择编写的方法parser 的作用是通过 Shunting-Yard 算法转换 RPN 中的表达式,将“cos”等原语函数视为一元运算符。这意味着上面编写的函数将被转换为一系列标记,例如:

1, x, cos, ln, 5, *,3, ^, exp, -

问题是要绘制函数,我必须对其进行多次评估,因此对每个输入值应用堆栈评估算法将非常低​​效。 我该如何解决这个问题?我必须忘记 RPN 的想法吗?

I have to write, for academic purposes, an application that plots user-input expressions like: f(x) = 1 - exp(3^(5*ln(cosx)) + x)

The approach I've chosen to write the parser is to convert the expression in RPN with the Shunting-Yard algorithm, treating primitive functions like "cos" as unary operators. This means the function written above would be converted in a series of tokens like:

1, x, cos, ln, 5, *,3, ^, exp, -

The problem is that to plot the function I have to evaluate it LOTS of times, so applying the stack evaluation algorithm for each input value would be very inefficient.
How can I solve this? Do I have to forget the RPN idea?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(9

萝莉病 2024-08-28 04:49:46

“很多次”是多少?一百万?

可以输入什么样的函数?我们可以假设它们是连续的吗?

您是否尝试过衡量代码的性能?

(抱歉,从问题开始!)

您可以尝试下面简要描述的两种方法之一(或两者)(可能还有更多):

1)解析树。

您可以创建一个解析树。然后执行大多数编译器所做的操作来优化表达式、常量折叠、公共子表达式消除(您可以通过将公共表达式子树链接在一起并缓存结果来实现)等。

然后您可以使用惰性求值技术来避免整个子树。例如,如果您有一棵树

    *
   / \
  A   B

,其中 A 的计算结果为 0,您可以完全避免计算 B,因为您知道结果为 0。使用 RPN,您将失去惰性计算的能力。

2) 插值

假设您的函数是连续的,您可以使用多项式插值< /a>.这样,您可以对函数进行几次复杂的计算(基于您选择的多项式的次数),然后在其余时间进行快速多项式计算。

要创建初始数据集,您可以仅使用方法 1 或仅坚持使用 RPN,因为您只会生成几个值。

因此,如果您使用插值,您可以保留您的 RPN...

希望有帮助!

How much is "LOTS of times"? A million?

What kind of functions could be input? Can we assume they are continuous?

Did you try measuring how well your code performs?

(Sorry, started off with questions!)

You could try one of the two approaches (or both) described briefly below (there are probably many more):

1) Parse Trees.

You could create a Parse Tree. Then do what most compilers do to optimize expressions, constant folding, common subexpression elimination (which you could achieve by linking together the common expression subtrees and caching the result), etc.

Then you could use lazy evaluation techniques to avoid whole subtrees. For instance if you have a tree

    *
   / \
  A   B

where A evaluates to 0, you could completely avoid evaluating B as you know the result is 0. With RPN you would lose out on the lazy evaluation.

2) Interpolation

Assuming your function is continuous, you could approximate your function to a high degree of accuracy using Polynomial Interpolation. This way you can do the complicated calculation of the function a few times (based on the degree of polynomial you choose), and then do fast polynomial calculations for the rest of the time.

To create the initial set of data, you could just use approach 1 or just stick to using your RPN, as you would only be generating a few values.

So if you use Interpolation, you could keep your RPN...

Hope that helps!

天煞孤星 2024-08-28 04:49:46

为什么要重新发明轮子?请改用快速脚本语言。
将 lua 之类的东西集成到你的代码中将花费很少的时间并且速度非常快。

您通常能够对表达式进行字节编译,这应该会导致代码运行得非常快,对于简单的一维图来说肯定足够快。

我推荐 lua,因为它速度快,并且比任何其他脚本语言更容易与 C/C++ 集成。另一个不错的选择是 python,但虽然它更为人所知,但我发现集成起来比较困难。

Why reinvent the wheel? Use a fast scripting language instead.
Integrating something like lua into your code will take very little time and be very fast.

You'll usually be able byte compile your expression, and that should result in code that runs very fast, certainly fast enough for simple 1D graphs.

I recommend lua as its fast, and integrates with C/C++ easier than any other scripting language. Another good options would be python, but while its better known I found it trickier to integrate.

旧情勿念 2024-08-28 04:49:46

为什么不保留解析树(我宽松地使用“树”,在您的情况下它是一个操作序列),并相应地标记输入变量? (例如,对于输入 x、y、z 等。用 0 注释“x”以表示第一个输入变量,用 1 注释“y”以表示第二个输入变量等)

这样您就可以解析表达式一次,保留解析树,接收输入数组,并应用解析树进行评估。

如果您担心评估步骤(相对于解析步骤)的性能方面,我认为除非您进行向量化(立即将解析树应用于输入向量),否则您不会做得更好或将操作硬编码为固定函数。

Why not keep around a parse tree (I use "tree" loosely, in your case it's a sequence of operations), and mark input variables accordingly? (e.g. for inputs x, y, z, etc. annotate "x" with 0 to signify the first input variable, "y" with 1 to signify the 2nd input variable, etc.)

That way you can parse the expression once, keep the parse tree, take in an array of inputs, and apply the parse tree to evaluate.

If you're worrying about the performance aspects of the evaluation step (vs. the parsing step), I don't think you'd do much better unless you get into vectorizing (applying your parse tree on a vector of inputs at once) or hard-coding the operations into a fixed function.

仙女山的月亮 2024-08-28 04:49:46

我所做的是使用分流算法来生成 RPN。然后,我将 RPN“编译”为标记化形式,可以重复执行(解释性),而无需重新解析表达式。

What I do is use the shunting algorithm to produce the RPN. I then "compile" the RPN into a tokenised form that can be executed (interpretively) repeatedly without re-parsing the expression.

因为看清所以看轻 2024-08-28 04:49:46

Michael Anderson 建议使用 Lua。如果你想尝试 Lua 来完成这个任务,请参阅我的 ae 库。

Michael Anderson suggested Lua. If you want to try Lua for just this task, see my ae library.

橘寄 2024-08-28 04:49:46

低效是指什么意义上的低效?有机器时间和程序员时间。对于特定复杂程度的运行速度是否有一个标准?完成作业并继续下一个作业是否更重要(完美主义者有时永远不会完成)?

对于每个输入值都必须执行所有这些步骤。是的,您可以使用启发式扫描操作列表并对其进行一些清理。是的,您可以将其中一些编译为汇编,而不是调用 +、* 等作为高级函数。您可以将向量化(使用值向量执行所有 +,然后执行所有 * 等)与一次针对一个值执行整个过程进行比较。但你需要吗?

我的意思是,如果您在 gnuplot 或 Mathematica 中绘制函数,您认为会发生什么?

Inefficient in what sense? There's machine time and programmer time. Is there a standard for how fast it needs to run with a particular level of complexity? Is it more important to finish the assignment and move on to the next one (perfectionists sometimes never finish)?

All those steps have to happen for each input value. Yes, you could have a heuristic that scans the list of operations and cleans it up a bit. Yes, you could compile some of it down to assembly instead of calling +, * etc. as high level functions. You can compare vectorization (doing all the +'s then all the *'s etc, with a vector of values) to doing the whole procedure for one value at a time. But do you need to?

I mean, what do you think happens if you plot a function in gnuplot or Mathematica?

_畞蕅 2024-08-28 04:49:46

您对 RPN 的简单解释应该可以正常工作,特别是因为它包含

  • 数学库函数,例如 cosexp^(pow ,涉及对数)

  • 符号表查找

希望您的符号表(其中包含像 x 这样的变量)简短而简单。

库函数很可能是最耗时的,所以除非你的解释器写得不好,否则这不会是一个问题。

但是,如果您确实需要速度,则可以将表达式转换为 C 代码,即时编译并将其链接到 dll 中,然后加载它(大约需要一秒钟)。再加上数学函数的记忆版本,可以为您带来最佳性能。

PS 对于解析,您的语法非常普通,因此一个简单的递归下降解析器(大约一页代码,O(n) 与分流码相同)应该可以正常工作。事实上,您可能只能在解析时计算结果(如果数学函数占用了大部分时间),而不必担心解析树、RPN 等任何东西。

Your simple interpretation of RPN should work just fine, especially since it contains

  • math library functions like cos, exp, and ^(pow, involving logs)

  • symbol table lookup

Hopefully, your symbol table (with variables like x in it) will be short and simple.

The library functions will most likely be your biggest time-takers, so unless your interpreter is poorly written, it will not be a problem.

If, however, you really gotta go for speed, you could translate the expression into C code, compile and link it into a dll on-the-fly and load it (takes about a second). That, plus memoized versions of the math functions, could give you the best performance.

P.S. For parsing, your syntax is pretty vanilla, so a simple recursive-descent parser (about a page of code, O(n) same as shunting-yard) should work just fine. In fact, you might just be able to compute the result as you parse (if math functions are taking most of the time), and not bother with parse trees, RPN, any of that stuff.

心是晴朗的。 2024-08-28 04:49:46

我认为这个基于 RPN 的库可以达到目的: http://expressionoasis.vedantatree.com/

我用过它与我的一个计算器项目一起使用,效果很好。它小而简单,但可扩展。

I think this RPN based library can serve the purpose: http://expressionoasis.vedantatree.com/

I used it with one of my calculator project and it works well. It is small and simple, but extensible.

随心而道 2024-08-28 04:49:46

一种优化是将堆栈替换为值数组,并将求值器实现为三地址机器< /a> 其中每个操作从两个(或一个)位置加载并保存到第三个位置。这可以使代码非常紧凑:

struct Op {
  enum {
    add, sub, mul, div,
    cos, sin, tan,
   //....
  } op;
  int a, b, d;
}

void go(Op* ops, int n, float* v) {
  for(int i = 0; i < n; i++) {
    switch(ops[i].op) {
      case add: v[op[i].d] = v[op[i].a] + v[op[i].b]; break;
      case sub: v[op[i].d] = v[op[i].a] - v[op[i].b]; break;
      case mul: v[op[i].d] = v[op[i].a] * v[op[i].b]; break;
      case div: v[op[i].d] = v[op[i].a] / v[op[i].b]; break;
      //...
    }
  }
}

从 RPN 到 3 地址的转换应该很容易,因为 3 地址是一种概括。

One optimization would be to replace the stack with an array of values and implement the evaluator as a three address mechine where each operation loads from two (or one) location and saves to a third. This can make for very tight code:

struct Op {
  enum {
    add, sub, mul, div,
    cos, sin, tan,
   //....
  } op;
  int a, b, d;
}

void go(Op* ops, int n, float* v) {
  for(int i = 0; i < n; i++) {
    switch(ops[i].op) {
      case add: v[op[i].d] = v[op[i].a] + v[op[i].b]; break;
      case sub: v[op[i].d] = v[op[i].a] - v[op[i].b]; break;
      case mul: v[op[i].d] = v[op[i].a] * v[op[i].b]; break;
      case div: v[op[i].d] = v[op[i].a] / v[op[i].b]; break;
      //...
    }
  }
}

The conversion from RPN to 3-address should be easy as 3-address is a generalization.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文