Partial evaluation with pyparsing

Posted 2024-08-15 14:36:13

I need to be able to take a formula that uses the OpenDocument formula syntax, parse it into syntax that Python can understand without evaluating the variables, and then evaluate the formula many times with changing values for the variables.
Formulas can be user input, so pyparsing lets me both handle the formula syntax effectively and clean user input. There are a number of good pyparsing examples available, but all the mathematical ones seem to assume that one evaluates everything in the current scope immediately.

For context, I am working with a model of the industrial economy (life cycle assessment, or LCA), where these formulas represent the amount of material or energy exchanged between processes. The variable amount can be a function of several parameters, such as geographical location. The chain of formula and variable references is stored in a directed acyclic graph, so that formulas can always be simply evaluated. Formulas are stored as strings in a database.
My questions are:

  1. Is it possible to parse a formula such that the parsed evaluation can also be stored in the database (as a string to be evaled, or something else)?
  2. Are there alternatives to this approach? Bear in mind that the ideal solution is to parse/write once, and read many times. For example, partially parsing the formula, and then using the ast module, although I don't know how this could work with database storage.
  3. Any examples of a project or library similar to this that I could look over? I am not a programmer, just a student trying to finish his thesis while making an open-source LCA software model in my spare time.
  4. Is this approach too slow? I would like to be able to do substantial Monte Carlo runs, where each run could involve tens of thousands of formula evaluations (it is a big database).

Comments (1)

情深已缘浅 2024-08-22 14:36:13

1) Yes, it is possible to pickle the results from parsing your expression, and save that to a database. Then you can just fetch and unpickle the expression, rather than reparse the original again.
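A minimal sketch of that round trip, under some loud assumptions: it uses the standard-library ast module as a stand-in for pyparsing results (the parsed objects differ, but the pickle-and-store idea is the same), an in-memory sqlite3 database, and a hypothetical formulas table. The names store_formula and load_formula are made up for illustration.

```python
import ast
import pickle
import sqlite3

# Hypothetical schema: one row per formula, with the parsed form as a BLOB.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE formulas (id INTEGER PRIMARY KEY, source TEXT, parsed BLOB)"
)

def store_formula(formula_id, source):
    """Parse once and save the pickled syntax tree next to the source text."""
    tree = ast.parse(source, mode="eval")
    conn.execute(
        "INSERT INTO formulas VALUES (?, ?, ?)",
        (formula_id, source, pickle.dumps(tree)),
    )

def load_formula(formula_id):
    """Fetch and unpickle the tree, then compile it into a reusable code object."""
    row = conn.execute(
        "SELECT parsed FROM formulas WHERE id = ?", (formula_id,)
    ).fetchone()
    tree = ast.fix_missing_locations(pickle.loads(row[0]))
    return compile(tree, "<formula>", "eval")

store_formula(1, "m*x+b")
code = load_formula(1)

# Variables are supplied at evaluation time, not at parse time.
result = eval(code, {"__builtins__": {}}, {"m": 100, "x": 5, "b": 1})
print(result)  # 501
```

Unpickling runs code chosen by whoever wrote the bytes, so this only makes sense when the database itself is trusted.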

2) You can do a quick-and-dirty pass at this just using the compile and eval built-ins, as in the following interactive session:

>>> y = compile("m*x+b","","eval")
>>> m = 100
>>> x = 5
>>> b = 1
>>> eval(y)
501

Of course, this has the security pitfalls of any eval- or exec-based implementation, in that untrusted or malicious source strings can embed harmful system calls. But if this is your thesis and entirely within your scope of control, just don't do anything foolish.
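If the formulas do come from users, one way to narrow that risk is to check the syntax tree against a whitelist before compiling it. The following is only a sketch, not a complete sandbox: the set of allowed nodes is an assumption (plain arithmetic on names and numbers, with exponentiation left out on purpose), and compile_formula and evaluate are hypothetical helper names.

```python
import ast

# Assumed whitelist: arithmetic on variable names and numeric constants only.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Name, ast.Load, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub, ast.UAdd,
)

def compile_formula(source):
    """Parse, reject anything outside the whitelist, and compile for reuse."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        if isinstance(node, ast.Constant) and not isinstance(node.value, (int, float)):
            raise ValueError("only numeric constants are allowed")
    return compile(tree, "<formula>", "eval")

def evaluate(code, variables):
    """Evaluate with an explicit namespace instead of the caller's scope."""
    return eval(code, {"__builtins__": {}}, dict(variables))

code = compile_formula("m*x+b")
print(evaluate(code, {"m": 100, "x": 5, "b": 1}))  # 501

try:
    compile_formula("__import__('os').system('ls')")
except ValueError as exc:
    print("rejected:", exc)
```

Passing the variables as a dictionary also keeps each evaluation independent of whatever names happen to exist in the calling module.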

3) You can get an online example of parsing an expression into an "evaluatable" data structure at the pyparsing wiki's Examples page. Check out simpleBool.py and evalArith.py especially. If you're feeling flush, order the May 2008 back issue of Python magazine, which has my article "Writing a Simple Interpreter/Compiler with Pyparsing" with a more detailed description of the methods used, plus a description of how pickling and unpickling the parsed results works.
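To give a rough idea of what an "evaluatable" data structure looks like, here is a small sketch. It is not the code from simpleBool.py or evalArith.py, and it swaps pyparsing for a hand-written recursive-descent parser so that it runs with the standard library alone. The class names Num, Var and BinOp are made up, and only +, -, *, / and parentheses are handled (no unary minus, no functions).

```python
import operator
import re

class Num:
    def __init__(self, value):
        self.value = value
    def eval(self, env):
        return self.value

class Var:
    def __init__(self, name):
        self.name = name
    def eval(self, env):
        return env[self.name]  # looked up at evaluation time, not parse time

class BinOp:
    OPS = {"+": operator.add, "-": operator.sub,
           "*": operator.mul, "/": operator.truediv}
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right
    def eval(self, env):
        return self.OPS[self.op](self.left.eval(env), self.right.eval(env))

TOKEN = re.compile(r"\s*(?:(\d+\.?\d*)|([A-Za-z_]\w*)|(\S))")

def tokenize(text):
    tokens = []
    for number, name, symbol in TOKEN.findall(text):
        if number:
            tokens.append(("num", number))
        elif name:
            tokens.append(("name", name))
        else:
            tokens.append(("op", symbol))
    return tokens

def parse(text):
    """Build a tree of Num/Var/BinOp nodes; nothing is evaluated here."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def atom():
        if peek() is None:
            raise ValueError("unexpected end of formula")
        kind, value = advance()
        if kind == "num":
            return Num(float(value))
        if kind == "name":
            return Var(value)
        if value == "(":
            node = expr()
            if peek() != ("op", ")"):
                raise ValueError("missing closing parenthesis")
            advance()
            return node
        raise ValueError(f"unexpected token: {value!r}")

    def term():
        node = atom()
        while peek() in (("op", "*"), ("op", "/")):
            _, op = advance()
            node = BinOp(op, node, atom())
        return node

    def expr():
        node = term()
        while peek() in (("op", "+"), ("op", "-")):
            _, op = advance()
            node = BinOp(op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()[1]!r}")
    return tree

tree = parse("m*x+b")                          # parse once
print(tree.eval({"m": 100, "x": 5, "b": 1}))   # 501
print(tree.eval({"m": 2, "x": 3, "b": 4}))     # 10
```

Because the tree is built from ordinary module-level classes, it can also be pickled and stored, as long as the same classes are importable when it is loaded again.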

4) The slow part will be the parsing, so you are on the right track in preserving these results in some intermediate and repeatably-evaluatable form. The eval part should be fairly snappy. The second slow part will be in fetching these pickled structures from your database. During your MC run, I would package a single function that takes the selection parameters for an expression, fetches from the database, and unpickles and returns the evaluatable expression. Then once you have this working, use a memoize decorator to cache these query-results pairs, so that any given expression only needs to be fetched/unpickled once.
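A sketch of that caching idea, assuming functools.lru_cache as the memoize decorator, a hypothetical formulas table in an in-memory sqlite3 database, and pickled ast trees standing in for pickled pyparsing results. The helper names get_formula and monte_carlo, and the distribution used for x, are illustrative only.

```python
import ast
import functools
import pickle
import random
import sqlite3

# Hypothetical storage: formula id -> pickled parse tree.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE formulas (id INTEGER PRIMARY KEY, parsed BLOB)")
conn.execute(
    "INSERT INTO formulas VALUES (?, ?)",
    (1, pickle.dumps(ast.parse("m*x+b", mode="eval"))),
)

db_fetches = 0  # counts real database round trips, for illustration

@functools.lru_cache(maxsize=None)
def get_formula(formula_id):
    """Fetch, unpickle and compile a formula; cached after the first call."""
    global db_fetches
    db_fetches += 1
    row = conn.execute(
        "SELECT parsed FROM formulas WHERE id = ?", (formula_id,)
    ).fetchone()
    tree = ast.fix_missing_locations(pickle.loads(row[0]))
    return compile(tree, "<formula>", "eval")

def monte_carlo(formula_id, runs, seed=0):
    """Evaluate the same formula many times with a randomly drawn x."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        variables = {"m": 100, "x": rng.gauss(5, 1), "b": 1}
        code = get_formula(formula_id)  # cache hit after the first run
        results.append(eval(code, {"__builtins__": {}}, variables))
    return results

samples = monte_carlo(1, 1000)
print(len(samples), db_fetches)  # 1000 1
```

With the cache in place, each run costs only a dictionary lookup plus the eval call, so timing a loop like this on real formulas is a quick way to judge whether the approach is fast enough.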

Good luck with your thesis!
