计算公式结果的最佳方法?

发布于 2024-07-13 07:38:55 字数 265 浏览 6 评论 0原文

我目前有一个应用程序,其中可以包含数百个用户定义的公式。 目前,我使用反向波兰表示法来执行计算(将值和变量推入堆栈,然后将它们从堆栈中弹出并进行评估)。 开始并行化此过程的最佳方式是什么? 我应该考虑函数式语言吗?

计算是在数字数组上执行的,因此例如一个简单的 A+B 实际上可能意味着 100 次加法。 我目前正在使用 Delphi,但这不是以后的要求。 我将使用最适合该工作的工具。 公式也可能相互依赖,例如,我们可能有一个公式 C=A+B 和第二个公式 D=C+A。

I currently have an application which can contain 100s of user defined formulae. Currently, I use reverse polish notation to perform the calculations (pushing values and variables on to a stack, then popping them off the stack and evaluating). What would be the best way to start parallelizing this process? Should I be looking at a functional language?

The calculations are performed on arrays of numbers so for example a simple A+B could actually mean 100s of additions. I'm currently using Delphi, but this is not a requirement going forward. I'll use the tool most suited to the job. Formulae may also be dependent on each other So we may have one formula C=A+B and a second one D=C+A for example.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

琉璃梦幻 2024-07-20 07:38:55

让我们假设您的公式(方程)不是循环的,否则您不能“​​仅仅”评估它们。 如果您有像 A = B + C 这样的矢量化方程,其中 A、B 和 C 是数组,那么我们在概念上将它们拆分为分量上的方程,这样,如果数组大小为 5,则该方程将拆分为

a1 = b1 + c1
a2 = b2 + c2
...
a5 = b5 + c5

现在假设这一点,您有一组关于简单量(无论是整数、有理数还是其他量)的方程。

如果你有两个方程 E 和 F,假设 F 取决于 E,如果 F 的右侧提到了 E 的左侧,例如

E: a = b + c
F: q = 2*a + y

现在要了解如何计算它,您始终可以使用随机迭代来解决这个问题(这只是解释中的一个中间步骤),遵循这个算法:

1 while (there is at least one equation which has not been computed yet)
2   select one such pending equation E so that:
3     for every equation D such that E depends_on D:
4       D has been already computed
5   calculate the left-hand side of E

无论您如何在 // 2 行上做出选择,这个过程都会以正确的答案终止。现在最酷的是它也很容易并行化。 您可以在任意数量的线程中运行它! 您需要的是一个并发安全队列,其中保存那些先决条件(方程所依赖的方程)已计算但尚未计算的方程。 每个线程一次从该队列中弹出(线程安全)一个方程,计算答案,然后检查现在是否有新方程,以便计算出它们的所有先决条件,然后添加这些方程(线程安全)到工作队列。 完毕。

Let's assume your formulae (equations) are not cyclic, as otherwise you cannot "just" evaluate them. If you have vectorized equations like A = B + C where A, B and C are arrays, let's conceptually split them into equations on the components, so that if the array size is 5, this equation is split into

a1 = b1 + c1
a2 = b2 + c2
...
a5 = b5 + c5

Now assuming this, you have a large set of equations on simple quantities (whether integer, rational or something else).

If you have two equations E and F, let's say that F depends_on E if the right-hand side of F mentions the left-hand side of E, for example

E: a = b + c
F: q = 2*a + y

Now to get towards how to calculate this, you could always use randomized iteration to solve this (this is just an intermediate step in the explanation), following this algorithm:

1 while (there is at least one equation which has not been computed yet)
2   select one such pending equation E so that:
3     for every equation D such that E depends_on D:
4       D has been already computed
5   calculate the left-hand side of E

This process terminates with the correct answer regardless on how you make your selections on line // 2. Now the cool thing is that it also parallelizes easily. You can run it in an arbitrary number of threads! What you need is a concurrency-safe queue which holds those equations whose prerequisites (those the equations depend on) have been computed but which have not been computed themselves yet. Every thread pops out (thread-safely) one equation from this queue at a time, calculates the answer, and then checks if there are now new equations so that all their prerequisites have been computed, and then adds those equations (thread-safely) to the work queue. Done.

错々过的事 2024-07-20 07:38:55

在不了解更多信息的情况下,我建议如果可能的话采用 SIMD 风格的方法。 也就是说,创建线程来计算单个数据集的所有公式。 尝试划分公式的计算以并行化它们不会产生太大的速度提升,因为能够将计算划分为适合线程的离散单元所需的逻辑将很难编写并且更难以正确执行,开销将被取消消除任何速度增益。 它还将很快因收益递减而遭受损失。

现在,如果您拥有一组应用于许多数据集的公式,那么并行化就会变得更容易并且可以更好地扩展。 每个线程对一组数据执行所有计算。 每个 CPU 核心创建一个线程,并设置其与每个核心的关联性。 每个线程实例化公式评估代码的一个实例。 创建一个加载单个数据集并向其传递空闲线程的主管。 如果没有空闲线程,则等待第一个线程完成其数据处理。 当所有数据集处理完毕并且所有线程都完成后,然后退出。 使用此方法时,线程数多于 CPU 上的内核数并没有任何优势,因为线程切换速度很慢,并且会对整体速度产生负面影响。

如果您只有一个数据集,那么这不是一项简单的任务。 它将需要解析评估树以获得不依赖于其他分支的分支,并将这些分支划分为在每个核心上运行并等待结果的单独线程。 然后,您会遇到同步数据和确保数据一致性的问题。

Without knowing more, I would suggest taking a SIMD style approach if possible. That is, create threads to compute all formulas for a single data set. Trying to divide the computation of formulas to parallelise them wouldn't yield much speed improvement as the logic required to be able to split up the computations into discrete units suitable for threading would be hard to write and harder to get right, the overhead would cancel out any speed gains. It would also suffer quickly from diminishing returns.

Now, if you've got a set of formulas that are applied to many sets of data then the parallelisation becomes easier and would scale better. Each thread does all computations for one set of data. Create one thread per CPU core and set its affinity to each core. Each thread instantiates one instance of the formula evaluation code. Create a supervisor which loads a single data set and passes it an idle thread. If no threads are idle, wait for the first thread to finish processing its data. When all data sets are processed and all threads have finished, then exit. Using this method, there's no advantage to having more threads than there are cores on the CPU as thread switching is slow and will have a negative effect on overall speed.

If you've only got one data set then it is not a trivial task. It would require parsing the evaluation tree for branches without dependencies on other branches and farming those branches to separate threads running on each core and waiting for the results. You then get problems synchronizing the data and ensuring data coherency.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文