Advice on starting a large multi-threaded programming project

Posted on 2024-08-14 21:58:43

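My company currently runs a third-party simulation program (natural catastrophe risk modelling) that reads gigabytes of data from disk and then crunches for several days to produce results. I will soon be asked to rewrite it as a multi-threaded application so that it runs in hours instead of days. I expect to have about 6 months to complete the conversion and will be working alone.

We have a 24-processor box to run this on. I will have access to the source code of the original program (written in C++, I think), but at this point I know very little about how it is designed.

I need advice on how to tackle this. I am an experienced programmer (about 30 years, currently working in C# 3.5) but have no multi-processor or multi-threaded experience. I am willing and eager to learn a new language if appropriate. I am looking for recommendations on languages, learning resources, books, architectural guidelines, etc.

Requirements: Windows OS. A commercial-grade compiler with plenty of support and good learning resources. No need for a fancy GUI - it will probably run from a config file and put its results into a SQL Server database.

Edit: The current application is C++, but I will almost certainly not be using that language for the rewrite. I removed the C++ tag that someone added.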

清引 2024-08-21 21:58:43

Numerical process simulations are typically run over a single discretised problem grid (for example, the surface of the Earth or clouds of gas and dust), which usually rules out simple task farming or concurrency approaches. This is because a grid divided over a set of processors representing an area of physical space is not a set of independent tasks. The grid cells at the edge of each subgrid need to be updated based on the values of grid cells stored on other processors, which are adjacent in logical space.
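To make the edge-cell dependency concrete, here is a minimal sketch in Python (chosen only because it is compact, not as a recommendation for the rewrite). Everything in it is an assumption made up for illustration: the three-point averaging rule, the fixed boundary value, and the names `step`, `run_single` and `run_decomposed`. The "processors" are just list slices inside one process; under MPI, the two lines that read the neighbours' edge cells would become message exchanges.

```python
def step(cells, left_ghost, right_ghost):
    """One smoothing step: each cell becomes the mean of itself and its two neighbours.

    left_ghost / right_ghost stand in for the cells just outside this (sub)grid.
    """
    padded = [left_ghost] + cells + [right_ghost]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
            for i in range(1, len(padded) - 1)]


def run_single(grid, steps, boundary=0.0):
    """Reference run on one undivided grid."""
    for _ in range(steps):
        grid = step(grid, boundary, boundary)
    return grid


def run_decomposed(grid, parts, steps, boundary=0.0):
    """Same computation with the grid split into `parts` contiguous subgrids.

    Assumes len(grid) is divisible by parts (hypothetical simplification).
    """
    size = len(grid) // parts
    subgrids = [grid[p * size:(p + 1) * size] for p in range(parts)]
    for _ in range(steps):
        # Halo exchange: before updating, every subgrid needs the edge cell
        # of each neighbouring subgrid. This is the inter-processor traffic.
        lefts = [boundary] + [s[-1] for s in subgrids[:-1]]
        rights = [s[0] for s in subgrids[1:]] + [boundary]
        subgrids = [step(s, l, r) for s, l, r in zip(subgrids, lefts, rights)]
    return [x for s in subgrids for x in s]


if __name__ == "__main__":
    grid = [float(i % 5) for i in range(12)]
    print(run_decomposed(grid, parts=3, steps=10) == run_single(grid, steps=10))
```

The decomposed run only matches the single-grid run because the edge values are refreshed on every step. If each subgrid were treated as an independent task, the cells next to each cut would be updated from stale or missing neighbours and the results would drift apart.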

In high-performance computing, simulations are typically parallelised using either MPI or OpenMP. MPI is a message-passing library with bindings for many languages, including C, C++, Fortran, Python, and C#. OpenMP is an API for shared-memory multiprocessing. In general, MPI is more difficult to code than OpenMP and is much more invasive, but it is also much more flexible. OpenMP requires memory shared between processors, so it is not suited to distributed-memory architectures such as clusters. Hybrid schemes are also possible.

This type of programming has its own special challenges. As well as race conditions, deadlocks, livelocks, and all the other joys of concurrent programming, you need to consider the topology of your processor grid - how you choose to split your logical grid across your physical processors. This is important because your parallel speedup is a function of the amount of communication between your processors, which itself is a function of the total edge length of your decomposed grid. As you add more processors, this total edge length (surface area, for a 3D grid) increases, increasing the amount of communication overhead. Splitting the grid ever more finely will eventually become prohibitive.
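As a rough illustration of how the decomposition affects communication, the sketch below counts the total length of the cuts when a square 2D grid is divided into a `px` by `py` arrangement of blocks. The square grid, the block layout, and the function names are assumptions for this example only; a real code would also weigh message counts, latency, and load balance.

```python
def internal_edge_length(n, px, py):
    """Total cut length, in cell edges, when an n x n grid is split into px x py blocks."""
    # (px - 1) cuts in one direction and (py - 1) in the other, each n cells long.
    return (px - 1) * n + (py - 1) * n


def factor_pairs(p):
    """All (px, py) layouts with px * py == p."""
    return [(a, p // a) for a in range(1, p + 1) if p % a == 0]


def best_layout(n, p):
    """Layout with the shortest total cut length (first one found on ties)."""
    return min(factor_pairs(p), key=lambda f: internal_edge_length(n, *f))


if __name__ == "__main__":
    n = 1200  # hypothetical grid size
    for layout in factor_pairs(24):
        print(layout, internal_edge_length(n, *layout))
    print("shortest cuts:", best_layout(n, 24))
```

For 24 processors, long thin strips (1 by 24) give far more cut length than a nearly square 4 by 6 layout, and for any layout the cut length grows as processors are added while the work per processor shrinks.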

The other important consideration is the proportion of the code which can be parallelised. Amdahl's law then dictates the maximum theoretically attainable speedup. You should be able to estimate this before you start writing any code.
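Amdahl's law is simple enough to evaluate directly: if a fraction `p` of the run time can be parallelised over `n` processors, the best possible speedup is `1 / ((1 - p) + p / n)`. The sketch below just evaluates that formula; the parallel fractions in the loop are hypothetical placeholders, since the real figure has to come from profiling the existing program.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Upper bound on speedup when only part of the run time can be parallelised."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


if __name__ == "__main__":
    # Hypothetical parallel fractions; measure the real one before committing to a design.
    for p in (0.50, 0.90, 0.95, 0.99):
        print(f"p={p:.2f}: {amdahl_speedup(p, 24):.1f}x on 24 processors")
```

Even with 95% of the run time parallelisable, 24 processors cannot give more than roughly an 11x speedup, and no number of processors can exceed 20x, because the serial 5% remains.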

Both of these facts will conspire to limit the maximum number of processors you can run on. The sweet spot may be considerably lower than you think.

I recommend the book High Performance Computing, if you can get hold of it. In particular, the chapter on performance benchmarking and tuning is priceless.

An excellent online overview of parallel computing, which covers the major issues, is this introduction from Lawrence Livermore National Laboratory.
