mpi benchmarking

MPI_Wtime() confusion - seems incorrect

Posted on 2025-02-02 20:34:05

I'm using MPI_Wtime() to measure the speed of a parallel application.

Running the application on 4 cores reports 0.000061 (the run actually takes around 30 seconds).

Running on 50 cores, it reports 0.000308 (the run is effectively instantaneous).

Multiplying the workload by 10, still on 50 cores, it reports 0.000752 (the run takes around 2 minutes in real life).

int main(int argc, char* argv[]) {

    ofstream file;
    file.open("primes.txt");
    file.close();

    MPI_Init(&argc, &argv);
    MPI_Status status;

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        t1 = MPI_Wtime();

    int size;
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0)
        Parent parent(size);
    else
        Child child(size, rank);

    if (rank == 0) {
        t2 = MPI_Wtime();
    }

    MPI_Finalize();

    if (rank == 0) 
        printf("Runtime = %f\n", t2 - t1);

}

Parent contains a loop to manage children.

These numbers do not make any sense. What am I doing wrong?

MPI_Wtick() is 1e-9

Comments (1)

梓梦 2025-02-09 20:34:05

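Thanks to @Gilles Gouaillardet and @Victor Eijkhout for their answers.

After making t1 and t2 local variables and adding an MPI_Barrier before each time recording, I was able to get a meaningful answer.

Running the code on 4 cores now gives a result of 20.277840, which sounds right.

Previously, the same test gave a result of 0.000061, which made no sense at all.

Thanks.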
