全对最短路径问题的最快实现？

发布于 2024-11-30 22:22:49 字数 552 浏览 3 评论 0原文

我有一个加权图 30k 个节点 160k 个边，没有负权重。我想计算从所有节点到其他节点的所有最短路径。我认为我不能假设任何特定的启发式方法来简化问题。

我尝试使用此 Dijkstra C 实现 http: //compprog.wordpress.com/2007/12/01/one-source-shortest-path-dijkstras-algorithm/，即针对单个最短路径问题，为我的所有 30 个节点调用函数 dijkstras()。正如你可以想象的那样，这需要很长时间。目前我没有时间自己编写和调试代码，我必须尽快计算这些路径并将它们存储在数据库中，所以我正在寻找另一个更快的解决方案可供使用，你有吗有什么建议吗？

我必须在最近配备 8GB 内存的 MacBook Pro 上运行它，我想找到一个不超过 24 小时即可完成计算的解决方案。

预先非常感谢！

尤金尼奥

原文

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

剩一世无双 2024-12-07 22:22:49

我查看了您在评论中发布的 Dijkstra 算法链接，我相信这是您效率低下的根源。在内部 Dijkstra 循环中，它使用一种极其未经优化的方法来确定下一步要探索的节点（每一步对每个节点进行线性扫描）。有问题的代码有两个地方。第一个是此代码，它尝试找到要操作的下一个节点：

mini = -1;
for (i = 1; i <= n; ++i)
    if (!visited[i] && ((mini == -1) || (d[i] < d[mini])))
         mini = i;

因为此代码嵌套在访问每个节点的循环内，所以复杂性（如链接中所述）为 O(|V|²），其中 |V|是节点数。就你而言，因为 |V|是 30,000，则该循环总共将有九亿次迭代。这非常慢（如您所见），但没有理由必须做这么多工作。

另一个麻烦点在这里，它试图找到图中的哪条边应该用于减少其他节点的成本：

for (i = 1; i <= n; ++i)
   if (dist[mini][i])
       if (d[mini] + dist[mini][i] < d[i])
           d[i] = d[mini] + dist[mini][i];

这会扫描邻接矩阵中的整行，寻找要考虑的节点，这需要时间 O(n) ，无论有多少条传出边离开该节点。

虽然您可以尝试将此版本的 Dijkstra 算法修复为更优化的实现，但我认为正确的选择是丢弃此代码并找到 Dijkstra 算法的更好实现。例如，如果您使用通过二叉堆，你可以得到运行在O(|E| log |V|)中的Dijkstra算法。在您的例子中，该值刚刚超过 200 万，这比您当前的方法快大约 450 倍。这是一个巨大的差异，我愿意打赌，通过更好的 Dijkstra 实现，您最终将在比以前更短的时间内完成代码。

最重要的是，正如 Jacob Eggers 所指出的，您可能需要考虑并行运行所有 Dijkstra 搜索。该凸轮可以为您拥有的每个处理器带来额外的速度提升。结合上述（也是更关键的）修复，这可能会给您带来巨大的性能提升。

如果您计划在更密集的数据集（边数接近 |V|² / log |V|）上运行此算法，那么您可能需要考虑切换到 Floyd-Warshall 算法。每个节点运行一次 Dijkstra 算法（有时称为 Johnson 算法）需要时间 O(|V||E| log | V|) 时间，而使用 Floyd-Warshall 则需要 O(|V|³) 时间。然而，对于您提到的数据集，图形足够稀疏，运行多个 Dijkstra 实例应该没问题。

希望这有帮助！

I looked over the Dijkstra's algorithm link that you posted in the comments and I believe that it's the source of your inefficiency. Inside the inner Dijkstra's loop, it's using an extremely unoptimized approach to determine which node to explore next (a linear scan over every node at each step). The problematic code is in two spots. The first is this code, which tries to find the next node to operate on:

mini = -1;
for (i = 1; i <= n; ++i)
    if (!visited[i] && ((mini == -1) || (d[i] < d[mini])))
         mini = i;

Because this code is nested inside of a loop that visits every node, the complexity (as mentioned in the link) is O(|V|²), where |V| is the number of nodes. In your case, since |V| is 30,000, there will be nine hundred million iterations of this loop overall. This is painfully slow (as you've seen), but there's no reason to have to do this much work.

Another trouble spot is here, which tries to find which edge in the graph should be used to reduce the cost of other nodes:

for (i = 1; i <= n; ++i)
   if (dist[mini][i])
       if (d[mini] + dist[mini][i] < d[i])
           d[i] = d[mini] + dist[mini][i];

This scans over an entire row in the adjacency matrix looking for nodes to consider, which takes time O(n) irrespective of how many outgoing edges leave the node.

While you could try fixing up this version of Dijkstra's into a more optimized implementation, I think the correct option here is just to throw this code away and find a better implementation of Dijkstra's algorithm. For example, if you use the pseudocode from the Wikipedia article implemented with a binary heap, you can get Dijkstra's algorithm running in O(|E| log |V|). In your case, this value is just over two million, which is about 450 times faster than your current approach. That's a huge difference, and I'm willing to bet that with a better Dijkstra's implementation you'll end up getting the code completing in a substantially shorter time than before.

On top of this, you might want to consider running all the Dijkstra searches in parallel, as Jacob Eggers has pointed out. This cam get you an extra speed boost for each processor that you have. Combined with the above (and more critical) fix, this should probably give you a huge performance increase.

If you plan on running this algorithm on a much denser data set (one where the number of edges approaches |V|² / log |V|), then you may want to consider switching to the Floyd-Warshall algorithm. Running Dijkstra's algorithm once per node (sometimes called Johnson's algorithm) takes time O(|V||E| log |V|) time, while using Floyd-Warshall takes O(|V|³) time. However, for the data set you've mentioned, the graph is sufficiently sparse that running multiple Dijkstra's instances should be fine.

Hope this helps!

回复收藏 0 原文

巾帼英雄 2024-12-07 22:22:49

Floyd-Warshall 算法怎么样？

回复收藏 0 原文

ぃ双果 2024-12-07 22:22:49

你的图有什么特殊的结构吗？该图是平面的（或接近平面的）吗？

我建议不要尝试存储所有最短路径，相当密集的编码（30k^2“下一步去哪里”条目）将占用 7 GB 内存。

有什么应用？当您需要找到特定的最短路径时，您确定执行双向 Dijkstra（或 A*，如果您有启发式）不够快吗？

回复收藏 0 原文

童话 2024-12-07 22:22:49

如果您可以将算法修改为多线程，您也许可以在 24 小时内完成它。

第一个节点可能花费超过 1 分钟。但是，第 15,000 个节点应该只需要一半的时间，因为您已经计算了到所有先前节点的最短路径。

回复收藏 0 原文

老娘不死你永远是小三 2024-12-07 22:22:49

瓶颈可能是您使用存储路径的数据结构。如果您使用太多存储空间，您很快就会耗尽缓存和内存空间，从而导致快速算法运行非常缓慢，因为它会获得 100（缓存未命中）或 10000+（交换页）常量乘数。

因为您必须在数据库中存储路径，所以我怀疑这很容易成为瓶颈。最好首先尝试使用非常高效的存储模式生成进入内存的路径，例如每个顶点 N 位，其中 N == 每个顶点的最大边数。然后为每条边设置一个位，可用于生成最短路径之一。生成此路径信息后，您可以运行递归算法，将路径信息生成为适合数据库存储的格式。

当然最有可能的瓶颈仍然是数据库。您需要仔细考虑使用什么格式来存储信息，因为在 SQL 数据库中插入、搜索和修改大型数据集非常慢。如果数据库引擎设法将多个插入操作路由到单个磁盘写入操作，那么使用事务执行数据库操作可能能够减少磁盘写入开销。

最好将结果简单地存储在内存缓存中，并在不再需要解决方案时丢弃它们。如果您碰巧再次需要它们，则再次生成相同的结果。这意味着您只会在实际需要时按需生成路径。对于 Dijkstra 的单次最短路径运行，30k 节点和 160k 边的运行时间应明显低于一秒。

对于最短路径算法我总是选择C++。 C 实现也不应该简单，但 C++ 提供了 STL 容器的减少编码，可以在初始实现中使用这些容器，并且只有在基准测试和分析表明需要某些东西时才在以后实现优化的队列算法比 STL 提供的更好。

#include <queue>
#include "vertex.h"

class vertex;
class edge;

class searchnode {
    vertex *dst;
    unsigned long dist;
    public:
    searchnode(vertex *destination, unsigned long distance) :
        dst(dst),
        dist(distance)
    {
    }

    bool operator<(const searchnode &b) const {
        /* std::priority_queue stores largest value at top */
        return dist > b.dist;
    }

    vertex *dst() const { return dst; }

    unsigned long travelDistance() const { return dist; }
};


static void dijkstra(vertex *src, vertex *dst)
{
    std::priority_queue<searchnode> queue;

    searchnode start(src, 0);
    queue.push(start);

    while (!queue.empty()) {
        searchnode cur = queue.top();
        queue.pop();

        if (cur.travelDistance() >= cur.dst()->distance())
            continue;

        cur.dst()->setDistance(cur.travelDistance());
        edge *eiter;
        for (eiter = cur.dst()->begin(); eiter != cur.dst()->end(); eiter++) {
            unsigned nextDist = cur.dist() + eiter->cost();

            if (nextDist >= eiter->otherVertex())
                continue;

            either->otherVertex()->setDistance(nextdist + 1);

            searchnode toadd(eiter->othervertex(), nextDist);
            queue.push(toadd);
        }
    }
}

Bottleneck can be your data structure that you use storing paths. If you use too much storage you run out of cache and memory space very soon causing fast algorithm to run very slowly because it gains order of 100 (cache miss) or 10000+ (swapped pages) constant multiplier.

Because you have to store paths in database I suspect that might be easily a bottleneck. It is probably best to try to first generate paths into memory with very efficient storage mode like N bits per vertex where N == maximum number of edges per vertex. Then set a bit to for each edge that can be used to generate one of the shortest paths. After generating this path information you can run a recursive algorithm generating the path information to a format suitable for database storage.

Of course the most likely bottleneck is still the database. You want to think very carefully what format you use to store the information because insert, search and modifying large datasets in SQL database is very slow. Also using transaction to do database operations might be able to reduce the disk write overhead if database engine manages to path multiple insertions to a single disk write operation.

It can be even better to simple store results in memory cache and discard solutions when they are not actively needed any more. Then generate same results again if you happen to need them again. That means you would generate paths only on demand when you actually need them. Runtime for 30k nodes and 160k edges should be clearly below a second for single all shortest path run of Dijkstra.

For shortest path algorithms I have always chosen C++. There shouldn't be any reason why C implementation wouldn't be simple too but C++ offers reduced coding with STL containers that can be used in initial implementation and only later implement optimized queue algorithm if benchmarks and profiling shows that there is need to have something better than STL offers.

#include <queue>
#include "vertex.h"

class vertex;
class edge;

class searchnode {
    vertex *dst;
    unsigned long dist;
    public:
    searchnode(vertex *destination, unsigned long distance) :
        dst(dst),
        dist(distance)
    {
    }

    bool operator<(const searchnode &b) const {
        /* std::priority_queue stores largest value at top */
        return dist > b.dist;
    }

    vertex *dst() const { return dst; }

    unsigned long travelDistance() const { return dist; }
};


static void dijkstra(vertex *src, vertex *dst)
{
    std::priority_queue<searchnode> queue;

    searchnode start(src, 0);
    queue.push(start);

    while (!queue.empty()) {
        searchnode cur = queue.top();
        queue.pop();

        if (cur.travelDistance() >= cur.dst()->distance())
            continue;

        cur.dst()->setDistance(cur.travelDistance());
        edge *eiter;
        for (eiter = cur.dst()->begin(); eiter != cur.dst()->end(); eiter++) {
            unsigned nextDist = cur.dist() + eiter->cost();

            if (nextDist >= eiter->otherVertex())
                continue;

            either->otherVertex()->setDistance(nextdist + 1);

            searchnode toadd(eiter->othervertex(), nextDist);
            queue.push(toadd);
        }
    }
}

回复收藏 0 原文

~没有更多了~