当前位置：文江博客话题详情

Erlang 中 Dijkstra 算法使用什么数据结构？

发布于 2024-11-24 23:41:36 字数 3925 浏览 4 评论 0原文

免责声明：作者是 Erlang 新手。

想象一下，我们有一个由 1M 个节点组成的图，每个节点有 0-4 个邻居（边从每个节点发散到这些邻居，所以图是有向且连通的）。

以下是我选择的数据结构：

为了存储图表，我使用有向图，它基于 ETS 表。这允许快速（O(1)）访问节点的邻居。

对于未访问节点列表，我使用gb_sets:take_smallest（节点已经排序，获取后同时删除）。

对于前驱列表，我使用 dict 结构，它允许通过以下方式存储前驱：{Node1,Node1_predecessor},{Node2,Node2_predecessor}。

对于访问节点的列表，我使用一个简单的列表。

问题：

当我尝试更新有向图结构和 Unvisited_nodes 结构中节点的权重时，代码变得非常难以阅读和维护。将一个“对象”与需要在两个数据结构中同时更新的“字段”保留在一起似乎不是正确的方法。这样做的正确方法是什么？
同样的问题是关于前辈名单的。我应该在哪里存储节点“对象”的前驱“字段”？也许在图中（有向图结构）？
也许我应该根据进程和消息而不是对象（节点和边）及其字段（权重）重新思考整个 Dijkstra 算法？

UPD：

以下是基于 Antonakos 的建议的代码：

dijkstra(Graph,Start_node_name) ->

    io:format("dijkstra/2: start~n"),

    Paths = dict:new(),
    io:format("dijkstra/2: initialized empty Paths~n"),

    Unvisited = gb_sets:new(),
    io:format("dijkstra/2: initialized empty Unvisited nodes priority queue~n"),

    Unvisited_nodes = gb_sets:insert({0,Start_node_name,root},Unvisited),
    io:format("dijkstra/2: Added start node ~w with the weight 0 to the Unvisited nodes: ~w~n", [Start_node_name, Unvisited_nodes]),

    Paths_updated = loop_through_nodes(Graph,Paths,Unvisited_nodes),
    io:format("dijkstra/2: Finished searching for shortest paths: ~w~n", [Paths_updated]).




loop_through_nodes(Graph,Paths,Unvisited_nodes) ->
    %% We need this condition to stop looping through the Unvisited nodes if it is empty
    case gb_sets:is_empty(Unvisited_nodes) of
        false -> 
            {{Current_weight,Current_name,Previous_node}, Unvisited_nodes_updated} = gb_sets:take_smallest(Unvisited_nodes),
            case dict:is_key(Current_name,Paths) of
                false ->
                    io:format("loop_through_nodes: Found a new smallest unvisited node ~w~n",[Current_name]),

                    Paths_updated = dict:store(Current_name,{Previous_node,Current_weight},Paths),
                    io:format("loop_through_nodes: Updated Paths: ~w~n",[Paths_updated]),

                    Out_edges = digraph:out_edges(Graph,Current_name),
                    io:format("loop_through_nodes: Ready to iterate through the out edges of node ~w: ~w~n",[Current_name,Out_edges]),

                    Unvisited_nodes_updated_2 = loop_through_edges(Graph,Out_edges,Paths_updated,Unvisited_nodes_updated,Current_weight),
                    io:format("loop_through_nodes: Looped through out edges of the node ~w and updated Unvisited nodes: ~w~n",[Current_name,Unvisited_nodes_updated_2]),

                    loop_through_nodes(Graph,Paths_updated,Unvisited_nodes_updated_2);
                    true ->
                    loop_through_nodes(Graph,Paths,Unvisited_nodes_updated)
            end;
        true -> 
            Paths
    end.

loop_through_edges(Graph,[],Paths,Unvisited_nodes,Current_weight) ->
                    io:format("loop_through_edges: No more out edges ~n"),
    Unvisited_nodes;

loop_through_edges(Graph,Edges,Paths,Unvisited_nodes,Current_weight) ->
                    io:format("loop_through_edges: Start ~n"),
    [Current_edge|Rest_edges] = Edges,
    {Current_edge,Current_node,Neighbour_node,Edge_weight} = digraph:edge(Graph,Current_edge),
    case dict:is_key(Neighbour_node,Paths) of
        false ->
                    io:format("loop_through_edges: Inserting new neighbour node ~w into Unvisited nodes~n",[Current_node]),
            Unvisited_nodes_updated = gb_sets:insert({Current_weight+Edge_weight,Neighbour_node,Current_node},Unvisited_nodes),
                    io:format("loop_through_edges: The unvisited nodes are: ~w~n",[Unvisited_nodes_updated]),
            loop_through_edges(Graph,Rest_edges,Paths,Unvisited_nodes_updated,Current_weight);
        true ->
            loop_through_edges(Graph,Rest_edges,Paths,Unvisited_nodes,Current_weight)
    end.

原文

Disclaimer: The author is a newbie in Erlang.

Imagine, we have a graph consisting of 1M nodes, and each node has 0-4 neighbours (the edges are emanating from each node to those neighbours, so the graph is directed and connected).

Here is my choice of data structures:

To store the graph I use digraph, which is based on ETS tables. This allows fast (O(1)) access to the neighbours of a node.

For the list of unvisited nodes, I use gb_sets:take_smallest (the node is already sorted, and it is simultaneously deleted after fetching).

For the list of predecessors I use the dict structure, which allows to store the predecessors in the following way: {Node1,Node1_predecessor},{Node2,Node2_predecessor}.

For the list of visited nodes I use a simple list.

Problems:

The code becomes very hard to read and maintain when I try to update the weight of a node both in the digraph structure and in the Unvisited_nodes structure. It doesn't seem the right way to keep one 'object' with the 'fields' that need to be updated in two data structures simultaneously. What is the right way to do that?
The same question is about predecessors list. Where should I store the predecessor 'field' of a node 'object'? Maybe in the Graph (digraph structure)?
Maybe I should rethink the whole Dijkstra's algorithm in terms of processes and messages instead of objects (nodes and edges) and their fields(weights)?

UPD:

Here is the code based on the recommendations of Antonakos:

dijkstra(Graph,Start_node_name) ->

    io:format("dijkstra/2: start~n"),

    Paths = dict:new(),
    io:format("dijkstra/2: initialized empty Paths~n"),

    Unvisited = gb_sets:new(),
    io:format("dijkstra/2: initialized empty Unvisited nodes priority queue~n"),

    Unvisited_nodes = gb_sets:insert({0,Start_node_name,root},Unvisited),
    io:format("dijkstra/2: Added start node ~w with the weight 0 to the Unvisited nodes: ~w~n", [Start_node_name, Unvisited_nodes]),

    Paths_updated = loop_through_nodes(Graph,Paths,Unvisited_nodes),
    io:format("dijkstra/2: Finished searching for shortest paths: ~w~n", [Paths_updated]).




loop_through_nodes(Graph,Paths,Unvisited_nodes) ->
    %% We need this condition to stop looping through the Unvisited nodes if it is empty
    case gb_sets:is_empty(Unvisited_nodes) of
        false -> 
            {{Current_weight,Current_name,Previous_node}, Unvisited_nodes_updated} = gb_sets:take_smallest(Unvisited_nodes),
            case dict:is_key(Current_name,Paths) of
                false ->
                    io:format("loop_through_nodes: Found a new smallest unvisited node ~w~n",[Current_name]),

                    Paths_updated = dict:store(Current_name,{Previous_node,Current_weight},Paths),
                    io:format("loop_through_nodes: Updated Paths: ~w~n",[Paths_updated]),

                    Out_edges = digraph:out_edges(Graph,Current_name),
                    io:format("loop_through_nodes: Ready to iterate through the out edges of node ~w: ~w~n",[Current_name,Out_edges]),

                    Unvisited_nodes_updated_2 = loop_through_edges(Graph,Out_edges,Paths_updated,Unvisited_nodes_updated,Current_weight),
                    io:format("loop_through_nodes: Looped through out edges of the node ~w and updated Unvisited nodes: ~w~n",[Current_name,Unvisited_nodes_updated_2]),

                    loop_through_nodes(Graph,Paths_updated,Unvisited_nodes_updated_2);
                    true ->
                    loop_through_nodes(Graph,Paths,Unvisited_nodes_updated)
            end;
        true -> 
            Paths
    end.

loop_through_edges(Graph,[],Paths,Unvisited_nodes,Current_weight) ->
                    io:format("loop_through_edges: No more out edges ~n"),
    Unvisited_nodes;

loop_through_edges(Graph,Edges,Paths,Unvisited_nodes,Current_weight) ->
                    io:format("loop_through_edges: Start ~n"),
    [Current_edge|Rest_edges] = Edges,
    {Current_edge,Current_node,Neighbour_node,Edge_weight} = digraph:edge(Graph,Current_edge),
    case dict:is_key(Neighbour_node,Paths) of
        false ->
                    io:format("loop_through_edges: Inserting new neighbour node ~w into Unvisited nodes~n",[Current_node]),
            Unvisited_nodes_updated = gb_sets:insert({Current_weight+Edge_weight,Neighbour_node,Current_node},Unvisited_nodes),
                    io:format("loop_through_edges: The unvisited nodes are: ~w~n",[Unvisited_nodes_updated]),
            loop_through_edges(Graph,Rest_edges,Paths,Unvisited_nodes_updated,Current_weight);
        true ->
            loop_through_edges(Graph,Rest_edges,Paths,Unvisited_nodes,Current_weight)
    end.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

讽刺将军 2024-12-01 23:41:36

您选择的数据结构看起来不错，因此主要是在其中插入什么内容以及如何使它们保持最新的问题。我建议如下（我稍微更改了名称）：

Result：将 Node 映射到 的 dict {Cost, Prev}，其中 Cost 是到 Node 的路径的总成本，Prev 是路径上其前身.
Open：{Cost, Node, Prev} 的 gb_sets 结构。
具有 {EdgeCost, NextNode} 形式的边的图。

搜索结果由 Result 表示，并且图表根本不更新。没有多重处理或消息传递。

算法如下：

在 Open 中插入 {0, StartNode, Nil}，其中 Nil 是标记结束的内容
令{{Cost, Node, Prev}, Open1} = gb_sets:take_smallest(Open)。如果 Node 已经在 Result 中，则不执行任何操作；否则将 {Cost, Node, Prev} 添加到 Result，并为 Node 的每条边 {EdgeCost, NextNode} > 将 {Cost + EdgeCost, NextNode, Node} 添加到 Open1，但前提是 NextNode 尚未在 Result.继续Open1，直到集合为空。

Dijkstra 算法要求对 Open 集执行 decrease_key() 操作。由于 gb_sets 不容易支持这一点，因此我们使用了为 NextNode 插入元组的解决方法，即使 NextNode 可能存在于 中>已经打开。这就是为什么我们检查从 Open 中提取的节点是否已在 Result 中。

优先级队列使用的扩展讨论 将

优先级队列与 Dijkstra 算法结合使用的方法有多种。

在维基百科的标准版本中，仅插入节点v一次，但当 v 的成本和前身发生更改时，v 的位置会更新。

alt := dist[u] + dist_between(u, v)
if alt < dist[v]:
    dist[v] := alt
    previous[v] := u
    decrease-key v in Q

通常通过用add v to Q替换decrease-key v in Q来简化实现。这意味着 v 可以添加多次，因此算法必须检查从队列中提取的 u 是否尚未添加到结果中。

在我的版本中，我将上面的整个块替换为 add v to Q。因此，队列将包含更多条目，但由于它们总是按顺序提取，因此不会影响算法的正确性。如果您不需要这些额外的条目，您可以使用字典来跟踪每个节点的最低成本。

Your choice of data structures looks OK, so it is mostly a question of what to insert in them and how to keep them up to date. I'd suggest the following (I have changed the names a bit):

Result: A dict mapping Node to {Cost, Prev}, where Cost is the total cost of the path to Node and Prev is its predecessor on the path.
Open: A gb_sets structure of {Cost, Node, Prev}.
A graph with edges of the form {EdgeCost, NextNode}.

The result of the search is represented by Result and the graph isn't updated at all. There is no multiprocessing or message passing.

The algorithm goes as follows:

Insert {0, StartNode, Nil} in Open, where Nil is something that marks the end of the path.
Let {{Cost, Node, Prev}, Open1} = gb_sets:take_smallest(Open). If Node is already in Result then do nothing; otherwise add {Cost, Node, Prev} to Result, and for every edge {EdgeCost, NextNode} of Node add {Cost + EdgeCost, NextNode, Node} to Open1, but only if NextNode isn't already in Result. Continue with Open1 until the set is empty.

Dijkstra's algorithm asks for a decrease_key() operation on the Open set. Since this isn't readily supported by gb_sets we have used the workaround of inserting a tuple for NextNode even if NextNode might be present in Open already. That's why we check if the node extracted from Open is already in Result.

Extended discussion of the use of the priority queue

There are several ways of using a priority queue with Dijkstra's algorithm.

In the standard version of Wikipedia a node v is inserted only once but the position of v is updated when the cost and predecessor of v is changed.

alt := dist[u] + dist_between(u, v)
if alt < dist[v]:
    dist[v] := alt
    previous[v] := u
    decrease-key v in Q

Implementations often simplify by replacing decrease-key v in Q with add v to Q. This means that v can be added more than once, and the algorithm must therefore check that an u extracted from the queue hasn't already been added to the result.

In my version I am replacing the entire block above with add v to Q. The queue will therefore contain even more entries, but since they are always extracted in order it doesn't affect the correctness of the algorithm. If you don't want these extra entries, you can use a dictionary to keep track of the minimum cost for each node.

回复收藏 0 原文

~没有更多了~