针对某些图操作的最简单算法的建议
这个项目的截止日期很快就到了,我没有太多时间来处理剩下的事情。因此,我不是寻找最好的(可能更复杂/耗时)算法,而是寻找最简单的算法来实现图结构上的一些操作。
我需要执行的操作如下:
- 列出给定距离 X 的图网络中的所有用户
- 列出给定距离 X 的图网络中的所有用户和关系类型
- 计算图网络上 2 个用户之间的最短路径给定一种关系类型
- 计算图网络上 2 个用户之间的最大距离
- 计算图网络上最远连接的用户
关于我的图实现的一些说明:
- 边缘节点有 2 个属性,其中一个是
char< 类型/code> 和另一个
int
。它们分别代表关系类型和权重。 - 该图的顶点和边都是通过链表实现的。我的意思是,每个顶点都指向下一个顶点,每个顶点还指向不同链表的头,即该特定顶点的边。
我知道我需要做什么:
- 我不知道这是否是我上面所说的最简单的,但是对于两个用户之间的最短路径,我相信 Dijkstra 算法是人们似乎经常推荐的算法,所以我认为我同意了。
- 我一直在搜索和搜索,发现很难实现这个算法,有谁知道任何教程或易于理解的东西,以便我可以自己实现这个算法吗?如果可以的话,有C源代码示例,会有很大帮助。我看到很多带有数学符号的例子,但这让我更加困惑。
- 您认为如果我将图“转换”为邻接矩阵来表示链接权重和关系类型会有帮助吗?在其上执行算法而不是链表会更容易吗?我可以轻松地实现一个函数来在需要时进行转换。我这么说是因为我感觉在阅读了几页有关该主题的内容后会更容易,但我可能是错的。
- 我对其他 4 个操作没有任何想法,有什么建议吗?
The deadline for this project is closing in very quickly and I don't have much time to deal with what it's left. So, instead of looking for the best (and probably more complicated/time consuming) algorithms, I'm looking for the easiest algorithms to implement a few operations on a Graph structure.
The operations I'll need to do is as follows:
- List all users in the graph network given a distance X
- List all users in the graph network given a distance X and the type of relation
- Calculate the shortest path between 2 users on the graph network given a type of relation
- Calculate the maximum distance between 2 users on the graph network
- Calculate the most distant connected users on the graph network
A few notes about my Graph implementation:
- The edge node has 2 properties, one is of type
char
and anotherint
. They represent the type of relation and weight, respectively. - The Graph is implemented with linked lists, for both the vertices and edges. I mean, each vertex points to the next one and each vertex also points to the head of a different linked list, the edges for that specific vertex.
What I know about what I need to do:
- I don't know if this is the easiest as I said above, but for the shortest path between 2 users, I believe the Dijkstra algorithm is what people seem to recommend pretty often so I think I'm going with that.
- I've been searching and searching and I'm finding it hard to implement this algorithm, does anyone know of any tutorial or something easy to understand so I can implement this algorithm myself? If possible, with C source code examples, it would help a lot. I see many examples with math notations but that just confuses me even more.
- Do you think it would help if I "converted" the graph to an adjacency matrix to represent the links weight and relation type? Would it be easier to perform the algorithm on that instead of the linked lists? I could easily implement a function to do that conversion when needed. I'm saying this because I got the feeling it would be easier after reading a couple of pages about the subject, but I could be wrong.
- I don't have any ideas about the other 4 operations, suggestions?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
距什么距离
X
?从起始节点还是它们之间的距离X
开始?你能举个例子吗?这可能会也可能不会像进行 BF 搜索或运行 Dijkstra 那样简单。假设您从某个节点开始并想要列出与起始节点距离为 X 的所有节点,只需运行 BFS 从起始节点开始。当要向队列中插入新节点时,检查起始节点到要插入新节点的节点的距离是否+要插入新节点的节点到的边的权重新节点是 <=
X
。如果它严格较低,则插入新节点,如果相等,则仅打印新节点(并且仅当边缘权重也可以为 0 时才插入)。(见上文)。只需考虑 BFS 中的关系类型:如果父节点的类型与您尝试插入队列的节点的类型不同,则不要插入它。
该算法取决于许多因素:
既然你想要简单,最简单的是 Roy-Floyd 和 Dijkstra。
以下是 C 实现: Roy-Floyd 和 Dijkstra_1,Dijkstra_2。您可以通过
“<算法名称> c 实现”在 google 上找到很多信息
。编辑: 对于 18 000 个节点,Roy-Floyd 是不可能的,邻接矩阵也是如此。构建会花费太多时间和太多内存。最好的选择是对每个查询使用 Dijkstra 算法,但最好使用堆来实现 Dijkstra - 在我提供的链接中,使用堆来查找最小值。如果您对每个查询运行经典的 Dijkstra,这也可能需要很长时间。
另一种选择是在每个查询上使用 Bellman-Ford 算法,该算法每个查询将为您提供
O(Nodes*Edges)
运行时间。然而,如果您没有按照维基百科的要求实现它,那么这就是一个很大的高估。相反,请使用类似于 BFS 中使用的队列。每当节点更新其与源的距离时,将该节点插入回队列中。这在实践中会非常快,并且也适用于负权重。我建议您使用此方法或带堆的 Dijkstra,因为经典的 Dijkstra 可能在 18000 个节点上花费很长时间。最简单的方法是使用回溯:尝试所有可能性并保留找到的最长路径。 这是 NP 完全的,因此多项式解不存在。
如果你有 18000 个节点,这真的很糟糕,我不知道有任何算法(简单或其他)可以对这么多节点运行得相当快。考虑使用贪心算法对其进行近似。或者您的图表可能具有您可以利用的某些属性。例如,它是 DAG(有向无环图)吗?
意味着您想要找到图的直径。最简单的方法是找到每两个节点之间的距离(所有对最短路径 - 在每两个节点之间运行 Roy-Floyd 或 Dijkstra,并选择距离最大的两个)。
同样,对于节点和边的数量来说,这很难快速完成。恐怕您在最后两个问题上运气不好,除非您的图表具有可以利用的特殊属性。
不,除非您的应用程序针对超级计算机,否则邻接矩阵和 Roy-Floyd 是一个非常糟糕的主意。
A distance
X
from what? from a starting node or a distanceX
between themselves? Can you give an example? This may or may not be as simple as doing a BF search or running Dijkstra.Assuming you start at a certain node and want to list all nodes that have distances
X
to the starting node, just run BFS from the starting node. When you are about to insert a new node in the queue, check if the distance from the starting node to the node you want to insert the new node from + the weight of the edge from the node you want to insert the new node from to the new node is <=X
. If it's strictly lower, insert the new node and if it is equal just print the new node (and only insert it if you can also have 0 as an edge weight).See above. Just factor in the type of relation into the BFS: if the type of the parent is different than that of the node you are trying to insert into the queue, don't insert it.
The algorithm depends on a number of factors:
Since you want easy, the easiest are Roy-Floyd and Dijkstra's.
Here are C implementations: Roy-Floyd and Dijkstra_1, Dijkstra_2. You can find a lot on google with
"<algorithm name> c implementation"
.Edit: Roy-Floyd is out of the question for 18 000 nodes, as is an adjacency matrix. It would take way too much time to build and way too much memory. Your best bet is to either use Dijkstra's algorithm for each query, but preferably implementing Dijkstra using a heap - in the links I provided, use a heap to find the minimum. If you run the classical Dijkstra on each query, that could also take a very long time.
Another option is to use the Bellman-Ford algorithm on each query, which will give you
O(Nodes*Edges)
runtime per query. However, this is a big overestimate IF you don't implement it as Wikipedia tells you to. Instead, use a queue similar to the one used in BFS. Whenever a node updates its distance from the source, insert that node back into the queue. This will be very fast in practice, and will also work for negative weights. I suggest you use either this or the Dijkstra with heap, since classical Dijkstra might take a long time on 18 000 nodes.The simplest way is to use backtracking: try all possibilities and keep the longest path found. This is NP-complete, so polynomial solutions don't exist.
This is really bad if you have 18 000 nodes, I don't know any algorithm (simple or otherwise) that will work reasonably fast for so many nodes. Consider approximating it using greedy algorithms. Or maybe your graph has certain properties that you could take advantage of. For example, is it a DAG (Directed Acyclic Graph)?
Meaning you want to find the diameter of the graph. The simplest way to do this is to find the distances between each two nodes (all pairs shortest paths - either run Roy-Floyd or Dijkstra between each two nodes and pick the two with the maximum distance).
Again, this is very hard to do fast with your number of nodes and edges. I'm afraid you're out of luck on these last two questions, unless your graph has special properties that can be exploited.
No, adjacency matrix and Roy-Floyd are a very bad idea unless your application targets supercomputers.
这假设
O(E log V)
是可接受的运行时间,如果您在网上做某事,则可能不是,并且需要一些更高功率的机器。Djikstra 算法 对此很有用,仅供一次性使用。您可以通过对所有顶点进行线性扫描(或者更好的是排序和二分搜索)来保存结果以供将来使用。
可能与上面几乎相同 - 只需使用一些函数,如果它不是正确的关系,则权重将是无穷大。
与上面相同,本质上,只需尽早确定是否匹配这两个用户即可。 (或者,您可以“在中间相遇”,如果您在两个最短路径生成树上都找到某人,则提前终止)
最长路径 是一个 NP -完整的问题。
这是图的直径,您可以在 上阅读有关内容数学世界。
至于邻接列表与邻接矩阵问题,这取决于图的填充程度。另外,如果你想缓存结果,那么矩阵可能是最好的选择。
This assumes
O(E log V)
is an acceptable running time, if you're doing something online, this might not be, and it would require some higher powered machinery.Djikstra's algorithm is good for this, for one time use. You can save the result for future use, with a linear scan through all the vertices (or better yet, sort and binary search).
Might be nearly the same as above -- just use some function where the weight would be infinity if it is not of the correct relation.
Same as above, essentially, just determine early if you match the two users. (Alternatively, you can "meet in the middle", and terminate early if you find someone on both shortest path spanning tree)
Longest path is an NP-complete problem.
This is the diameter of the graph, which you can read about on Math World.
As for the adjacency list vs adjacency matrix question, it depends on how densely populated your graph is. Also, if you want to cache results, then the matrix might be the way to go.
计算两个节点之间最短路径的最简单算法是 Floyd-Warshall。它只是三层嵌套的 for 循环;就是这样。
它在
O(N^3)
中计算所有对最短路径,因此它可能会做比必要的更多的工作,并且如果N
则需要一段时间代码> 很大。The simplest algorithm to compute shortest path between two nodes is Floyd-Warshall. It's just triple-nested for loops; that's it.
It computes ALL-pairs shortest path in
O(N^3)
, so it may do more work than necessary, and will take a while ifN
is huge.