对于 C++ 中的图问题,邻接表和邻接矩阵哪个更好?

发布于 2024-08-20 18:13:01 字数 46 浏览 10 评论 0 原文

对于 C++ 中的图问题,邻接表和邻接矩阵哪个更好? 各自的优点和缺点是什么?

What is better, adjacency lists or adjacency matrix, for graph problems in C++?
What are the advantages and disadvantages of each?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(11

绳情 2024-08-27 18:13:01

这取决于问题。

邻接矩阵

  • 使用 O(n^2) 内存,
  • 速度很快查找并检查是否存在特定边缘
    任意两个节点之间 O(1)
  • 遍历所有边很慢
  • 添加/删除节点很慢;复杂的操作 O(n^2)
  • 添加新边很快 O(1)

邻接List

  • 内存使用更多地取决于边的数量(而不是节点的数量),
    如果邻接矩阵稀疏,这可能会节省大量内存
  • 查找任意两个节点之间是否存在特定边
    比矩阵 O(k) 稍慢;其中 k 是邻居节点的数量
  • 迭代所有边的速度很快,因为您可以直接访问任何节点的邻居
  • 添加/删除节点的速度很快;比矩阵表示更容易
  • 添加新边 O(1)

It depends on the problem.

Adjacency Matrix

  • Uses O(n^2) memory
  • It is fast to lookup and check for presence or absence of a specific edge
    between any two nodes O(1)
  • It is slow to iterate over all edges
  • It is slow to add/delete a node; a complex operation O(n^2)
  • It is fast to add a new edge O(1)

Adjacency List

  • Memory usage depends more on the number of edges (and less on the number of nodes),
    which might save a lot of memory if the adjacency matrix is sparse
  • Finding the presence or absence of specific edge between any two nodes
    is slightly slower than with the matrix O(k); where k is the number of neighbors nodes
  • It is fast to iterate over all edges because you can access any node neighbors directly
  • It is fast to add/delete a node; easier than the matrix representation
  • It fast to add a new edge O(1)
桃酥萝莉 2024-08-27 18:13:01

这个答案不仅仅适用于 C++,因为提到的所有内容都是关于数据结构本身,而与语言无关。而且,我的答案是假设您知道邻接表和矩阵的基本结构。

内存

如果内存是您最关心的问题,您可以遵循以下公式创建一个允许循环的简单图:

邻接矩阵占用 n2/8 字节空间(每个条目一位)。

一个邻接表占用8e空间,其中e是边的数量(32位计算机)。

如果我们将图的密度定义为 d = e/n2(边数除以最大边数),我们可以找到“断点”,其中列表占用的内存超过矩阵:

8e>当 d > n2/8 时1/64

因此,对于这些数字(仍然是 32 位特定的),断点位于 1/64
如果密度 (e/n2) 大于 1/64,那么如果您想节省内存,则最好使用矩阵。

您可以在wikipedia(有关邻接矩阵的文章)和许多其他网站上阅读相关内容。

旁注:可以通过使用哈希表来提高邻接矩阵的空间效率,其中键是顶点对(仅限无向)。

迭代和查找

邻接列表是一种仅表示现有边的紧凑方式。然而,这是以特定边缘查找可能缓慢为代价的。由于每个列表与顶点的度数一样长,如果列表是无序的,则检查特定边的最坏情况查找时间可能会变为 O(n)。
然而,查找顶点的邻居变得微不足道,并且对于稀疏或小图,迭代邻接列表的成本可能可以忽略不计。

另一方面,邻接矩阵使用更多空间来提供恒定的查找时间。由于每个可能的条目都存在,因此您可以使用索引在恒定时间内检查边缘是否存在。然而,邻居查找需要 O(n),因为您需要检查所有可能的邻居。
明显的空间缺陷是对于稀疏图,添加了大量填充。有关这方面的更多信息,请参阅上面的内存讨论。

如果您仍然不确定使用什么:大多数现实世界的问题都会产生稀疏和/或大型图,这更适合邻接列表表示。它们可能看起来更难实现,但我向您保证事实并非如此,当您编写 BFS 或 DFS 并想要获取节点的所有邻居时,它们只需一行代码即可。但是,请注意,我并不是在一般情况下推广邻接列表。

This answer is not just for C++ since everything mentioned is about the data structures themselves, regardless of language. And, my answer is assuming that you know the basic structure of adjacency lists and matrices.

Memory

If memory is your primary concern you can follow this formula for a simple graph that allows loops:

An adjacency matrix occupies n2/8 byte space (one bit per entry).

An adjacency list occupies 8e space, where e is the number of edges (32bit computer).

If we define the density of the graph as d = e/n2 (number of edges divided by the maximum number of edges), we can find the "breakpoint" where a list takes up more memory than a matrix:

8e > n2/8 when d > 1/64

So with these numbers (still 32-bit specific) the breakpoint lands at 1/64.
If the density (e/n2) is bigger than 1/64, then a matrix is preferable if you want to save memory.

You can read about this at wikipedia (article on adjacency matrices) and a lot of other sites.

Side note: One can improve the space-efficiency of the adjacency matrix by using a hash table where the keys are pairs of vertices (undirected only).

Iteration and lookup

Adjacency lists are a compact way of representing only existing edges. However, this comes at the cost of possibly slow lookup of specific edges. Since each list is as long as the degree of a vertex the worst case lookup time of checking for a specific edge can become O(n), if the list is unordered.
However, looking up the neighbours of a vertex becomes trivial, and for a sparse or small graph the cost of iterating through the adjacency lists might be negligible.

Adjacency matrices on the other hand use more space in order to provide constant lookup time. Since every possible entry exists you can check for the existence of an edge in constant time using indexes. However, neighbour lookup takes O(n) since you need to check all possible neighbours.
The obvious space drawback is that for sparse graphs a lot of padding is added. See the memory discussion above for more information on this.

If you're still unsure what to use: Most real-world problems produce sparse and/or large graphs, which are better suited for adjacency list representations. They might seem harder to implement but I assure you they aren't, and when you write a BFS or DFS and want to fetch all neighbours of a node they're just one line of code away. However, note that I'm not promoting adjacency lists in general.

蓝海似她心 2024-08-27 18:13:01

好的,我已经编译了图上基本操作的时间和空间复杂度。
下图应该是不言自明的。
请注意,当我们期望图稠密时,邻接矩阵如何更好,而当我们期望图稀疏时,邻接列表如何更好。
我做了一些假设。问我复杂性(时间或空间)是否需要澄清。 (例如,对于稀疏图,我将 En 设为一个小常量,因为我假设添加新顶点只会添加一些边,因为我们希望即使在添加新顶点之后,图仍保持稀疏状态顶点。)

如果有任何错误,请告诉我。

在此处输入图像描述

Okay, I've compiled the Time and Space complexities of basic operations on graphs.
The image below should be self-explanatory.
Notice how Adjacency Matrix is preferable when we expect the graph to be dense, and how Adjacency List is preferable when we expect the graph to be sparse.
I've made some assumptions. Ask me if a complexity (Time or Space) needs clarification. (For example, For a sparse graph, I've taken En to be a small constant, as I've assumed that addition of a new vertex will add only a few edges, because we expect the graph to remain sparse even after adding that vertex.)

Please tell me if there are any mistakes.

enter image description here

蹲在坟头点根烟 2024-08-27 18:13:01

这取决于您要寻找什么。

使用邻接矩阵,您可以快速回答有关两个顶点之间的特定边是否属于图的问题,并且还可以快速插入和删除边。缺点是必须使用过多的空间,特别是对于具有许多顶点的图,这是非常低效的,尤其是在图稀疏的情况下。

另一方面,使用邻接表检查给定边是否在图中会更困难,因为您必须搜索适当的列表才能找到该边,但它们更节省空间。

一般来说,邻接表是大多数图应用程序的正确数据结构。

It depends on what you're looking for.

With adjacency matrices you can answer fast to questions regarding if a specific edge between two vertices belongs to the graph, and you can also have quick insertions and deletions of edges. The downside is that you have to use excessive space, especially for graphs with many vertices, which is very inefficient especially if your graph is sparse.

On the other hand, with adjacency lists it is harder to check whether a given edge is in a graph, because you have to search through the appropriate list to find the edge, but they are more space efficient.

Generally though, adjacency lists are the right data structure for most applications of graphs.

浮萍、无处依 2024-08-27 18:13:01

假设我们有一个具有 n 个节点和 m 个边的图,

示例图
输入图像描述这里

邻接矩阵:
我们正在创建一个具有 n 行和列的矩阵,因此在内存中它将占用与 n2 成比例的空间。检查名为 uv 的两个节点之间是否有边将花费 θ(1) 时间。例如,检查 (1, 2) 是否是一条边,代码如下所示:

if(matrix[1][2] == 1)

如果要识别所有边,则必须迭代矩阵,这将需要两个嵌套循环,并且需要 θ(n< sup>2)。 (您可以只使用矩阵的上三角部分来确定所有边,但它仍然是 θ(n2))

邻接列表:
我们正在创建一个列表,每个节点也指向另一个列表。您的列表将包含 n 个元素,每个元素将指向一个列表,该列表的项目数等于该节点的邻居数(查看图像以获得更好的可视化效果)。因此它将占用与n+m成正比的内存空间。检查 (u, v) 是否为边将花费 O(deg(u)) 时间,其中 deg(u) 等于 u 的邻居数量。因为最多,你必须迭代 u 指向的列表。识别所有边将需要 θ(n+m)。

示例图的邻接列表

在此处输入图像描述
您应该根据您的需要做出选择。
由于我的声誉,我无法放置矩阵图像,对此表示抱歉

Lets assume we have a graph which has n number of nodes and m number of edges,

Example graph
enter image description here

Adjacency Matrix:
We are creating a matrix that has n number of rows and columns so in memory it will take space that is proportional to n2. Checking if two nodes named as u and v has an edge between them will take Θ(1) time. For example checking for (1, 2) is an edge will look like as follows in the code:

if(matrix[1][2] == 1)

If you want to identify all edges, you have to iterate over matrix at this will require two nested loops and it will take Θ(n2). (You may just use the upper triangular part of the matrix to determine all edges but it will be again Θ(n2))

Adjacency List:
We are creating a list that each node also points to another list. Your list will have n elements and each element will point to a list that has number of items that is equal to number of neighbors of this node (look image for better visualization). So it will take space in memory that is proportional to n+m. Checking if (u, v) is an edge will take O(deg(u)) time in which deg(u) equals number of neighbors of u. Because at most, you have to iterate over the list that is pointed by the u. Identifying all edges will take Θ(n+m).

Adjacency list of example graph

enter image description here
You should make your choice according to your needs.
Because of my reputation I couldn't put image of matrix, sorry for that

相思碎 2024-08-27 18:13:01

如果您正在研究 C++ 中的图形分析,可能第一个开始的地方是 boost 图形库,它实现了包括 BFS 在内的多种算法。

编辑

上一个关于SO的问题可能会有所帮助:

如何创建ac-boost-无向图并遍历深度优先搜索h

If you are looking at graph analysis in C++ probably the first place to start would be the boost graph library, which implements a number of algorithms including BFS.

EDIT

This previous question on SO will probably help:

how-to-create-a-c-boost-undirected-graph-and-traverse-it-in-depth-first-search

你没皮卡萌 2024-08-27 18:13:01

最好用例子来回答这个问题。

Floyd-Warshall 为例。我们必须使用邻接矩阵,否则算法会渐近变慢。

或者如果它是一个包含 30,000 个顶点的密集图怎么办?那么邻接矩阵可能有意义,因为您将存储每对顶点 1 位,而不是每条边 16 位(邻接列表所需的最小值):即 107 MB,而不是 1.7 GB。

但对于 DFS、BFS(以及使用它的算法,如 Edmonds-Karp)、优先级优先搜索(Dijkstra、Prim、A*)等算法,邻接表与矩阵一样好。好吧,当图很稠密时,矩阵可能会有轻微的边缘,但只是通过一个不起眼的常数因子。 (多少?这是一个尝试的问题。)

This is best answered with examples.

Think of Floyd-Warshall for example. We have to use an adjacency matrix, or the algorithm will be asymptotically slower.

Or what if it's a dense graph on 30,000 vertices? Then an adjacency matrix might make sense, as you'll be storing 1 bit per pair of vertices, rather than the 16 bits per edge (the minimum that you would need for an adjacency list): that's 107 MB, rather than 1.7 GB.

But for algorithms like DFS, BFS (and those that use it, like Edmonds-Karp), Priority-first search (Dijkstra, Prim, A*), etc., an adjacency list is as good as a matrix. Well, a matrix might have a slight edge when the graph is dense, but only by an unremarkable constant factor. (How much? It's a matter of experimenting.)

笨死的猪 2024-08-27 18:13:01

添加到 keyer5053 关于内存使用的答案。

对于任何有向图,邻接矩阵(每边 1 位)会消耗 n^2 * (1) 位内存。

对于完整图,邻接列表(带有64位指针)消耗n * (n * 64) 位内存,不包括列表开销。

对于不完整的图,邻接列表消耗 0 位内存,不包括列表开销。


对于邻接列表,您可以使用以下公式来确定邻接矩阵达到最佳内存之前的最大边数 (e)。

edges = n^2 / s 确定最大边数,其中s 是平台的指针大小。

如果您的图是动态更新的,则可以通过 n / s 的平均边数(每个节点)来维持这种效率。


一些使用 64 位指针和动态图的示例(动态图在更改后有效地更新问题的解决方案,而不是每次更改后都从头开始重新计算。)

对于有向图,其中 n< /code> 为 300,使用邻接列表时每个节点的最佳边数为:

= 300 / 64
= 4

如果我们将其代入 keyer5053 的公式,d = e / n^2 (其中 e 是总边数),我们可以看到我们低于断点(1 / s):

d = (4 * 300) / (300 * 300)
d < 1/64
aka 0.0133 < 0.0156

但是,64 位的指针可能有点过大了。
如果您使用 16 位整数作为指针偏移量,我们可以在断点之前容纳最多 18 个边。

= 300 / 16
= 18

d = ((18 * 300) / (300^2))
d < 1/16
aka 0.06 < 0.0625

这些示例中的每一个都忽略了邻接列表本身的开销(对于向量和 64 位指针来说是64*2)。

To add to keyser5053's answer about memory usage.

For any directed graph, an adjacency matrix (at 1 bit per edge) consumes n^2 * (1) bits of memory.

For a complete graph, an adjacency list (with 64 bit pointers) consumes n * (n * 64) bits of memory, excluding list overhead.

For an incomplete graph, an adjacency list consumes 0 bits of memory, excluding list overhead.


For an adjacency list, you can use the follow formula to determine the maximum number of edges (e) before an adjacency matrix is optimal for memory.

edges = n^2 / s to determine the maximum number of edges, where s is the pointer size of the platform.

If you're graph is dynamically updating, you can maintain this efficiency with an average edge count (per node) of n / s.


Some examples with 64 bit pointers and dynamic graph (A dynamic graph updates the solution of a problem efficiently after changes, rather than recomputing it from scratch each time after a change has been made.)

For a directed graph, where n is 300, the optimal number of edges per node using an adjacency list is:

= 300 / 64
= 4

If we plug this into keyser5053's formula, d = e / n^2 (where e is the total edge count), we can see we are below the break point (1 / s):

d = (4 * 300) / (300 * 300)
d < 1/64
aka 0.0133 < 0.0156

However, 64 bits for a pointer can be overkill.
If you instead use 16bit integers as pointer offsets, we can fit up to 18 edges before breaking point.

= 300 / 16
= 18

d = ((18 * 300) / (300^2))
d < 1/16
aka 0.06 < 0.0625

Each of these examples ignore the overhead of the adjacency lists themselves (64*2 for a vector and 64 bit pointers).

如日中天 2024-08-27 18:13:01

根据邻接矩阵的实现,为了有效的实现,应该提前知道图的“n”。如果图表过于动态并且需要时不时地扩展矩阵,这也可以算作一个缺点吗?

Depending on the Adjacency Matrix implementation the 'n' of the graph should be known earlier for an efficient implementation. If the graph is too dynamic and requires expansion of the matrix every now and then that can also be counted as a downside?

野の 2024-08-27 18:13:01

如果您使用哈希表而不是邻接矩阵或列表,您将获得更好或相同的大 O 运行时间和所有操作的空间(检查边的时间为 O(1),获取所有相邻边的时间为O(度),等等)。

不过,运行时和空间都有一些恒定的因素开销(哈希表不像链表或数组查找那么快,并且需要相当多的额外空间来减少冲突)。

If you use a hash table instead of either adjacency matrix or list, you'll get better or same big-O run-time and space for all operations (checking for an edge is O(1), getting all adjacent edges is O(degree), etc.).

There's some constant factor overhead though for both run-time and space (hash table isn't as fast as linked list or array lookup, and takes a decent amount extra space to reduce collisions).

没有伤那来痛 2024-08-27 18:13:01

我只想谈谈克服常规邻接列表表示的权衡,因为其他答案已经涵盖了这些方面。

通过利用DictionaryHashSet数据结构,可以在摊销常数时间内通过EdgeExists查询来表示邻接列表中的图。这个想法是将顶点保存在字典中,对于每个顶点,我们保留一个哈希集,引用它具有边的其他顶点。

此实现中的一个小权衡是,它将具有 O(V + 2E) 的空间复杂度,而不是常规邻接列表中的 O(V + E),因为这里边被表示两次(因为每个顶点都有自己的哈希集)的边缘)。但是,使用此实现,AddVertexAddEdgeRemoveEdge 等操作可以在摊余时间 O(1) 内完成,RemoveVertex 除外,这将是 O(V) 摊销,就像具有数组索引查找字典的邻接矩阵一样。这意味着除了实现简单之外,邻接矩阵没有任何特定的优势。在这个邻接表实现中,我们可以节省稀疏图的空间,并且具有几乎相同的性能。

有关详细信息,请查看 Github C# 存储库中的下面的实现。请注意,对于加权图,它使用嵌套字典而不是字典哈希集组合来容纳权重值。类似地,对于有向图,in & 也有单独的哈希集。出边缘。

Advanced-Algorithms

注意:我相信使用惰性删除我们可以进一步优化RemoveVertex 操作到 O(1) 摊销,尽管我还没有测试过这个想法。例如,删除时只需在字典中将顶点标记为已删除,然后在其他操作期间延迟清除孤立边。

I am just going to touch on overcoming the trade-off of regular adjacency list representation, since other answers have covered those aspects.

It is possible to represent a graph in adjacency list with EdgeExists query in amortized constant time, by taking advantage of Dictionary and HashSet data structures. The idea is to keep vertices in a dictionary, and for each vertex, we keep a hash set referencing to other vertices it has edges with.

One minor trade-off in this implementation is that it will have space complexity O(V + 2E) instead of O(V + E) as in regular adjacency list, since edges are represented twice here (because each vertex have its own hash set of edges). But operations such as AddVertex, AddEdge, RemoveEdge can be done in amortized time O(1) with this implementation, except for RemoveVertex, which would be O(V) amortized like in adjacency matrix with an array index lookup dictionary. This would mean that other than implementation simplicity, adjacency matrix don't have any specific advantage. We can save space on sparse graph with almost the same performance in this adjacency list implementation.

Take a look at implementations below in Github C# repository for details. Note that for weighted graph it uses a nested dictionary instead of dictionary-hash set combination so as to accommodate weight value. Similarly for directed graph there is separate hash sets for in & out edges.

Advanced-Algorithms

Note: I believe using lazy deletion we can further optimize RemoveVertex operation to O(1) amortized, even though I haven't tested that idea. For example, upon deletion just mark the vertex as deleted in dictionary, and then lazily clear orphaned edges during other operations.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文