Improving the performance of graph connectivity computation
I am writing a program that generates a graph and checks whether it is connected. The code is below. Some explanation: I generate a number of points at random locations on the plane. I then connect the nodes, but NOT based on proximity alone: a node is more likely to be connected to nodes that are closer, and whether a link exists is determined by the distance and by a random variable (h_sq in the code). I generate all links (symmetric, i.e., if i can talk to j, then j can talk to i) and then run a BFS to check whether the graph is connected.
The code seems to work properly. My problem is that once the number of nodes exceeds ~2000 it becomes terribly slow, and I need to run this function many times for simulation purposes. I even tried other graph libraries, but the performance was the same.
Does anybody know how I could speed this up?
Thanks,
int Graph::gen_links() {
    if( save == true ) { // in case I want to store the structure of the graph
        // Only clear here: push_back below appends, so resizing first would
        // leave xy.size() empty pairs at the front of links.
        links.clear();
    }
    double h_sq, d;
    vector< vector<luint> > neighbors(xy.size());
    // generate links
    double tmp = snr_lin / gamma_0_lin;
    // xy is a std vector of pairs containing the nodes' locations
    for(luint i = 0; i < xy.size(); i++) {
        for(luint j = i+1; j < xy.size(); j++) {
            // generate |h|^2
            d = distance(i, j);
            if( d < d_crit ) // for sim purposes
                d = 1.0;
            h_sq = pow(mrand.randNorm(0, 1), 2.0) + pow(mrand.randNorm(0, 1), 2.0);
            if( h_sq * tmp >= pow(d, alpha) ) {
                // there exists a link between i and j
                neighbors[i].push_back(j);
                neighbors[j].push_back(i);
                // options
                if( save == true )
                    links.push_back( make_pair(i, j) );
            }
        }
        // all pairs involving node i have been examined at this point
        if( neighbors[i].empty() && save == false ) {
            // graph not connected. since save=false I don't need to store the structure,
            // hence I exit
            connected = 0;
            return 1;
        }
    }
    // here I do BFS to check whether the graph is connected or not, using neighbors
    // BFS code...
    return 1;
}
UPDATE:
The main problem seems to be the push_back calls inside the inner for loop; that is where most of the time is spent in this case. Should I use reserve() to improve efficiency?
Comments (2)
Are you sure the slowness is caused by the generation and not by your search algorithm?
The graph generation is O(n^2) and you can't do much about that. However, if the point locations are fixed for at least some of the experiments, you can trade memory for time.
First, the distances of all node pairs, and pow(d, alpha), can be precomputed and kept in memory so that you don't compute them again and again. For 10000 nodes the extra memory cost of a full n-by-n table is about 800 MB for double and 400 MB for float.
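A minimal sketch of that precomputation, assuming the point locations stay fixed between runs. The `Point` alias and the functions `precompute_thresholds` and `pair_index` are hypothetical names, not part of the question's class. Only the upper triangle (i < j) is stored, which halves the memory compared with a full n-by-n table.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;  // hypothetical: (x, y) of a node

// Precompute pow(d, alpha) for every pair i < j, stored row by row as a
// flat upper triangle. Distances below d_crit are clamped to 1.0, as in
// the question's code.
std::vector<double> precompute_thresholds(const std::vector<Point>& xy,
                                          double alpha, double d_crit) {
    const std::size_t n = xy.size();
    std::vector<double> thr;
    thr.reserve(n * (n - 1) / 2);  // one entry per unordered pair
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            double d = std::hypot(xy[i].first - xy[j].first,
                                  xy[i].second - xy[j].second);
            if (d < d_crit) d = 1.0;
            thr.push_back(std::pow(d, alpha));
        }
    }
    return thr;
}

// Index of pair (i, j), with i < j < n, in the flat upper triangle.
std::size_t pair_index(std::size_t i, std::size_t j, std::size_t n) {
    return i * n - i * (i + 1) / 2 + (j - i - 1);
}
```

Inside the double loop of gen_links the pairs are visited in exactly this order, so a running counter can replace the call to `pair_index`, and the test becomes `h_sq * tmp >= thr[k++]`.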
In addition, the sum of squares of normal variables follows a chi-square distribution, if I remember correctly. If the accuracy is acceptable, you could probably use a precomputed lookup table.
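For this particular case a table may not even be needed: the sum of squares of two independent standard normal variables is chi-square with 2 degrees of freedom, which is the same as an exponential distribution with mean 2. The sketch below assumes the two randNorm(0, 1) draws in the question are independent standard normals, and it uses the standard `<random>` header instead of the question's `mrand` object; `sample_h_sq` and `link_probability` are hypothetical names.

```cpp
#include <cmath>
#include <random>

// |h|^2 = X^2 + Y^2 with X, Y ~ N(0,1) independent is chi-square with
// 2 degrees of freedom, i.e. exponential with mean 2 (rate 0.5).
// One draw replaces two normal draws and two pow calls.
double sample_h_sq(std::mt19937_64& rng) {
    std::exponential_distribution<double> dist(0.5);
    return dist(rng);
}

// Probability that h_sq * tmp >= threshold, where threshold = pow(d, alpha)
// and tmp > 0. Follows from the exponential tail: P(E >= t) = exp(-t / 2).
double link_probability(double threshold, double tmp) {
    return std::exp(-threshold / (2.0 * tmp));
}
```

With `link_probability` precomputed per pair, each link decision reduces to comparing one uniform random number against a stored value.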
Finally, if the probability that two nodes are connected becomes negligible once the distance exceeds some value, you don't need the full O(n^2) loop: you can evaluate only those node pairs whose distance is below that limit.
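One way to enumerate only the nearby pairs is a uniform grid (cell list). The sketch below is an illustration under assumptions, not the question's code: `r_max` is a hypothetical cutoff that you would have to pick so that links beyond it are rare enough to ignore, and ignoring them changes the model slightly. `Point` and `candidate_pairs` are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

using Point = std::pair<double, double>;  // hypothetical: (x, y) of a node

// Return all pairs (i, j), i < j, whose distance is at most r_max, using a
// uniform grid with cell side r_max so that only the 3x3 block of cells
// around each point has to be scanned.
std::vector<std::pair<std::size_t, std::size_t>>
candidate_pairs(const std::vector<Point>& xy, double r_max) {
    auto cell = [r_max](double v) {
        return static_cast<long long>(std::floor(v / r_max));
    };
    auto key = [](long long cx, long long cy) {
        // Pack two cell coordinates into one key (assumes each fits in 32 bits).
        return (static_cast<unsigned long long>(cx) << 32) ^
               (static_cast<unsigned long long>(cy) & 0xffffffffULL);
    };

    // Bucket every point index by the cell it falls into.
    std::unordered_map<unsigned long long, std::vector<std::size_t>> grid;
    for (std::size_t i = 0; i < xy.size(); ++i)
        grid[key(cell(xy[i].first), cell(xy[i].second))].push_back(i);

    std::vector<std::pair<std::size_t, std::size_t>> pairs;
    for (std::size_t i = 0; i < xy.size(); ++i) {
        const long long cx = cell(xy[i].first), cy = cell(xy[i].second);
        for (long long dx = -1; dx <= 1; ++dx) {
            for (long long dy = -1; dy <= 1; ++dy) {
                auto it = grid.find(key(cx + dx, cy + dy));
                if (it == grid.end()) continue;
                for (std::size_t j : it->second) {
                    if (j <= i) continue;  // report each pair once
                    const double d = std::hypot(xy[i].first - xy[j].first,
                                                xy[i].second - xy[j].second);
                    if (d <= r_max) pairs.emplace_back(i, j);
                }
            }
        }
    }
    return pairs;
}
```

The random link test from the question would then run only over the returned pairs. When the points are spread out relative to `r_max`, this is far fewer than n(n-1)/2 pairs.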
As a first step you should try to use reserve for both inner and outer vectors.
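A minimal sketch of what reserving could look like for the adjacency lists. `make_adjacency` and `expected_degree` are hypothetical; the right value for `expected_degree` is something you would measure from your own runs.

```cpp
#include <cstddef>
#include <vector>

// Build empty adjacency lists with capacity reserved up front, so that
// push_back does not reallocate until a node exceeds expected_degree links.
// expected_degree is a hypothetical tuning parameter (e.g. a measured average).
std::vector<std::vector<std::size_t>>
make_adjacency(std::size_t n, std::size_t expected_degree) {
    std::vector<std::vector<std::size_t>> neighbors(n);
    for (auto& list : neighbors) list.reserve(expected_degree);
    return neighbors;
}
```

Note that this still performs one heap allocation per node, and those allocations are repeated on every call unless the outer vector is kept alive and its inner vectors are only clear()ed between runs (clear() keeps the capacity).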
If this does not bring performance up to your expectations, I believe it is because memory allocations are still happening.
There is a handy class I've used in similar situations, llvm::SmallVector (easy to find with a search). It provides a vector with a few pre-allocated items, so you can reduce the number of allocations by one per vector.
It can still grow when it runs out of pre-allocated space.
So:
1) Examine the number of items you have in your vectors on average during runs (I'm talking about both inner and outer vectors)
2) Put in llvm::SmallVector with a pre-allocation of that size (the pre-allocated items live inside the SmallVector object itself, so if the object is on the stack you might need to increase the stack size, or reduce the pre-allocation if available stack memory is limited).
Another good thing about SmallVector is that it has almost the same interface as std::vector, so it can easily be dropped in as a replacement.