Is linker performance related to swap space?
Sometimes it's handy to mock something up with a little C program that uses a big chunk of static memory. After switching to Fedora 15, I noticed that such a program took a long time to compile: we're talking 30s vs. 0.1s. Even weirder, ld (the linker) was maxing out the CPU and slowly eating all available memory. After some fiddling I managed to find a correlation between this new problem and the size of my swap file. Here's an example program for the purposes of this discussion:
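#include <string.h>
#include <stdlib.h>
#include <stdio.h>

#define M 1000000
#define GIANT_SIZE (200*M)

/* Zero-initialized static array: 200M elements of size_t. */
size_t g_arr[GIANT_SIZE];

int main( int argc, char **argv){
    int i;
    for(i = 0; i<10; i++){
        /* %zu is the correct conversion for size_t (the original used %d). */
        printf("This should be zero: %zu\n",g_arr[i]);
    }
    exit(1);
}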
This program has a giant array with a declared size of about 200M elements * 8 bytes = 1.6GB of static memory. Compiling this program takes an inordinate amount of time:
[me@bleh]$ time gcc HugeTest.c
real 0m12.954s
user 0m6.995s
sys 0m3.890s
[me@bleh]$
13s for a ~13-line C program!? That's not right. The key number is the size of the static memory space. As soon as it is larger than the total swap space, the program compiles quickly again. For example, I have 5.3GB of swap space, so changing GIANT_SIZE to (1000*M) gives the following time:
[me@bleh]$ time gcc HugeTest.c
real 0m0.087s
user 0m0.026s
sys 0m0.027s
Ah, that's more like it! To further convince myself (and yourself, if you're trying this at home) that swap space was indeed the magic number, I tried changing the available swap space to a truly massive 19GB and compiling the (1000*M) version again:
[me@bleh]$ ls -ali /extraswap
5986 -rw-r--r-- 1 root root 14680064000 Jul 26 15:01 /extraswap
[me@bleh]$ sudo swapon /extraswap
[me@bleh]$ time gcc HugeTest.c
real 4m28.089s
user 0m0.016s
sys 0m0.010s
It didn't even complete after 4.5 minutes!
Clearly the linker is doing something wrong here, but I don't know how to work around this other than rewriting the program or messing around with swap space. I'd love to know if there's a solution, or if I've stumbled upon some arcane bug.
By the way, the programs all compile and run correctly, independent of all the swap business.
For reference, here is some possibly relevant information:
[]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 27027
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
[]$ uname -r
2.6.40.6-0.fc15.x86_64
[]$ ld --version
GNU ld version 2.21.51.0.6-6.fc15 20110118
Copyright 2011 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty.
[]$ gcc --version
gcc (GCC) 4.6.1 20110908 (Red Hat 4.6.1-9)
Copyright (C) 2011 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
[]$ cat /proc/meminfo
MemTotal: 3478272 kB
MemFree: 1749388 kB
Buffers: 16680 kB
Cached: 212028 kB
SwapCached: 368056 kB
Active: 489688 kB
Inactive: 942820 kB
Active(anon): 401340 kB
Inactive(anon): 803436 kB
Active(file): 88348 kB
Inactive(file): 139384 kB
Unevictable: 32 kB
Mlocked: 32 kB
SwapTotal: 19906552 kB
SwapFree: 17505120 kB
Dirty: 172 kB
Writeback: 0 kB
AnonPages: 914972 kB
Mapped: 60916 kB
Shmem: 1008 kB
Slab: 55248 kB
SReclaimable: 26720 kB
SUnreclaim: 28528 kB
KernelStack: 3608 kB
PageTables: 63344 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 21645688 kB
Committed_AS: 11208980 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 139336 kB
VmallocChunk: 34359520516 kB
HardwareCorrupted: 0 kB
AnonHugePages: 151552 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 730752 kB
DirectMap2M: 2807808 kB
TL;DR: When the (large) static memory of a C program is slightly less than the available swap space, the linker takes forever to link the program. However, it's quite snappy when the static space is slightly larger than the available swap space. What's up with that!?
Comments (3)
I am able to reproduce this on an Ubuntu 10.10 system (GNU ld (GNU Binutils for Ubuntu) 2.20.51-system.20100908), and I think I have your answer. First, some methodology. After confirming that this happens in a small VM (512MB RAM, 2GB swap), I decided the easiest thing to do would be to strace gcc and see what exactly was going on when everything went to hell:
It illuminated the following:
It would appear that, as we might have suspected, ld is actually trying to anonymously mmap the entire static memory space of this array (or possibly the entire program; it's hard to tell, since the rest of the program is so small that it might all fit in that extra 4096).
So that's all well and good, but why does it work when we exceed the available swap on the system? Let's turn swap off (swapoff) and run strace -f again...
Unsurprisingly, ld seems to do the same thing it tried last time: mmap the entire space. But the system is no longer able to do that, so it fails! ld tries again, and it fails again, then ld does something unexpected... it moves on with less memory.
Weird, I guess we'd better have a look at the ld code then. Drat, it doesn't do an explicit mmap. This must be coming from inside a plain old malloc. We'll have to build ld with some debug symbols to track this down. Unfortunately, when I built binutils 2.21.1 the problem went away. Perhaps it's been fixed in newer versions of binutils?
I don't observe this behavior (with Debian/Sid/AMD64 on an 8GB desktop, gcc 4.6.2, binutils gold ld (GNU Binutils for Debian 2.22) 1.11). Here is the changed program (displaying its memory map with pmap).
Here is its compilation:
and its execution:
I believe that installing a recent GCC (e.g. GCC 4.6) with the binutils gold linker is significant for such programs.
I don't hear any swapping involved.
I torture-tested my OpenSuse 11.4 (going to 12.1 in a week).
I have 4GiB RAM + 2GiB swap and did not notice any serious slowdown. The system might have been thrashing at times, but the compile time was still short.
The longest was 6 seconds, while swapping heavily.
Between runs I loaded and unloaded VirtualBox VMs, Eclipse, and large PDF files, with my Firefox alone using 800+ MiB. I did not go to the limit, otherwise many apps would have been killed by the OS. It has a preference for killing Firefox.. :-)
I also went to the extreme, defining:
and even then nothing changed significantly.
Edit:
I re-tested using Fedora 16 on a VM with 512MiB RAM and 1.5GiB swap, and things were similar except for an error message on my "maximum stress version", where 20000 megabytes were assigned to the array. The error says the array size was negative.
The same response happens in an OpenSuse 12.1 VM. The Fedora 16 install seemed very slow and memory hungry (during install I had to use 800MiB, versus 512MiB for OpenSuse), and I could not use swapoff on Fedora because it was using a lot of swap space. I had neither sluggishness nor memory problems on OpenSuse 12.1. Both have essentially the same versions of kernel, gcc, etc. Both are stock installs with KDE as the desktop environment.
I could not reproduce your issue; maybe it is a gcc-related issue. Try downloading an older version like 4.5 and see what happens.