Static linking vs dynamic linking
Are there any compelling performance reasons to choose static linking over dynamic linking or vice versa in certain situations? I've heard or read the following, but I don't know enough on the subject to vouch for its veracity.
1) The difference in runtime performance between static linking and dynamic linking is usually negligible.
2) (1) is not true if using a profiling compiler that uses profile data to optimize program hotpaths because with static linking, the compiler can optimize both your code and the library code. With dynamic linking only your code can be optimized. If most of the time is spent running library code, this can make a big difference. Otherwise, (1) still applies.
Some edits to include the very relevant suggestions in the comments and in other answers. I'd like to note that the way you break on this depends a lot on what environment you plan to run in. Minimal embedded systems may not have enough resources to support dynamic linking. Slightly larger small systems may well support dynamic linking because their memory is small enough to make the RAM savings from dynamic linking very attractive. Full-blown consumer PCs have, as Mark notes, enormous resources, and you can probably let the convenience issues drive your thinking on this matter.
To address the performance and efficiency issues: it depends.
Classically, dynamic libraries require some kind of glue layer which often means double dispatch or an extra layer of indirection in function addressing and can cost a little speed (but is the function calling time actually a big part of your running time???).
However, if you are running multiple processes which all call the same library a lot, you can end up saving cache lines (and thus winning on running performance) when using dynamic linking relative to using static linking. (Unless modern OS's are smart enough to notice identical segments in statically linked binaries. Seems hard, does anyone know?)
Another issue: loading time. You pay loading costs at some point. When you pay this cost depends on how the OS works as well as what linking you use. Maybe you'd rather put off paying it until you know you need it.
Note that static-vs-dynamic linking is traditionally not an optimization issue, because both involve separate compilation down to object files. However, this is not required: a compiler could, in principle, "compile" "static libraries" to a digested AST form initially, and "link" them by adding those ASTs to the ones generated for the main code, thus enabling global optimization. None of the systems I use do this, so I can't comment on how well it works.
The way to answer performance questions is always by testing, using a test environment as much like the deployment environment as possible.
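As a concrete sketch of that testing approach (all file names below are invented for illustration, and a gcc toolchain is assumed): build one toy library both ways, link the same program against each, and measure.

```shell
cat > mylib.c <<'EOF'
/* toy stand-in for the "library code" being discussed */
double accumulate(int n) {
    double s = 0;
    for (int i = 1; i <= n; i++) s += 1.0 / i;
    return s;
}
EOF
cat > main.c <<'EOF'
#include <stdio.h>
double accumulate(int n);
int main(void) { printf("%.4f\n", accumulate(10000000)); return 0; }
EOF

gcc -O2 -c mylib.c -o mylib.o
ar rcs libmylib.a mylib.o                     # static archive
gcc -O2 -fPIC -shared mylib.c -o libmylib.so  # shared object

gcc -O2 main.c libmylib.a -o app_static
gcc -O2 main.c -L. -lmylib -Wl,-rpath,'$ORIGIN' -o app_dynamic

time ./app_static    # measure on hardware resembling deployment
time ./app_dynamic
```

On a real project the "library" would be your actual dependencies and the measurement would exercise your actual workload; the point is only that both builds come from the same source, so any difference you see is attributable to the linking mode.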
1) is based on the fact that calling a DLL function always uses an extra indirect jump. Today, this is usually negligible. Inside the DLL there is some more overhead on i386 CPUs, because they can't generate position-independent code. On amd64, jumps can be relative to the program counter, so this is a huge improvement.
2) This is correct. With optimizations guided by profiling you can usually win about 10-15 percent performance. Now that CPU speeds have reached their limits it might be worth doing.
I would add: (3) the linker can arrange functions in a more cache-efficient grouping, so that expensive cache-level misses are minimised. It may also particularly affect the startup time of applications (based on results I have seen with the Sun C++ compiler).
And don't forget that with DLLs no dead code elimination can be performed. Depending on the language, the DLL code might not be optimal either. Virtual functions are always virtual because the compiler doesn't know whether a client is overriding them.
For these reasons, if there is no real need for DLLs, just use static compilation.
EDIT (to answer the comment by user underscore)
Here is a good resource about the position-independent code problem: http://eli.thegreenplace.net/2011/11/03/position-independent-code-pic-in-shared-libraries/
As explained there, x86 AFAIK does not have position-relative addressing for anything other than 15-bit jump ranges, and not for unconditional jumps and calls. That's why functions (from generators) larger than 32K have always been a problem and needed embedded trampolines.
But on a popular x86 OS like Linux you do not need to care whether the .so/DLL file was generated with the gcc switch -fpic (which enforces the use of indirect jump tables). Because if you don't, the code is just fixed up the way a normal linker would relocate it. But while doing this it makes the code segment non-shareable, and it would need a full mapping of the code from disk into memory, touching it all before it can be used (emptying most of the caches, hitting TLBs), etc. There was a time when this was considered slow. So you would not have any benefit anymore.
I do not recall which OS (Solaris or FreeBSD) gave me problems with my Unix build system, because I just wasn't doing this and wondered why it crashed until I applied -fPIC to gcc.
Dynamic linking is the only practical way to meet some license requirements such as the LGPL.
I agree with the points dnmckee mentions, plus:
One reason to do a statically linked build is to verify that you have full closure for the executable, i.e. that all symbol references are resolved correctly.
As a part of a large system that was being built and tested using continuous integration, the nightly regression tests were run using a statically linked version of the executables. Occasionally, we would see that a symbol would not resolve and the static link would fail even though the dynamically linked executable would link successfully.
This usually occurred when symbols deep inside the shared libs had a misspelt name and so would not link statically. The dynamic linker does not completely resolve all symbols, irrespective of using depth-first or breadth-first evaluation, so you can end up with a dynamically linked executable that does not have full closure.
1/ I've been on projects where dynamic linking vs static linking was benchmarked, and the difference wasn't judged small enough to switch to dynamic linking (I wasn't part of the test; I just know the conclusion).
2/ Dynamic linking is often associated with PIC (Position Independent Code, code which doesn't need to be modified depending on the address at which it is loaded). Depending on the architecture, PIC may bring another slowdown, but it is needed in order to get the benefit of sharing a dynamically linked library between two executables (and even two processes of the same executable, if the OS uses randomization of the load address as a security measure). I'm not sure that all OSes allow separating the two concepts, but Solaris and Linux do, and ISTR that HP-UX does as well.
3/ I've been on other projects which used dynamic linking for the "easy patch" feature. But this "easy patch" makes the distribution of a small fix a little easier, and of a complicated one a versioning nightmare. We often ended up having to push everything, plus having to track problems at customer sites because the wrong version was taken.
My conclusion is that I'd use static linking except:
for things like plugins which depend on dynamic linking
when sharing is important (big libraries used by multiple processes at the same time, like the C/C++ runtime, GUI libraries, ..., which often are managed independently and for which the ABI is strictly defined)
If one wants to use the "easy patch", I'd argue that the libraries have to be managed like the big libraries above: they must be nearly independent, with a defined ABI that must not be changed by fixes.
Static linking vs Dynamic linking
Static linking is a process at compile time where the linked content is copied into the primary binary, which becomes a single binary.
Dynamic linking is a process at run time where the linked content is loaded. This technique allows for ABI stability.
[iOS Static vs Dynamic framework]
A good example of dynamic linking is when the library depends on the hardware being used. In ancient times, the C math library was made dynamic so that each platform could use all its processor capabilities to optimize it.
An even better example might be OpenGL. OpenGL is an API that is implemented differently by AMD and NVidia. And you are not able to use an NVidia implementation on an AMD card, because the hardware is different. You cannot link OpenGL statically into your program because of that. Dynamic linking is used here to let the API be optimized for all platforms.
It is pretty simple, really. When you make a change in your source code, do you want to wait 10 minutes for it to build or 20 seconds? Twenty seconds is all I can put up with. Beyond that, I either get out the sword or start thinking about how I can use separate compilation and linking to bring it back into the comfort zone.
Dynamic linking requires extra time for the OS to find the dynamic library and load it. With static linking, everything is together and it is a one-shot load into memory.
Also, see DLL Hell. This is the scenario where the DLL that the OS loads is not the one that came with your application, or the version that your application expects.
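On Linux the equivalent question — which copies of which libraries will actually be loaded? — can be answered before trouble strikes (illustrative commands; any dynamically linked binary will do in place of /bin/ls):

```shell
# Which shared objects, from which paths, the loader will pick:
ldd /bin/ls

# Or trace the loader's search order in detail (output goes to stderr):
LD_DEBUG=libs /bin/ls >/dev/null
```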
On Unix-like systems, dynamic linking can make life difficult for 'root' to use an application with the shared libraries installed in out-of-the-way locations. This is because the dynamic linker generally won't pay attention to LD_LIBRARY_PATH or its equivalent for processes with root privileges. Sometimes, then, static linking saves the day.
Alternatively, the installation process has to locate the libraries, but that can make it difficult for multiple versions of the software to coexist on the machine.
Another issue not yet discussed is fixing bugs in the library.
With static linking, you not only have to rebuild the library, but you will also have to relink and redistribute the executable. If the library is just used in one executable, this may not be an issue. But the more executables that need to be relinked and redistributed, the bigger the pain is.
With dynamic linking, you just rebuild and redistribute the dynamic library and you are done.
Static linking includes the files that the program needs in a single executable file.
Dynamic linking is what you would consider the usual: it makes an executable that still requires DLLs and such to be in the same directory (or the DLLs could be in the system folder).
(DLL = dynamic link library)
Dynamically linked executables are compiled faster and aren't as resource-heavy.
There are a vast and increasing number of systems where an extreme level of static linking can have an enormous positive impact on applications and system performance.
I refer to what are often called "embedded systems", many of which are now increasingly using general-purpose operating systems, and these systems are used for everything imaginable.
An extremely common example is devices using GNU/Linux systems with Busybox. I've taken this to the extreme with NetBSD by building a bootable i386 (32-bit) system image that includes both a kernel and its root filesystem, the latter of which contains a single static-linked (by crunchgen) binary with hard-links to all the programs, and that binary itself contains all (well, at last count, 274) of the standard full-featured system programs (most everything except the toolchain), and it is less than 20 megabytes in size (and it probably runs very comfortably in a system with only 64MB of memory, even with the root filesystem uncompressed and entirely in RAM, though I've been unable to find one so small to test it on).
It has been mentioned in earlier posts that the start-up time of static-linked binaries is faster (and it can be a lot faster), but that is only part of the picture, especially when all object code is linked into the same file, and even more especially when the operating system supports demand paging of code directly from the executable file. In this ideal scenario the startup time of programs is literally negligible, since almost all pages of code will already be in memory and in use by the shell (and by init and any other background processes that might be running), even if the requested program has never been run since boot, since perhaps only one page of memory need be loaded to fulfill the runtime requirements of the program.
However that's still not the whole story. I also usually build and use NetBSD operating system installs for my full development systems by static-linking all binaries. Even though this takes tremendously more disk space (~6.6GB total for x86_64 with everything, including the toolchain and X11 static-linked; another ~2.5GB if one keeps full debug symbol tables available for all programs), the result still runs faster overall, and for some tasks even uses less memory than a typical dynamic-linked system that purports to share library code pages. Disk is cheap (even fast disk), and memory to cache frequently used disk files is also relatively cheap, but CPU cycles really are not, and paying the ld.so startup cost for every process that starts, every time it starts, will take hours and hours of CPU cycles away from tasks which require starting many processes, especially when the same programs are used over and over, such as compilers on a development system. Static-linked toolchain programs can reduce whole-OS multi-architecture build times for my systems by hours. I have yet to build the toolchain into my single crunchgen'ed binary, but I suspect when I do there will be more hours of build time saved because of the win for the CPU cache.
Static linking gives you only a single exe; in order to make a change you need to recompile your whole program, whereas with dynamic linking you only need to make the change to the DLL, and when you run your exe the changes are picked up at runtime. It's easier to provide updates and bug fixes with dynamic linking (e.g.: Windows).
Another consideration is the number of object files (translation units) that you actually consume from a library versus the total number available. If a library is built from many object files but you only use symbols from a few of them, this might be an argument for favoring static linking, since (typically) you only link the objects that you actually use and don't carry the unused symbols. If you go with a shared lib, that lib contains all the translation units and could be much larger than what you want or need.