Does using large libraries inherently make slower code?

Posted on 2024-08-20 18:22:33

I have a psychological tic which makes me reluctant to use large libraries (like GLib or Boost) in lower-level languages like C and C++. In my mind, I think:

Well, this library has thousands of
man hours put into it, and it's been
created by people who know a lot more
about the language than I ever will.
Their authors and fans say that
the libraries are fast and reliable,
and the functionality looks really
useful, and it will certainly stop me
from (badly) reinventing wheels.

But damn it, I'm never going to use
every function in that library. It's
too big and it's probably become bloated
over the years; it's another ball
and chain my program needs to drag around.

The Torvalds rant (controversial though it is) doesn't exactly put my heart at ease either.

Is there any basis to my thinking, or am I merely unreasonable and/or ignorant? Even if I only use one or two features of a large library, by linking to that library am I going to incur runtime performance overheads?

I'm sure it depends too on what the specific library is, but I'm generally interested in knowing whether large libraries will, at a technical level, inherently introduce inefficiencies.

I'm tired of obsessing and muttering and worrying about this, when I don't have the technical knowledge to know if I'm right or not.

Please put me out of my misery!

Comments (17)

堇年纸鸢 2024-08-27 18:22:33

Even if I only use one or two features of a large library, by linking to that library am I going to incur runtime performance overheads?

In general, no.

If the library in question doesn't have a lot of position-independent code, then there will be a start-up cost while the dynamic linker performs relocations on the library when it's requested. Usually, that's part of the program's start-up. There is no run-time performance effect beyond that.

Linkers are also good at removing "dead code" from statically-linked libraries at build time, so any static libraries you use will have minimal size overhead. Performance doesn't even enter into it.

Frankly, you're worrying about the wrong things.

习惯成性 2024-08-27 18:22:33

I can't comment on GLib, but keep in mind that a lot of the code in Boost is header-only and given the C++ principle of the user only paying for what they're using, the libraries are pretty efficient. There are several libraries that require you to link against them (regex, filesystem come to mind) but they're separate libraries. With Boost you do not link against a large monolithic library but only against the smaller components that you do use.

Of course, the other question is - what is the alternative? Do you want to implement the functionality that is in Boost yourself when you need it? Given that a lot of very competent people have worked on this code and ensured that it works across a multitude of compilers and still is efficient, this might not exactly be a simple undertaking. Plus you're reinventing the wheel, at least to a certain extent. IMHO you can spend this time more productively.

或十年 2024-08-27 18:22:33

Boost isn't a big library.

It is a collection of many small libraries. Most of them are so small they're contained in a header or two. Using boost::noncopyable doesn't drag boost::regex or boost::thread into your code. They're different libraries. They're just distributed as part of the same library collection. But you only pay for the ones you use.

But speaking generally, because big libraries do exist, even if Boost isn't one of them:

Is there any basis to my thinking, or am I merely unreasonable and/or ignorant? Even if I only use one or two features of a large library, by linking to that library am I going to incur runtime performance overheads?

No basis, more or less.
You can test it yourself.

Write a small C++ program and compile it. Now add a new function to it, one which is never called, but is defined. Compile the program again. Assuming optimizations are enabled, it gets stripped out by the linker because it's unused. So the cost of including additional unused code is zero.
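A minimal sketch of that experiment, written in C with hypothetical names (`used_sum`, `unused_product`). Whether the unused function actually disappears depends on the toolchain, so treat the result as something to check rather than assume. With GCC and GNU ld, for instance, an unused function with external linkage is generally only dropped when the code is built with `-ffunction-sections` and linked with `-Wl,--gc-sections`.

```c
/* Called by the program, so the linker has to keep it. */
int used_sum(int a, int b)
{
    return a + b;
}

/* Defined but never called by the program: a candidate for removal
   at link time, depending on the compiler and linker options used. */
int unused_product(int a, int b)
{
    return a * b;
}
```

Call `used_sum` from your own `main`, build the program with and without the unused function, and compare the two executables with tools such as `size` or `nm` to see whether `unused_product` survived.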

Of course there are exceptions. If the code instantiates any global objects, those might not be removed (that's why including the iostream header increases the executable size), but in general, you can include as many headers and link to as many libraries as you like, and it won't affect the size, performance or memory usage of your program as long as you don't use any of the added code.

Another exception is that if you dynamically link to a .dll or .so, the entire library must be distributed, and so it can't be stripped of unused code. But libraries that are statically compiled into your executable (either as static libraries (.lib or .a) or just as included header files) can usually be trimmed down by the linker, removing unused symbols.

晒暮凉 2024-08-27 18:22:33

A large library will, from the code performance perspective:

  • occupy more memory, if it has a runtime binary (most parts of Boost don't require runtime binaries; they're "header-only"). While the OS will load only the actually used parts of the library into RAM, it can still load more than you need, because the granularity of what's loaded is equal to the page size (only 4 KB on my system, though).
  • take more time for the dynamic linker to load, if, again, it needs runtime binaries. Each time your program is loaded, the dynamic linker has to match each function you need from the external library with its actual address in memory. It takes some time, but just a little (however, it matters at the scale of loading many programs, such as the startup of a desktop environment, but you don't have a choice there).

    And yes, it will take one extra jump and a couple of pointer adjustments at runtime each time you call an external function of a shared (dynamically linked) library.
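The "one extra jump" can be pictured with a simplified model in C. This is only a sketch under stated assumptions: `offset_table`, `resolve` and `lib_add` are hypothetical names, and a real dynamic linker does this work with generated stubs and tables rather than ordinary C code. The point is just that the call site reads the target address from a table instead of jumping to a fixed address.

```c
#include <stddef.h>

/* Stand-in for a function that would live in a shared library. */
static int lib_add(int a, int b)
{
    return a + b;
}

/* Hypothetical one-entry table of function addresses. The call site does
   not know the target address at build time and reads it from here. */
typedef int (*binary_fn)(int, int);
static binary_fn offset_table[1] = { NULL };

/* Fill the entry on first use, roughly in the spirit of lazy binding.
   A real loader would search the library's symbol tables at this point. */
static binary_fn resolve(size_t slot)
{
    if (offset_table[slot] == NULL) {
        offset_table[slot] = lib_add;
    }
    return offset_table[slot];
}

/* An "external" call: one table load plus an indirect jump per call. */
int call_through_table(int a, int b)
{
    return resolve(0)(a, b);
}

/* A direct call, for comparison. */
int call_direct(int a, int b)
{
    return lib_add(a, b);
}
```

Both functions return the same result; the difference is only the indirection on the way to `lib_add`, which is the small per-call cost described above.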

And from a developer's performance perspective, it will:

  • add an external dependency. You will be depending on someone else. Even if that library is free software, modifying it will cost you extra effort. Some developers of veeery low-level programs (I'm talking about OS kernels) hate to rely on anyone--that's their professional perk. Thus the rants.

    However, that can be considered a benefit. If other people are used to Boost, they will find familiar concepts and terms in your program and will be more effective at understanding and modifying it.

  • Bigger libraries usually contain library-specific concepts that take time to understand. Consider Qt. It contains signals and slots and moc-related infrastructure. Compared to the size of the whole Qt, learning them takes a small fraction of time. But if you use a small part of such a big library, that can be an issue.

芸娘子的小脾气 2024-08-27 18:22:33

Excess code doesn't magically make the processor run slower. All it does is sit there occupying a little bit of memory.

If you're statically linking and your linker is at all reasonable, then it will only include the functions that you actually use anyway.

浴红衣 2024-08-27 18:22:33

The term I like for frameworks, library sets, and some types of development tools, is platform technologies. Platform technologies have costs beyond impact on code size and performance.

  1. If your project is itself intended to be used as a library or framework, you may end up pushing your platform technology choices on developers that use your library.

  2. If you distribute your project in source form, you may end up pushing platform technology choices on your end users.

  3. If you do not statically link all your chosen frameworks and libraries, you may end up burdening your end users with library versioning issues.

  4. Compile time affects developer productivity. Incremental linking, precompiled headers, proper header dependency management, etc., can help manage compile times, but do not eliminate the compiler performance problems associated with the massive amounts of inline code some platform technologies introduce.

  5. For projects that are distributed as source, compile time affects the end users of the project.

  6. Many platform technologies have their own development environment requirements. These requirements can accumulate making it difficult and time consuming for new developers on a project to be able to replicate the environment needed to allow compiling and debugging.

  7. Using some platform technologies in effect creates a new programming language for the project. This makes it harder for new developers to contribute.

All projects have platform technology dependencies, but for many projects there are real benefits to keeping these dependencies to a minimum.

把回忆走一遍 2024-08-27 18:22:33

There may be a small overhead when loading these libraries if they're dynamically linked. This will typically be a tiny, tiny fraction of the time your program spends running.

However there will be no overhead once everything is loaded.

If you don't want to use all of boost, then don't. It's modular, so you can use the parts you want and ignore the rest.

旧城空念 2024-08-27 18:22:33

Bigger doesn't inherently imply slower. Contrary to some of the other answers, there's no inherent difference between libraries stored entirely in headers and libraries stored in object files either.

Header-only libraries can have an indirect advantage. Most template-based libraries have to be header-only (or a lot of the code ends up in headers anyway), and templates do give a lot of opportunities for optimization. Taking code in a typical object-file library and moving it all into headers will not, however, usually have many good effects (and could lead to code bloat).

The real answer for a particular library will usually depend on its overall structure. It's easy to think of "Boost" as something huge. In fact, it's a huge collection of libraries, most of which are individually quite small. You can't say very much (meaningfully) about Boost as a whole, because the individual libraries are written by different people, with different techniques, goals, etc. A few of them (e.g. Format, Assign) really are slower than almost anything you'd be very likely to do on your own. Others (e.g. Pool) provide things you could do yourself, but probably won't, to get at least minor speed improvements. A few (e.g. uBlas) use heavy-duty template magic to run faster than any but a tiny percentage of us can hope to achieve on our own.

There are, of course, quite a few libraries that really are individually large libraries. In quite a few cases, these really are slower than what you'd write yourself. In particular, many (most?) of them attempt to be much more general than almost anything you'd be at all likely to write on your own. While that doesn't necessarily lead to slower code, there's definitely a strong tendency in that direction. As with a lot of other code, when you're developing libraries commercially, customers tend to be a lot more interested in features than in things like size or speed.

Some libraries also devote a lot of space, code (and often at least bits of time) to solving problems you may very well not care about at all. Just for example, years ago I used an image processing library. Its support for 200+ image formats sounded really impressive (and in a way it really was) but I'm pretty sure I never used it to deal with more than about a dozen formats (and I could probably have gotten by supporting only half that many). OTOH, even with all that it was still pretty fast. Supporting fewer formats might have restricted their market to the point that the code would actually have been slower (just for example, it handled JPEGs faster than IJG).

-柠檬树下少年和吉他 2024-08-27 18:22:33

As others have said, there is some overhead when adding a dynamic library. When the library is first loaded, relocations must be performed, although this should be a minor cost if the library is compiled correctly. The cost of looking up individual symbols is also increased since the number of libraries that need to be searched is increased.

The cost in memory of adding another dynamic library depends largely on how much of it you actually use. A page of code will not be loaded from disk until something on it is executed. However, other data such as headers, symbol tables, and hash tables built into the library file will be loaded, and these are generally proportional to the size of the library.

There is a great document by Ulrich Drepper, the lead contributor to glibc, that describes the process and the overhead of dynamic libraries.

眉黛浅 2024-08-27 18:22:33

It depends on how the linker works. Some linkers are lazy and will include all the code in a library. The more efficient linkers will only extract the needed code from a library. I have had experience with both types.

Smaller libraries are less of a worry with either type of linker. The worst case with a small library is a small amount of unused code. Many small libraries may increase the build time. The trade-off is build time vs. code space.

An interesting test of the linker is the classic Hello World program:

#include <stdio.h>
#include <stdlib.h>
int main(void)
{
  printf("Hello World\n");
  return EXIT_SUCCESS;
}

The printf function has a lot of dependencies due to all the formatting that it may need. A lazy but fast linker may include a whole "standard library" to resolve all the symbols. A more efficient linker will only include printf and its dependencies. This makes the linker slower.

The above program can be compared to this one using puts:

#include <stdio.h>
#include <stdlib.h>
int main(void)
{
  /* puts appends the newline itself */
  puts("Hello World");
  return EXIT_SUCCESS;
}

Generally, the puts version should be smaller than the printf version, because puts has no formatting needs and thus fewer dependencies. Lazy linkers will generate the same code size as for the printf program.

In summary, decisions about library size depend mostly on the linker, specifically on its efficiency. When in doubt, many small libraries will rely less on the efficiency of the linker, but will make the build process more complicated and slower.

空城旧梦 2024-08-27 18:22:33

  1. The thing to do with performance concerns, in general, is not to entertain them, because to do so is to be guessing that they are a problem, because if you don't know they are, you are guessing, and guessing is the central concept behind "premature optimization". The thing to do with performance problems is, when you have them, and not before, diagnose them. The problems are almost never something you would have guessed. Here's an extended example.

  2. If you do that a fair amount, you will come to recognize the design approaches that tend to cause performance problems, whether in your code or in a library. (Libraries can certainly have performance problems.) When you learn that and apply it to projects then in a sense you are prematurely optimizing, but it has the desired effect anyway, of avoiding problems. If I can summarize what you will probably learn, it is that too many layers of abstraction, and overblown class hierarchies (especially those full of notification-style updating) are what are very often the reasons for performance problems.
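As a rough illustration of measuring instead of guessing, here is a small C sketch that times one call of a workload with the standard `clock()` function. It is not the extended example the answer refers to, and the names `sum_to` and `cpu_seconds_for` are hypothetical. A profiler would normally tell you more; this only shows the idea of checking a suspected cost before acting on it.

```c
#include <time.h>

/* Hypothetical workload standing in for the code under suspicion. */
unsigned long sum_to(unsigned long n)
{
    unsigned long total = 0;
    for (unsigned long i = 1; i <= n; ++i) {
        total += i;
    }
    return total;
}

/* Return the CPU time, in seconds, spent in one call of sum_to(n).
   The computed sum is stored through `result` so the work is observable. */
double cpu_seconds_for(unsigned long n, unsigned long *result)
{
    clock_t start = clock();
    *result = sum_to(n);
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Swapping a library call in for `sum_to` and comparing against a hand-written alternative gives a first, coarse answer to whether the library is the part worth worrying about.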

At the same time, I share your circumspection about 3rd-party libraries and such. Too many times I have worked on projects where some 3rd-party package was "leveraged" for "synergy", and then the vendor either went up in smoke or abandoned the product or had it go obsolete because Microsoft changed things in the OS. Then our product that leaned heavily on the 3rd-party package starts not working, requiring a big expenditure on our part while the original programmers are long gone.

一念一轮回 2024-08-27 18:22:33

"another ball and chain". Really?

Or is it a stable, reliable platform that enables your application in the first place?

Consider that some folks may like a "too big and ... bloated" library because they use it for other projects and really trust it.

Indeed, they may decline to mess with your software specifically because you avoided using the obvious "too big and ... bloated" library.

翻了热茶 2024-08-27 18:22:33

Technically, the answer is that yes, they do. However, these inefficiencies are very seldom practically important. I'm going to assume a statically compiled language like C, C++, or D here.

When an executable is loaded into memory on a modern OS, address space is simply mapped to it. This means that, no matter how big the executable is, if there are entire page-size blocks of code that aren't used, they will never touch physical memory. You will waste address space, though, and occasionally this can matter a little on 32-bit systems.
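The arithmetic behind this can be sketched in C. The 4096-byte page size used in the usage note is an assumption (another answer mentions 4 KB on its author's system); on POSIX systems the real value can be queried with `sysconf(_SC_PAGESIZE)`. The function names are hypothetical.

```c
#include <stddef.h>

/* Number of whole pages needed to map `bytes` bytes (rounded up). */
size_t pages_spanned(size_t bytes, size_t page_size)
{
    return (bytes + page_size - 1) / page_size;
}

/* Physical memory consumed if only `pages_touched` of those pages
   are ever accessed and therefore actually loaded. */
size_t resident_bytes(size_t pages_touched, size_t page_size)
{
    return pages_touched * page_size;
}
```

For example, a 10 MiB library spans `pages_spanned(10 * 1024 * 1024, 4096)` = 2560 pages of address space, but if the program only ever executes code on 3 of them, `resident_bytes(3, 4096)` = 12288 bytes of that code need to be resident.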

When you link to a library, a good linker will usually throw out excess stuff that you don't use, though especially in the case of template instantiations this doesn't always happen. Thus your binaries might be a little bit bigger than strictly necessary.

If you have code that you don't use heavily interleaved with code that you do use, you can end up wasting space in your CPU cache. However, as cache lines are small (usually 64 bytes), this will seldom happen to a practically important extent.

椒妓 2024-08-27 18:22:33

Ask yourself what your target is. Is it a mid-range workstation of today? No problem. Is it older hardware or even a limited embedded system? Then it might be.

As previous posters have said, just having the code there does not cost you much in performance (it might reduce the locality for the caches and increase loading times).

吃兔兔 2024-08-27 18:22:33

fwiw, I work on Microsoft Windows, and when we build Windows, builds compiled for SIZE are faster than builds compiled for SPEED because you take fewer page fault hits.

伴梦长久 2024-08-27 18:22:33

FFTW and ATLAS are two quite large libraries. Oddly enough, they play large roles in the fastest software in the world, applications optimized to run on supercomputers. No, using large libraries doesn't make your code slow, especially when the alternative is implementing FFT or BLAS routines for yourself.

み青杉依旧 2024-08-27 18:22:33

You are very right to be worried, especially when it comes to boost. It's not so much due to anyone writing them being incompetent but due to two issues.

  1. Templates are just inherently bloated code. This didn't matter as much 10 years ago, but nowadays the CPU is much faster than memory access and this trend continues. I'd almost say templates are an obsolescent feature.

It's not so bad for user code, which is usually somewhat practical, but in many libraries everything is defined in terms of other templates or templated on multiple items (meaning exponential template code explosions).

Simply adding in iostream adds about 3 MB (!!!) to your code. Now add in some Boost nonsense and you have 30 MB of code if you simply declare a couple of particularly weird data structures.

Worse, you can't even easily profile this. I can tell you the difference between code written by me and code from template libraries is DRAMATIC, but for a more naive approach you may think you are doing worse from a simple test, but the cost in code bloat will take its toll in a large real-world app.

  2. Complexity. When you look at the things in Boost, they are all things that complicate your code to a huge degree. Things like smart pointers, functors, all sorts of complicated stuff. Now, I won't say it's never a good idea to use this stuff, but pretty much all of it has a big cost of some kind. Especially if you don't understand exactly, I mean exactly, what it's doing.

But people rave about it and pretend it has something to do with 'design' so people get the impression it is the way you should do everything, not just some extremely specialized tools that should be used seldom. If ever.
