Does using the STL increase footprint significantly?

Posted on 2024-07-10 02:48:37

Does using the STL increase footprint significantly? Could you share your experience with this? What are the best practices for building a small-footprint library?

Comments (8)

歌入人心 2024-07-17 02:48:38

While it does not address STL templates specifically, the GCC documentation has a short section on minimizing code bloat when using templates.

The link is
http://gcc.gnu.org/onlinedocs/gcc-4.3.2/gcc/Template-Instantiation.html#Template-Instantiation
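One technique in this area is explicit instantiation, which lets you choose the single translation unit where a template is instantiated. The sketch below is only an illustration: `Stack` and the file names in the comments are made up, and it uses the standard C++11 `extern template` syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// ---- stack.h (hypothetical header) ----
template <typename T>
class Stack {
public:
    void push(const T& v) { data_.push_back(v); }

    T pop() {
        assert(!data_.empty());  // precondition: stack must not be empty
        T v = data_.back();
        data_.pop_back();
        return v;
    }

    std::size_t size() const { return data_.size(); }

private:
    std::vector<T> data_;
};

// Explicit instantiation declaration: translation units that include the
// header are told not to instantiate Stack<int> implicitly.
extern template class Stack<int>;

// ---- stack.cpp (hypothetical source file) ----
// Explicit instantiation definition: the one place where all members of
// Stack<int> are actually generated.
template class Stack<int>;
```

Note that members defined inside the class are still inline, so the compiler may inline them at call sites anyway; the declaration mainly avoids emitting out-of-line copies in every object file.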

っ〆星空下的拥抱 2024-07-17 02:48:38

Ignoring the STL discussion for the moment, there is one non-obvious important practice for building a low-space-footprint static library. Partition your library into as many disparate compilation units as you can. For example, if you look at libpthread.a, you'll see that every single function has its own compilation unit. Many linkers will throw out dead code based on whole compilation units, but not any more fine-grained than that. If I only use a few functions from the pthreads library, my linker will bring in only those definitions, and nothing else. If, on the other hand, the entire library were compiled into a single object file, my linker would have to bring in the entire library as a single "unit".

This depends on the toolchain you're using, and it only applies if you're building static libraries. But I have seen it make a very measurable difference for large libraries built sub-optimally.
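A rough sketch of that layout, with purely hypothetical file and function names. Each function lives in its own source file, so each ends up in its own object file inside the static archive, and a linker that works per archive member can leave out `to_hex` when a program only calls `checksum`. With GCC and GNU ld, `-ffunction-sections` combined with `-Wl,--gc-sections` can give a similar per-function effect, but that is toolchain-specific.

```cpp
// ---- checksum.cpp (hypothetical: one function, one object file) ----
#include <cassert>
#include <cstddef>
#include <cstdint>

// Sums all bytes in the buffer.
std::uint32_t checksum(const unsigned char* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += data[i];
    }
    return sum;
}

// ---- to_hex.cpp (hypothetical: a second, independent object file) ----
#include <string>

// Formats a 32-bit value as 8 lowercase hex digits.
std::string to_hex(std::uint32_t v) {
    static const char digits[] = "0123456789abcdef";
    std::string out(8, '0');
    for (std::size_t i = out.size(); i-- > 0;) {
        out[i] = digits[v & 0xFu];
        v >>= 4;
    }
    return out;
}
```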

夢归不見 2024-07-17 02:48:38

STL specializations for pointer types can share the same implementation, because void * has the same size as int * or foo *. So while:

vector<int> and vector<foo> are different implementations,

vector<int *> and vector<foo *> can share much of the same implementation. Many STL implementations do this to save memory footprint.

If the entire template is defined in a header, so that its full definition is visible, compilers like g++ will automatically generate a copy in each compilation unit that uses the class. As the others said, the linker will remove the duplicate definitions.

At higher optimization levels, they will also automatically inline methods defined inside the class definition, but this can be controlled with compiler options.

云淡月浅 2024-07-17 02:48:38

On an embedded project I once did with a 64 KB limit, I couldn't even link the standard C library. So it depends on what you need to do.

苏辞 2024-07-17 02:48:37

There's no one answer since STL is a set of templates. Templates, by their very nature, are only compiled in when used. So you can include all of STL and if none of it is actually used, the footprint added by STL will be zero. If you have a very small app that manages to use a lot of different templates with different specializations, the footprint can be large.
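A small sketch of the "only compiled in when used" point. `Box` is a made-up type: its `length()` member would not compile for `int`, yet `Box<int>` works fine as long as nobody calls `length()` on it, because member function bodies of a class template are only instantiated on use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

template <typename T>
struct Box {
    T value;

    T get() const { return value; }

    // Only valid when T has a size() member. For Box<int> this body is
    // never instantiated unless someone actually calls length().
    std::size_t length() const { return value.size(); }
};

// Usage sketch: Box<int> compiles because length() is never used on it.
inline int box_demo() {
    Box<int> number{42};
    Box<std::string> text{"abc"};
    assert(text.length() == 3);  // length() is instantiated only for std::string
    return number.get();
}
```

Adding a call such as `number.length()` to `box_demo` would turn this into a compile error, which shows the member really was not generated before.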

初吻给了烟 2024-07-17 02:48:37

The thing about the STL is that, because it is all templates, it only adds to the size when you actually use it. If you don't use a method, that method is not instantiated.

But there will always be a cost for the things you use.

The real question you have to ask is: will the size be bigger or smaller than your own implementation? If you don't use the STL, what do you use? You can hand-write your own, but that has its own cost. It is not zero space, it will not be as well tested, and you will not be following established best practices.

So in reality, no, it does not bloat your code, because adding the equivalent functionality by some other method will add just as much code; it just will not be as well tested.

Another post states that template code must be inlined or have multiple definitions.
This is absolutely WRONG.

The methods are marked as inline, but it is up to the compiler whether a method is actually inlined. Compiler inlining is sophisticated enough to inline only when that helps the optimization strategy being used.

If a method is not inlined, a copy of it will be generated in every compilation unit that uses it. But a REQUIREMENT for a C++ linker is that it must remove ALL but ONE copy of these methods when the application is linked into an executable. If the linker did not remove the extra copies, it would have to generate a multiple-definition linker error.

This is easy to show. The following illustrates that:

  • The method _M_insert_aux() is not inlined.
  • It is placed in both compilation units.
  • Only one copy of the method is in the final executable.

a.cpp

#include <vector>

void a(std::vector<int>& l)
{
    l.push_back(1);
    l.at(0) = 2;
}

b.cpp

#include <vector>

void b(std::vector<int>& l)
{
    l.push_back(1);
    l.at(0) = 2;
}

main.cpp

#include <vector>

void a(std::vector<int>&);
void b(std::vector<int>&);

int main()
{
    std::vector<int>    x;
    a(x);
    b(x);
}

Checking

>g++ -c a.cpp
>g++ -c b.cpp

>nm a.o
<removed other stuff>
000000a0 S __ZNSt6vectorIiSaIiEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPiS1_EERKi
<removed other stuff>

>nm b.o
<removed other stuff>
000000a0 S __ZNSt6vectorIiSaIiEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPiS1_EERKi
<removed other stuff>

>c++filt __ZNSt6vectorIiSaIiEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPiS1_EERKi
std::vector<int, std::allocator<int> >::_M_insert_aux(__gnu_cxx::__normal_iterator<int*, std::vector<int, std::allocator<int> > >, int const&)

>g++ a.o b.o main.cpp
>nm a.out | grep __ZNSt6vectorIiSaIiEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPiS1_EERKi
00001700 T __ZNSt6vectorIiSaIiEE13_M_insert_auxEN9__gnu_cxx17__normal_iteratorIPiS1_EERKi

柒七 2024-07-17 02:48:37

Using the STL will increase your binary size for two basic reasons:

Inlining

By default, the compiler treats template code as inline. Therefore, if you use std::list<int> in several different compilation units, and the compiler inlines that code, they'll each have their own local inline definitions of std::list<int> functions (which probably isn't a big deal, since the compiler will only inline very small definitions by default).

Note that (as Martin points out elsewhere) multiply-defined symbols are stripped out by all modern C++ linkers on the big platforms, as described in the GCC documentation. Thus, if the compiler leaves template code out-of-line, the linker will remove duplicates.

Specialization

Because of the very nature of C++ templates, std::list<int> is potentially arbitrarily different from std::list<double>. In fact, the standard specifies std::vector<bool> as a separate specialization intended to pack its elements as bits, so most of the operations there are completely different from those of the default std::vector<T>.
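To make the std::vector<bool> difference concrete, here is a small sketch. The helper `flip_first` is invented for illustration; the point is that `operator[]` on `std::vector<bool>` returns a proxy object rather than a `bool&`, so this specialization really is separate code from `std::vector<int>`.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// For the primary template, reference is a plain T&.
static_assert(std::is_same<std::vector<int>::reference, int&>::value,
              "vector<int>::reference is int&");

// For the bool specialization, reference is a proxy class, not bool&.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool>::reference is a proxy type");

// Flips the first element through the proxy and returns the new value.
inline bool flip_first(std::vector<bool>& bits) {
    assert(!bits.empty());
    auto ref = bits[0];  // proxy object that refers to the stored bit
    ref = !ref;          // writes through the proxy into the vector
    return bits[0];
}
```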

Only the library maintainer can deal with this. One solution is to take the core functionality and "un-templatize" it: turn it into a C-style data structure with void* everywhere. Then the template interface that downstream developers see is a thin wrapper. This reduces the amount of duplicated code because the template specializations all share a common basis.
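A minimal sketch of that "un-templatize" idea, with made-up names (`PtrStackCore`, `PtrStack`). The core works on `void*` and is not a template, so its code exists once; the typed wrapper only adds casts, which typically compile down to nothing. In a real library the core's member functions would live in a single .cpp file rather than in the header.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-template core: one copy of this code regardless of element type.
class PtrStackCore {
public:
    void push(void* p) { items_.push_back(p); }

    void* pop() {
        assert(!items_.empty());  // precondition: stack must not be empty
        void* p = items_.back();
        items_.pop_back();
        return p;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<void*> items_;
};

// Thin typed wrapper: each instantiation only forwards and casts.
// Assumes T is not const-qualified, since T* must convert to void*.
template <typename T>
class PtrStack {
public:
    void push(T* p) { core_.push(p); }
    T* pop() { return static_cast<T*>(core_.pop()); }
    std::size_t size() const { return core_.size(); }

private:
    PtrStackCore core_;
};
```

The wrapper does not own the pointed-to objects; it only stores the pointers, so lifetime management stays with the caller.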

何以笙箫默 2024-07-17 02:48:37

I assume you mean runtime memory footprint, and thus STL containers.

STL containers are efficient for what they are...general purpose containers. If you are deciding between writing your own doubly-linked list or using std::list, please...use STL. If you are considering writing very domain-specific, bit-packed containers for each of your specific needs, use STL first and then choose your battles once all your code is working correctly.

Some good practices:

  • If your library is going to expose these containers via its API, you may have to choose between putting your STL code in library headers or not using the STL. The problem is that my compiler doesn't have to implement the STL the same way yours does.
  • Read up on how and when STL containers allocate memory. When you can visualize how a deque grows and shrinks compared to a vector, you will be better prepared to decide which to use.
  • If you need to micro-manage every byte, consider writing custom allocators. This is rarely necessary outside of embedded systems.
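
One way to see when a container allocates is to plug in a counting allocator. This is only a sketch: `CountingAllocator`, `allocation_count` and `allocations_for` are hypothetical helpers, the counter is a plain global that is not thread-safe, and exact counts depend on your standard library's growth policy.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Global counter used only by this sketch (not thread-safe).
inline std::size_t& allocation_count() {
    static std::size_t count = 0;
    return count;
}

// Minimal C++11 allocator that counts calls to allocate().
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++allocation_count();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}

// Counts allocations made while pushing n ints, with or without reserve().
inline std::size_t allocations_for(std::size_t n, bool reserve_first) {
    allocation_count() = 0;
    std::vector<int, CountingAllocator<int>> v;
    if (reserve_first) {
        v.reserve(n);
    }
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
    }
    assert(v.size() == n);
    return allocation_count();
}
```

Comparing `allocations_for(1000, true)` with `allocations_for(1000, false)` shows how much reallocation a single `reserve` call avoids; the same allocator can be dropped into a `std::deque` to compare its behavior.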