Good C header style

Posted on 2024-10-26 09:02:12

My C headers usually resemble the following style to avoid multiple inclusion:

#ifndef <FILENAME>_H
#define <FILENAME>_H

// define public data structures / prototypes, macros etc.

#endif  /* !<FILENAME>_H */

However, in his Notes on Programming in C, Rob Pike makes the following argument about header files:

There's a little dance involving #ifdef's that can prevent a file being read twice, but it's usually done wrong in practice - the #ifdef's are in the file itself, not the file that includes it. The result is often thousands of needless lines of code passing through the lexical analyzer, which is (in good compilers) the most expensive phase.

On the one hand, Pike is the only programmer I actually admire. On the other hand, putting several #ifdefs in multiple source files instead of putting one #ifdef in a single header file feels needlessly awkward.
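To make the two placements concrete, here is a minimal single-file sketch. The header name point.h, the macro POINT_H, struct point, and point_sum are all hypothetical, and the header's contents are pasted inline so that the sketch compiles on its own.

```c
/* Contents of a hypothetical point.h, pasted inline so this sketch is one file. */
#ifndef POINT_H          /* internal guard: lives inside the header itself */
#define POINT_H

struct point {
    int x;
    int y;
};

#endif /* !POINT_H */

/*
 * External guard, the placement Pike describes: the including file tests
 * the macro before the #include. POINT_H is already defined above, so the
 * preprocessor skips this group and never opens point.h (which is why the
 * sketch compiles even though no such file exists).
 */
#ifndef POINT_H
#include "point.h"
#endif

/* Hypothetical helper that uses the type declared by the header. */
int point_sum(struct point p)
{
    return p.x + p.y;
}
```

The awkwardness mentioned above shows up here: every including file has to repeat the three-line test and has to know the header's guard macro name.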

What is the best way to handle the problem of multiple inclusion?

泛滥成性 2024-11-02 09:02:12

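In my opinion, use the method that takes less of your time (which likely means putting the #ifdefs in the header files). I don't really mind if the compiler has to work harder, as long as my resulting code is cleaner. If you are working on a multi-million-line code base that you constantly have to rebuild from scratch, the extra savings may be worth it. In most cases, though, I suspect the extra cost is not noticeable.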

还如梦归 2024-11-02 09:02:12

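Keep doing what you're doing. It is clear, less bug-prone, and well known to compiler writers, so it is not as inefficient as it may once have been.

You can use the non-standard #pragma once. If you search, there is probably at least a bookshelf's worth of discussion on include guards versus #pragma once, so I'm not going to recommend one over the other.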

夜访吸血鬼 2024-11-02 09:02:12

Pike wrote some more about it in https://talks.golang.org/2012/splash.article:

In 1984, a compilation of ps.c, the source to the Unix ps command, was
observed to #include <sys/stat.h> 37 times by the time all the
preprocessing had been done. Even though the contents are discarded 36
times while doing so, most C implementations would open the file, read
it, and scan it all 37 times. Without great cleverness, in fact, that
behavior is required by the potentially complex macro semantics of the
C preprocessor.

Compilers have become quite clever since: https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html, so this is less of an issue now.
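As a rough illustration of what that page describes, as I understand it: GCC's preprocessor remembers the guard macro of a header whose entire contents, apart from comments and whitespace, sit inside a single #ifndef/#endif pair, and it skips reopening the file for as long as that macro stays defined. The sketch below pastes a hypothetical counter.h twice to stand in for two #include lines; COUNTER_H, struct counter, and counter_bump are assumed names.

```c
/* First "inclusion" of a hypothetical counter.h (contents pasted inline). */
#ifndef COUNTER_H
#define COUNTER_H

/* Nothing but comments and whitespace sits outside the #ifndef/#endif pair. */
struct counter {
    unsigned hits;
};

static inline unsigned counter_bump(struct counter *c)
{
    return ++c->hits;
}

#endif /* !COUNTER_H */

/*
 * Second "inclusion": COUNTER_H is now defined, so this whole group is
 * skipped. Without the guard, struct counter would be redefined and the
 * file would not compile.
 */
#ifndef COUNTER_H
#define COUNTER_H

struct counter {
    unsigned hits;
};

static inline unsigned counter_bump(struct counter *c)
{
    return ++c->hits;
}

#endif /* !COUNTER_H */
```

A header that puts declarations before the #ifndef or after the #endif still works correctly with a guard, but it no longer has the shape described above, so the file may be reopened and rescanned on every inclusion.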

The construction of a single C++ binary at Google can open and read
hundreds of individual header files tens of thousands of times. In
2007, build engineers at Google instrumented the compilation of a
major Google binary. The file contained about two thousand files that,
if simply concatenated together, totaled 4.2 megabytes. By the time
the #includes had been expanded, over 8 gigabytes were being delivered
to the input of the compiler, a blow-up of 2000 bytes for every C++
source byte.

As another data point, in 2003 Google's build system was moved from a
single Makefile to a per-directory design with better-managed, more
explicit dependencies. A typical binary shrank about 40% in file size,
just from having more accurate dependencies recorded. Even so, the
properties of C++ (or C for that matter) make it impractical to verify
those dependencies automatically, and today we still do not have an
accurate understanding of the dependency requirements of large Google
C++ binaries.

The point about binary sizes is still relevant. Compilers (and linkers) are quite conservative about stripping unused symbols. See the related question "How to remove unused C/C++ symbols with GCC and ld?"

In Plan 9, header files were forbidden from containing further
#include clauses; all #includes were required to be in the top-level C file. This required some discipline, of course—the programmer was
required to list the necessary dependencies exactly once, in the
correct order—but documentation helped and in practice it worked very
well.
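A minimal sketch of what that rule looks like in practice, assuming a hypothetical buf.h whose contents are pasted inline so the example is a single file. The names buf.c, buf.h, struct buf, and buf_remaining are illustrative and do not come from Plan 9.

```c
/* A hypothetical buf.c laid out under the Plan 9 rule. */

/*
 * Dependencies are listed once, in order, by the top-level C file.
 * <stddef.h> must come before buf.h because buf.h uses size_t and is
 * not allowed to include anything itself.
 */
#include <stddef.h>

/* Contents of the hypothetical buf.h, pasted inline: no #include lines. */
struct buf {
    const char *data;
    size_t len;
};
size_t buf_remaining(const struct buf *b, size_t used);

/* Definition belonging to buf.c itself: bytes left after `used` bytes. */
size_t buf_remaining(const struct buf *b, size_t used)
{
    return used < b->len ? b->len - used : 0;
}
```

If the <stddef.h> line were moved below the pasted header, size_t would be undeclared at the point of use. That is the ordering discipline the quoted passage refers to.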

This is a possible solution. Another possibility is to have a tool that manages the includes for you, for example MakeDeps.

There are also unity builds, sometimes called SCU (single compilation unit) builds. There are tools to help manage that, like https://github.com/sakra/cotire

Using a build system that optimizes for the speed of incremental compilation can be advantageous too. I am talking about Google's Bazel and similar. It does not protect you from a change in a header file that is included in a large number of other files, though.

Finally, there is a proposal for C++ modules in the works, which is great stuff: https://groups.google.com/a/isocpp.org/forum/#!forum/modules. See also the question "What exactly are C++ modules?"

若无相欠,怎会相见 2024-11-02 09:02:12

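The way you're currently doing it is the common way. Pike's method cuts compilation time a little, but with modern compilers probably not by much (when Pike wrote his notes, compilers weren't optimizer-bound). It also clutters up modules and is bug-prone.

You can still cut down on multiple inclusion by not including headers from headers, and instead documenting the dependency with a note such as "include the required header before including this one."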

抠脚大汉 2024-11-02 09:02:12

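I recommend putting the guards in the file itself. On today's PCs there is no need to complain about a few thousand needlessly parsed lines of code.

Besides, checking every header in every source file that includes it means far more work and far more source code.

You would also have to treat your own header files differently from the standard headers and from third-party headers.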

淡紫姑娘！ 2024-11-02 09:02:12

He may well have had a point at the time he wrote this. These days, decent compilers are smart enough to handle it well.

放我走吧 2024-11-02 09:02:12

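I agree with your approach. As others have commented, it is clearer, self-documenting, and cheaper to maintain.

My theory about why Rob Pike suggested his approach: he is talking about C, not C++.

In C++, if you have a lot of classes and declare each one in its own header file, you end up with a lot of header files. C doesn't really provide that kind of fine-grained structure (I don't recall seeing many single-struct C header files), and .h/.c file pairs tend to be larger and to contain something like a module or a subsystem. So there are fewer header files. In that scenario Rob Pike's approach might work, but I don't see it as suitable for non-trivial C++ programs.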
