Multiple classes in a header file vs. a single header file per class
For whatever reason, our company has a coding guideline that states:
Each class shall have its own header and implementation file.
So if we wrote a class called MyString, we would need an associated MyString.h and MyString.cxx.
Does anyone else do this? Has anyone seen any compile-time performance repercussions as a result? Do 5000 classes in 10000 files compile just as quickly as 5000 classes in 2500 files? If not, is the difference noticeable?
[We code C++ and use GCC 3.4.4 as our everyday compiler]
We do that at work; it's just easier to find stuff if the class and files have the same name. As for performance, you really shouldn't have 5000 classes in a single project. If you do, some refactoring might be in order.
That said, there are instances when we have multiple classes in one file: when a class is just a private helper for the main class of the file.
Most places where I have worked have followed this practice. I've actually written coding standards for BAE (Aust.), along with the reasons behind them, instead of just carving something in stone with no real justification.
Concerning your question about source files, it's not so much compile time as being able to find the relevant code snippet in the first place. Not everyone uses an IDE. Knowing that you just look for MyClass.h and MyClass.cpp really saves time compared to running "grep MyClass *.(h|cpp)" over a bunch of files and then filtering out the #include MyClass.h statements...
Mind you, there are workarounds for the impact of large numbers of source files on compile times. See Large Scale C++ Software Design by John Lakos for an interesting discussion.
You might also like to read Code Complete by Steve McConnell for an excellent chapter on coding guidelines. Actually, this book is a great read that I keep coming back to regularly.
N.B. You need the first edition of Code Complete, which is easily available online. The interesting section on coding and naming guidelines didn't make it into Code Complete 2.
In addition to simply being "clearer", separating classes into separate files makes it easier for multiple developers not to step on each other's toes. There will be less merging when it comes time to commit changes to your version control tool.
+1 for separation. I just came onto a project where some classes are in files with a different name, or lumped in with another class, and it is impossible to find these in a quick and efficient manner. You can throw more resources at a build - you can't make up lost programmer time because (s)he can't find the right file to edit.
I found these guidelines particularly useful when it comes to header files:
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml#Header_Files
The same rule applies here, but it notes a few exceptions where it is allowed, like so:
It's common practice to do this, especially so that you can include each .h only in the files that need it. Of course compile performance is affected, but try not to think about this problem until it arises :).
It's better to start with the files separated and, only if you really need to, later merge the .h files that are commonly used together to improve performance. It all comes down to the dependencies between files, and this is very specific to each project.
The best practice, as others have said, is to place each class in its own translation unit from a code maintenance and understandability perspective. However, on large-scale systems this is sometimes not advisable - see the section entitled "Make Those Source Files Bigger" in this article by Bruce Dawson for a discussion of the tradeoffs.
Two words: Ockham's Razor. Keep one class per file with the corresponding header in a separate file. If you do otherwise, like keeping a piece of functionality per file, then you have to create all kinds of rules about what constitutes a piece of functionality. There is much more to gain by keeping one class per file. And even half-decent tools can handle large quantities of files. Keep it simple, my friend.
I'm surprised that almost everyone is in favor of having one file per class. The problem with that is that in the age of 'refactoring' one may have a hard time keeping the file and class names in sync. Every time you change a class name, you then have to change the file name too, which means that you also have to make a change everywhere the file is included.
I personally group related classes into a single file and then give that file a meaningful name that won't have to change even if a class name changes. Having fewer files also makes scrolling through a file tree easier.
I use Visual Studio on Windows and Eclipse CDT on Linux, and both have shortcut keys that take you straight to a class declaration, so finding a class declaration is easy and quick.
Having said that, I think once a project is completed, or its structure has 'solidified', and name changes become rare, it may make sense to have one class per file. I wish there were a tool that could extract classes and place them in distinct .h and .cpp files. But I don't see this as essential.
The choice also depends on the type of project one works on. In my opinion the issue doesn't deserve a black-and-white answer, since either choice has pros and cons.
It is very helpful to have only one class per file, but if you do your building via bulk-build files that include all the individual C++ files, compilation gets faster, since startup time is relatively large for many compilers.
The term here is translation unit, and you really want to (if possible) have one class per translation unit, i.e., one class implementation per .cpp file, with a corresponding .h file of the same name.
It's usually more efficient (from a compile/link standpoint) to do things this way, especially if you're doing things like incremental linking and so forth. The idea is that translation units are isolated such that, when one translation unit changes, you don't have to rebuild a lot of stuff, as you would if you started lumping many abstractions into a single translation unit.
Also, you'll find many errors/diagnostics are reported via file name ("Error in Myclass.cpp, line 22"), and it helps if there's a one-to-one correspondence between files and classes. (Or I suppose you could call it a 2-to-1 correspondence.)
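As a rough illustration of the one-class-per-translation-unit layout described above, here is a minimal single-file sketch. The "----" comments mark where each real file would begin; the member functions (length, empty) and the client file are assumptions made up for the example, not part of any real MyString class.

```cpp
// Single-file sketch; the "----" comments mark where each real file would begin.

// ---- MyString.h: class definition only ----
#ifndef MYSTRING_H
#define MYSTRING_H

#include <cstddef>
#include <string>

class MyString {
public:
    explicit MyString(const std::string& text);
    std::size_t length() const;
    bool empty() const;

private:
    std::string text_;
};

#endif  // MYSTRING_H

// ---- MyString.cxx: the only translation unit that holds the implementation ----
// In a real project this file would begin with: #include "MyString.h"
MyString::MyString(const std::string& text) : text_(text) {}

std::size_t MyString::length() const { return text_.size(); }

bool MyString::empty() const { return text_.empty(); }

// ---- client.cxx (hypothetical caller): needs only MyString.h ----
#include <cassert>

std::size_t greeting_length() {
    MyString greeting("hello");
    assert(!greeting.empty());
    return greeting.length();
}
```

With this split, a change to the body of MyString::length() touches only MyString.cxx, so only that translation unit needs recompiling; clients are rebuilt only when MyString.h itself changes.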
Overwhelmed by thousands of lines of code?
Having one set of header/source files per class in a directory can seem like overkill. And if the number of classes approaches 100 or 1000, it can even be frightening.
But having worked with sources that follow the "let's put everything together" philosophy, my conclusion is that only the person who wrote the file has any hope of not getting lost in it. Even with an IDE, it is easy to miss things, because when you're working with a 20,000-line source file, you just close your mind to anything not directly related to your problem.
Real-life example: the class hierarchy defined in those thousand-line sources closed itself into a diamond inheritance, and some methods were overridden in child classes by methods with exactly the same code. This was easily overlooked (who wants to explore/check 20,000 lines of source code?), and when the original method was changed (a bug fix), the effect was not as universal as expected.
Dependencies becoming circular?
I had this problem with templated code, but I saw similar problems with regular C++ and C code.
Breaking down your sources into 1 header per struct/class lets you:
In source-controlled code, class dependencies could lead to classes regularly being moved up and down the file, just to make the header compile. You don't want to study the history of such moves when comparing different versions of the same file.
Having separate headers makes the code more modular, faster to compile, and easier to follow through diffs between versions.
For my template program, I had to divide my headers into two files: the .HPP file containing the template class declaration/definition, and the .INL file containing the definitions of the said class methods.
Putting all this code inside a single header would mean putting the class definitions at the beginning of the file and the method definitions at the end.
And then, if someone needed only a small part of the code, with the one-header-only solution they would still have to pay for the slower compilation.
(§) Note that you can have circular dependencies between classes if you know which class owns which. This is a discussion about classes having knowledge of the existence of other classes, not about the shared_ptr circular-dependency antipattern.
One last word: headers should be self-sufficient
There is one thing, though, that a solution of multiple headers and multiple sources must respect.
When you include one header, no matter which header, your source must compile cleanly.
Each header should be self-sufficient. You're supposed to develop code, not go treasure hunting by grepping your 10,000+ source file project to find which header defines the symbol used in the 1,000-line header you need to include just because of one enum.
This means that each header either defines or forward-declares all the symbols it uses, or includes all the needed headers (and only the needed headers).
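To make the self-sufficiency rule concrete, here is a small single-file sketch. All names (Color, Pen, Canvas and their members) are hypothetical, and the "----" comments mark where each real file would begin. Pen.hpp includes what it stores by value and forward-declares what it only refers to.

```cpp
// Single-file sketch; the "----" comments mark where each real file would begin.

// ---- Color.hpp: one small header for one enum ----
#ifndef COLOR_HPP
#define COLOR_HPP
enum Color { Red, Green, Blue };
#endif  // COLOR_HPP

// ---- Pen.hpp: compiles cleanly no matter what was included before it ----
#ifndef PEN_HPP
#define PEN_HPP

#include <string>  // Pen stores a std::string by value: the full type is needed
// The real file would also have: #include "Color.hpp" (Color is stored by value)

class Canvas;  // only used through a reference: a forward declaration is enough

class Pen {
public:
    Pen(const std::string& name, Color color);
    const std::string& name() const;
    Color color() const;
    void drawOn(Canvas& canvas) const;  // defined in Pen.cpp, where Canvas is complete

private:
    std::string name_;
    Color color_;
};

#endif  // PEN_HPP

// ---- Canvas.hpp ----
#ifndef CANVAS_HPP
#define CANVAS_HPP
class Canvas {
public:
    Canvas() : strokes_(0) {}
    void addStroke() { ++strokes_; }
    int strokes() const { return strokes_; }

private:
    int strokes_;
};
#endif  // CANVAS_HPP

// ---- Pen.cpp: the real file would include Pen.hpp and Canvas.hpp ----
Pen::Pen(const std::string& name, Color color) : name_(name), color_(color) {}
const std::string& Pen::name() const { return name_; }
Color Pen::color() const { return color_; }
void Pen::drawOn(Canvas& canvas) const { canvas.addStroke(); }

// ---- client.cpp (hypothetical caller) ----
#include <cassert>

int strokes_after_one_draw() {
    Canvas canvas;
    Pen pen("fine", Blue);
    pen.drawOn(canvas);
    assert(pen.color() == Blue);
    return canvas.strokes();
}
```

A client that only passes a Pen around can include Pen.hpp alone and never pays for parsing Canvas.hpp; only code that actually uses a Canvas includes it.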
Question about circular dependencies
underscore-d asks:
Let's say you have 2 class templates, A and B.
Let's say the definition of class A (resp. B) has a pointer to B (resp. A). Let's also say the methods of class A (resp. B) actually call methods from B (resp. A).
You have a circular dependency both in the definition of the classes, and the implementations of their methods.
If A and B were normal classes, and A and B's methods were in .CPP files, there would be no problem: you would use a forward declaration, have a header for each class definition, and then each CPP would include both HPPs.
But as you have templates, you actually have to reproduce the pattern above, but with headers only.
This means:
A.def.hpp and B.def.hpp
A.inl.hpp and B.inl.hpp
A.hpp and B.hpp
Each header will have the following traits:
In A.def.hpp (resp. B.def.hpp), you have a forward declaration of class B (resp. A), which will enable you to declare a pointer/reference to that class.
A.inl.hpp (resp. B.inl.hpp) will include both A.def.hpp and B.def.hpp, which will enable methods from A (resp. B) to use the class B (resp. A).
A.hpp (resp. B.hpp) will directly include both A.def.hpp and A.inl.hpp (resp. B.def.hpp and B.inl.hpp).
The naive user will include A.hpp and/or B.hpp, thus ignoring the whole mess.
And having that organization means the library writer can solve the circular dependencies between A and B while keeping both classes in separate files, easy to navigate once you understand the scheme.
Please note that it was an edge case (two templates knowing each other). I expect most code to not need that trick.
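The def/inl scheme above can be sketched in a single file as follows. This is only an illustration under assumptions: the members (value, setPartner, sumWithPartner) and the client function are invented for the example, and the "----" comments mark where each real header would begin.

```cpp
// Single-file sketch; the "----" comments mark where each real header would begin.

// ---- A.def.hpp: definition of A, with only a forward declaration of B ----
#ifndef A_DEF_HPP
#define A_DEF_HPP

template <typename T> class B;  // enough to declare a pointer to B<T>

template <typename T>
class A {
public:
    explicit A(T value) : value_(value), partner_(0) {}
    void setPartner(B<T>* partner) { partner_ = partner; }
    T value() const { return value_; }
    T sumWithPartner() const;  // calls into B<T>, so it is defined in A.inl.hpp

private:
    T value_;
    B<T>* partner_;
};

#endif  // A_DEF_HPP

// ---- B.def.hpp: definition of B, with only a forward declaration of A ----
#ifndef B_DEF_HPP
#define B_DEF_HPP

template <typename T> class A;  // enough to declare a pointer to A<T>

template <typename T>
class B {
public:
    explicit B(T value) : value_(value), partner_(0) {}
    void setPartner(A<T>* partner) { partner_ = partner; }
    T value() const { return value_; }
    T sumWithPartner() const;  // calls into A<T>, so it is defined in B.inl.hpp

private:
    T value_;
    A<T>* partner_;
};

#endif  // B_DEF_HPP

// ---- A.inl.hpp: would include A.def.hpp and B.def.hpp ----
template <typename T>
T A<T>::sumWithPartner() const {
    return partner_ ? value_ + partner_->value() : value_;
}

// ---- B.inl.hpp: would include A.def.hpp and B.def.hpp ----
template <typename T>
T B<T>::sumWithPartner() const {
    return partner_ ? value_ + partner_->value() : value_;
}

// ---- A.hpp would include A.def.hpp then A.inl.hpp; B.hpp likewise ----

// ---- client.cpp (hypothetical user): would include only A.hpp and B.hpp ----
#include <cassert>

int linked_sum() {
    A<int> a(2);
    B<int> b(40);
    a.setPartner(&b);
    b.setPartner(&a);
    assert(a.sumWithPartner() == b.sumWithPartner());
    return a.sumWithPartner();
}
```

The key point is ordering: both class definitions are complete before either .inl section is seen, so each method body can call into the other template without any header including itself in a cycle.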