What is the best way to switch from a template mess to a clean class architecture (C++)?

Posted on 2024-07-09 08:41:46

Assume a largish template library of around 100 files containing around 100 templates, with more than 200,000 lines of code overall. Some of the templates use multiple inheritance to make the library itself rather simple to use (i.e. you inherit from some base templates and only have to implement certain business rules).

All of that exists (it has grown over several years), "works", and is used in projects.

However, compiling projects that use the library takes more and more time, and it takes quite some time to locate the source of certain bugs. Fixing them often causes unexpected side effects or is quite difficult, because several interdependent templates need changing. Testing is nearly impossible due to the sheer number of functions.

Now, I would really like to simplify the architecture to use fewer templates and more specialized, smaller classes.

Is there any proven way to go about that task? What would be a good place to start?

Comments (8)

沩ん囻菔务 2024-07-16 08:41:46

I'm not sure I see how/why templates are the problem, and why plain non-templated classes would be an improvement. Wouldn't that just mean even more classes, less type safety and so larger potential for bugs?

I can understand simplifying the architecture, refactoring, and removing dependencies between the various classes and templates, but automatically assuming that "fewer templates will make the architecture better" is flawed, IMO.

I'd say that templates potentially allow you to build a much cleaner architecture than you'd get without them, simply because you can make separate classes totally independent. Without templates, a class's functions that call into another class must know about that class, or about an interface it inherits, in advance. With templates, this coupling isn't necessary.

Removing templates would only lead to more dependencies, not fewer.
The added type safety of templates can be used to detect a lot of bugs at compile time (sprinkle your code liberally with static_asserts for this purpose).
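To make the static_assert point concrete, here is a minimal sketch. The Average function is a made-up example, not something from the library in question, and it assumes C++11 or later.

```cpp
#include <type_traits>

// Hypothetical numeric helper: only meaningful for floating-point types.
template <typename T>
T Average(const T* values, int len) {
    // Misuse is reported at compile time with a readable message,
    // instead of surfacing later as a confusing error or a silent bug.
    static_assert(std::is_floating_point<T>::value,
                  "Average<T> requires float, double or long double");
    T sum = 0;
    for (int i = 0; i < len; ++i) sum += values[i];
    return len > 0 ? sum / len : T(0);
}

// Calling Average with an int array would be rejected by the static_assert.
```

The assertion documents the template's requirements right where they matter, which is often cheaper than removing the template.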

Of course, the added compile time may be a valid reason to avoid templates in some cases. And if you only have a bunch of Java programmers who are used to thinking in "traditional" OOP terms, templates might confuse them, which can be another valid reason to avoid templates.

But from an architecture point of view, I think avoiding templates is a step in the wrong direction.

Refactor the application, sure, it sounds like that's needed. But don't throw away one of the most useful tools for producing extensible and robust code just because the original version of the app misused it. Especially if you're already concerned with the amount of code, removing templates will most likely lead to more lines of code.

§普罗旺斯的薰衣草 2024-07-16 08:41:46

You need automated tests. That way, in ten years' time, when your successor has the same problem, he can refactor the code (probably to add more templates, because he thinks it will simplify usage of the library) and know it still meets all the test cases. Similarly, the side effects of any minor bug fix will be immediately visible (assuming your test cases are good).

Other than that, "divide and conquer".

趁年轻赶紧闹 2024-07-16 08:41:46

Well, the problem is that the template way of thinking is very different from the object-oriented, inheritance-based way. It's hard to answer anything other than "redesign the whole thing and start from scratch".

Of course, there may be a simple way for a particular case. We can't tell without knowing more about what you have.

The fact that the template solution is so difficult to maintain is an indication of a poor design anyway.

红颜悴 2024-07-16 08:41:46

Some points (but note: none of these is evil in itself; if you do want to change to non-template code, though, this can help):


Look for your static interfaces. Where do templates depend on which functions exist? Where do they need typedefs?

Put the common parts in an abstract base class. A good example is when you happen to stumble over the CRTP idiom. You can just replace it with an abstract base class having virtual functions.
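A minimal sketch of that CRTP replacement, with made-up names (RuleCrtp, Rule, MinAge) standing in for whatever the library actually contains. The "before" half is shown only for contrast.

```cpp
#include <string>

// Before (CRTP): the base calls into Derived statically.
template <typename Derived>
class RuleCrtp {
public:
    std::string Describe() const {
        return "rule: " + static_cast<const Derived*>(this)->Name();
    }
};

class MinAgeCrtp : public RuleCrtp<MinAgeCrtp> {
public:
    std::string Name() const { return "min-age"; }
};

// After: the same contract expressed as an abstract base class.
class Rule {
public:
    virtual ~Rule() = default;
    std::string Describe() const { return "rule: " + Name(); }

private:
    virtual std::string Name() const = 0;  // implemented by each business rule
};

class MinAge : public Rule {
private:
    std::string Name() const override { return "min-age"; }
};

// Callers no longer have to be templates themselves.
inline std::string Report(const Rule& rule) { return rule.Describe(); }
```

The price is a virtual call per Name() lookup; whether that matters depends on how hot the code path is.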

Look for integer lists. If you find your code uses integral lists like list<1, 3, 3, 1, 3>, you can replace them with std::vector, provided all the code using them can live with runtime values instead of constant expressions.
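As a rough illustration, assuming a C++11 variadic IntList as a stand-in for the list<...> type mentioned above (the real one may look quite different):

```cpp
#include <initializer_list>
#include <vector>

// Before: the values are template arguments, fixed at compile time.
template <int... Values>
struct IntList {
    static int Sum() {
        const std::initializer_list<int> values = {Values...};
        int total = 0;
        for (int v : values) total += v;
        return total;
    }
};

// After: an ordinary function over runtime values. Nothing is instantiated
// per distinct list, but the result is no longer a constant expression.
inline int Sum(const std::vector<int>& values) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}
```

Every distinct IntList<...> is a separate type that the compiler has to instantiate; the std::vector version is compiled once.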

Look for type traits. Typical templated code contains a lot of code that checks whether some typedef or some method exists. Abstract base classes solve both issues by using pure virtual methods and by moving the typedefs into the base class. Often, typedefs are only needed to trigger hideous features like SFINAE, which would then be superfluous too.

Look for expression templates. If your code uses expression templates to avoid creating temporaries, you will have to eliminate them and use the traditional way of returning / passing temporaries to the operators involved.

Look for function objects. If you find your code uses function objects, you can change them to use abstract base classes too, and have something like void run(); to call them (or, if you want to keep using operator(), all the better: it can be virtual too).
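For the function-object case, a sketch with a virtual operator() might look like this. Transform, AddOffset, Scale and ApplyAll are hypothetical names chosen for the example.

```cpp
#include <memory>
#include <vector>

// Abstract replacement for "anything callable with an int".
class Transform {
public:
    virtual ~Transform() = default;
    virtual int operator()(int x) const = 0;
};

class AddOffset : public Transform {
public:
    explicit AddOffset(int offset) : offset_(offset) {}
    int operator()(int x) const override { return x + offset_; }

private:
    int offset_;
};

class Scale : public Transform {
public:
    explicit Scale(int factor) : factor_(factor) {}
    int operator()(int x) const override { return x * factor_; }

private:
    int factor_;
};

// A non-template pipeline: the steps are chosen at run time.
inline int ApplyAll(const std::vector<std::unique_ptr<Transform>>& steps, int x) {
    for (const auto& step : steps) x = (*step)(x);
    return x;
}
```

Call sites keep the familiar (*step)(x) syntax, while ApplyAll itself can move out of the header into a .cpp file.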

旧夏天 2024-07-16 08:41:46

Write unit tests that check that the new code does the same as the old code.

That's one tip at least.

Edit:

If you deprecate old code once you have replaced it with the new functionality, you
can phase over to the new code little by little.
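A minimal sketch of such an "old versus new" test, without any particular test framework. LegacySum and DoubleSummer are invented stand-ins for a legacy template and its non-template replacement.

```cpp
#include <vector>

// Stand-in for a piece of the legacy template code (hypothetical).
template <typename T>
T LegacySum(const T* values, int len) {
    T total = T();
    for (int i = 0; i < len; ++i) total += values[i];
    return total;
}

// Stand-in for the new, non-template replacement (hypothetical).
class DoubleSummer {
public:
    double Sum(const std::vector<double>& values) const {
        double total = 0.0;
        for (double v : values) total += v;
        return total;
    }
};

// Returns true only if the new code agrees with the old code on every input.
inline bool NewMatchesLegacy(const std::vector<std::vector<double>>& cases) {
    DoubleSummer summer;
    for (const auto& c : cases) {
        const double expected = LegacySum(c.data(), static_cast<int>(c.size()));
        if (summer.Sum(c) != expected) return false;
    }
    return true;
}
```

While both implementations exist side by side, the old one serves as the oracle, so the test inputs can be generated in bulk rather than hand-picked.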

别再吹冷风 2024-07-16 08:41:46

As I understand, you are most concerned with build times, and the maintainability of your library?

First, don't try to "fix" everything at once.

Second, understand what you fix. Template complexity is often there for a reason, e.g. to enforce certain use and to make the compiler help you avoid mistakes. That reason might sometimes be taken too far, but throwing out 100 lines because "no one really knows what they do" shouldn't be taken lightly. Everything I suggest here can introduce really nasty bugs; you have been warned.

Third, consider cheaper fixes first: e.g. faster machines or distributed build tools. At the very least, throw in all the RAM the boards will take, and throw out old disks. It does make a difference. One drive for the OS and one drive for the build is a poor man's RAID.

Is the library well documented? That's your best chance at making it. Look into tools such as doxygen that help you create such documentation.

All considered? OK, now some suggestions for the build times ;)


Understand the C++ build model: every .cpp is compiled individually. That means many .cpp files with many headers = huge build. This is NOT advice to put everything into one .cpp file, though! However, one trick (!) that can speed up a build immensely is to create a single .cpp file that includes a bunch of .cpp files, and to feed only that "master" file to the compiler. You can't do that blindly, though: you need to understand the types of errors this could introduce.

If you don't have one yet, get a separate build machine that you can remote into. You'll have to do a lot of almost-full builds to check whether you broke some include. You will want to run these on another machine, so that they don't block you from working on something else. Long term, you'll need it for daily integration builds anyway ;)

Use precompiled headers. (This scales better with fast machines, see above.)

Check your header inclusion policy. While every file should be "independent" (i.e. include everything it needs in order to be included by someone else), don't include liberally. Unfortunately, I haven't yet found a tool to find unnecessary #include statements, but it might help to spend some time removing unused headers in "hotspot" files.

Create and use forward declarations for the templates you use. Often, you can include a header with forward declarations in many places, and use the full header only in a few specific ones. This can greatly help compile time. See the <iosfwd> header for how the standard library does that for I/O streams.
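A sketch of how that split could look for one of your own templates. The file names in the comments and the Matrix / Trace names are hypothetical; everything is shown in one listing so it compiles as is.

```cpp
#include <cstddef>
#include <vector>

// ---- matrix_fwd.h (hypothetical): cheap to include everywhere ----
template <typename T> class Matrix;

// ---- report.h (hypothetical): mentions Matrix only by reference,
//      so the forward declaration is all it needs ----
double Trace(const Matrix<double>& m);

// ---- matrix.h (hypothetical): the expensive full definition ----
template <typename T>
class Matrix {
public:
    explicit Matrix(std::size_t n) : n_(n), data_(n * n, T()) {}
    T& At(std::size_t r, std::size_t c) { return data_[r * n_ + c]; }
    const T& At(std::size_t r, std::size_t c) const { return data_[r * n_ + c]; }
    std::size_t Size() const { return n_; }

private:
    std::size_t n_;
    std::vector<T> data_;
};

// ---- report.cpp (hypothetical): the only place that needs matrix.h ----
double Trace(const Matrix<double>& m) {
    double sum = 0.0;
    for (std::size_t i = 0; i < m.Size(); ++i) sum += m.At(i, i);
    return sum;
}
```

Clients that only pass a Matrix<double> around by reference include the two small headers and never parse the full class template.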

Use overloads for templates that are needed for only a few types: if you have a complex function template that is useful only for a very few types, like this:

// .h
template <typename FLOAT> // float or double only
FLOAT CalcIt(int len, FLOAT * values) { ... }

You can declare the overloads in the header, and move the template to the body:

// .h
float CalcIt(int len, float * values);
double CalcIt(int len, double * values);

// .cpp
template <typename FLOAT> // float or double only
FLOAT CalcItT(int len, FLOAT * values) { ... }

float CalcIt(int len, float * values) { return CalcItT(len, values); }
double CalcIt(int len, double * values) { return CalcItT(len, values); }

This moves the lengthy template to a single compilation unit.
Unfortunately, this is only of limited use for classes.

Check if the PIMPL idiom can move code from the headers into .cpp files.
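For reference, a minimal PIMPL sketch. The Settings class and the file names in the comments are made up; in a real project the two halves would live in separate files.

```cpp
#include <memory>
#include <string>

// ---- settings.h (hypothetical): no implementation headers needed ----
class Settings {
public:
    Settings();
    ~Settings();  // defined in the .cpp, where Impl is a complete type
    Settings(const Settings&) = delete;
    Settings& operator=(const Settings&) = delete;

    void Set(const std::string& key, int value);
    int Get(const std::string& key) const;  // returns 0 for unknown keys

private:
    struct Impl;                  // declared only; layout hidden from clients
    std::unique_ptr<Impl> impl_;
};

// ---- settings.cpp (hypothetical): heavy includes stay out of the header ----
#include <map>

struct Settings::Impl {
    std::map<std::string, int> values;
};

Settings::Settings() : impl_(new Impl) {}
Settings::~Settings() = default;

void Settings::Set(const std::string& key, int value) {
    impl_->values[key] = value;
}

int Settings::Get(const std::string& key) const {
    const auto it = impl_->values.find(key);
    return it == impl_->values.end() ? 0 : it->second;
}
```

Changing Impl (or the headers it needs) now recompiles only settings.cpp, not every file that includes settings.h.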

The general rule hiding behind that is to separate the interface of your library from the implementation. Use comments, detail namespaces and separate .impl.h headers to mentally and physically isolate what should be known to the outside from how it is accomplished. This exposes the real value of your library (does it actually encapsulate complexity?), and gives you a chance to replace "easy targets" first.


More specific advice, and how useful the advice given here is, depends largely on the actual library.

Good luck!

帥小哥 2024-07-16 08:41:46

As mentioned, unit tests are a good idea. Indeed, rather than breaking your code by introducing "simple" changes that are likely to ripple out, just focus on creating a suite of tests and fixing whatever fails them. Make it a regular activity to update the tests when bugs come to light.

Beyond that, I would suggest upgrading your tools, if possible, to help with debugging template-related problems.

单身狗的梦 2024-07-16 08:41:46

I've often come across legacy templates that were huge and required a lot of time and memory to instantiate, but didn't need to be. In those cases, the easiest way to cut out the fat was to take all of the code that didn't rely on any of the template arguments and hide it in separate functions defined in a normal translation unit. This also had the positive side-effect of triggering fewer recompiles when this code had to be slightly modified or documentation changed. It sounds rather obvious, but it's really surprising how often people write a class template and think that EVERYTHING it does has to be defined in the header, rather than just the code that needs the templated information.
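A small sketch of that idea, with hypothetical names (CheckIndex, CheckedBuffer). The bounds check doesn't depend on T, so it can be an ordinary function compiled once.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// ---- header: only the declaration of the T-independent helper ----
void CheckIndex(std::size_t index, std::size_t size);

// ---- header: only the T-dependent part stays in the template ----
template <typename T>
class CheckedBuffer {
public:
    explicit CheckedBuffer(std::size_t n) : data_(n) {}

    T& At(std::size_t i) {
        CheckIndex(i, data_.size());  // shared helper, not re-instantiated per T
        return data_[i];
    }

private:
    std::vector<T> data_;
};

// ---- .cpp file: compiled once, can change without touching the header ----
void CheckIndex(std::size_t index, std::size_t size) {
    if (index >= size) throw std::out_of_range("index out of range");
}
```

In real legacy code the hoisted part is usually far bigger than a one-line check (logging, parsing, error formatting), which is where the savings come from.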

Another thing you might want to consider is how often you can clean up the inheritance hierarchies by making the templates "mixin" style instead of aggregations of multiple inheritance. See how many places you can get away with making one of the template arguments the name of the base class that it should derive from (the way boost::enable_shared_from_this works). Of course, this typically only works well if the constructors take no arguments, as you then don't have to worry about initializing anything correctly.
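One way to read the "mixin" suggestion, sketched with invented names (Account, Audited, Cached): each layer derives from whatever base it is given as a template argument, so the hierarchy becomes a linear chain.

```cpp
#include <string>

// Plain base: a hypothetical business object.
class Account {
public:
    std::string Describe() const { return "account"; }
};

// Mixin layers: each one derives from the base passed in as a template argument.
template <typename Base>
class Audited : public Base {
public:
    std::string Describe() const { return "audited " + Base::Describe(); }
};

template <typename Base>
class Cached : public Base {
public:
    std::string Describe() const { return "cached " + Base::Describe(); }
};

// A linear chain instead of one class inheriting from several bases at once.
using CachedAuditedAccount = Cached<Audited<Account>>;
```

All layers here are default-constructible, which matches the caveat above: once a layer needs constructor arguments, every layer above it has to forward them.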
