C++ internal code reuse: compile everything, or share libraries / dynamic libraries?



General question:

For unmanaged C++, what's better for internal code sharing?

  1. Reuse code by sharing the actual source code? OR
  2. Reuse code by sharing the library / dynamic library (+ all the header files)

Whichever it is: what's your strategy for reducing duplicate code (copy-paste syndrome), code bloat?


Specific example:

Here's how we share the code in my organization:

We reuse code by sharing the actual source code.

We develop on Windows using VS2008, though our project actually needs to be cross-platform. We have many projects (.vcproj) committed to the repository; some have their own repositories, while others share one. For each deliverable solution (.sln) (e.g. something that we deliver to the customer), we svn:externals all the necessary projects (.vcproj) from the repository to assemble the "final" product.

This works fine, but I'm quite worried that the code size for each solution could eventually get quite large (right now our total code size is about 75K SLOC).

Also, one thing to note is that we prevent all transitive dependencies. That is, a project (.vcproj) that is not an actual solution (.sln) is not allowed to svn:externals any other project, even if it depends on it. This is because two projects (.vcproj) might depend on the same library (e.g. Boost) or project (.vcproj), so when you svn:externals both projects into a single solution, the shared dependency would be pulled in twice. So we carefully document all dependencies for each project, and it's up to the person who creates the solution (.sln) to ensure that all dependencies (including transitive ones) are svn:externals'd as part of the solution.
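To make the scheme concrete, the flattening described above might look like this as an svn:externals property on the solution directory (all repository URLs and paths here are hypothetical, purely for illustration):

```
# svn:externals property on the solution (.sln) directory.
# Every dependency is listed explicitly, including transitive ones:
# 'app' depends on 'network', but 'app' itself carries no externals,
# so the solution author must remember to list 'network' here too.
libs/common    https://svn.example.com/repos/common/trunk
libs/network   https://svn.example.com/repos/network/trunk
libs/boost     https://svn.example.com/repos/thirdparty/boost/tags/1.39
app            https://svn.example.com/repos/app/trunk
```

Because no project carries its own externals, each dependency is checked out exactly once per solution, at the cost of the manual bookkeeping the question describes.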

If we instead reused code via .lib / .dll files, this would obviously reduce the code size for each solution, as well as eliminate the transitive-dependency bookkeeping mentioned above where applicable (exceptions being, for example, third-party libraries/frameworks that ship as DLLs, like Intel TBB and Qt in its default configuration).


Addendum: (read if you wish)

Another motivation to share source code might be summed up best by Dr. GUI:

On top of that, what C++ makes easy is not creation of reusable binary components; rather, C++ makes it relatively easy to reuse source code. Note that most major C++ libraries are shipped in source form, not compiled form. It's all too often necessary to look at that source in order to inherit correctly from an object—and it's all too easy (and often necessary) to rely on implementation details of the original library when you reuse it. As if that isn't bad enough, it's often tempting (or necessary) to modify the original source and do a private build of the library. (How many private builds of MFC are there? The world will never know . . .)

Maybe this is why, when you look at libraries like the Intel Math Kernel Library, their "lib" folder contains "vc7", "vc8", and "vc9" subfolders, one per Visual Studio version. Scary stuff.

Or how about this assertion:

C++ is notoriously non-accommodating when it comes to plugins. C++ is extremely platform-specific and compiler-specific. The C++ standard doesn't specify an Application Binary Interface (ABI), which means that C++ libraries from different compilers or even different versions of the same compiler are incompatible. Add to that the fact that C++ has no concept of dynamic loading and each platform provides its own solution (incompatible with the others) and you get the picture.

What are your thoughts on the above assertion? Does something like Java or .NET face these kinds of problems? E.g., if I produce a JAR file from NetBeans, will it work if I import it into IntelliJ, as long as I ensure that both use compatible JREs/JDKs?


动听の歌 2024-08-21 05:36:10

People seem to think that C specifies an ABI. It doesn't, and I'm not aware of any standardised compiled language that does. To answer your main question, use of libraries is of course the way to go - I can't imagine doing anything else.

年少掌心 2024-08-21 05:36:10

One good reason to share the source code: Templates are one of C++'s best features because they are an elegant way around the rigidity of static typing, but by their nature are a source-level construct. If you focus on binary-level interfaces instead of source-level interfaces, your use of templates will be limited.

地狱即天堂 2024-08-21 05:36:10

We do the same. Trying to use binaries can be a real problem if you need to use shared code on different platforms, build environments, or even if you need different build options such as static vs. dynamic linking to the C runtime, different structure packing settings, etc..

I typically set projects up to build as much from source on-demand as possible, even with third-party code such as zlib and libpng. For those things that must be built separately, e.g. Boost, I typically have to build 4 or 8 different sets of binaries for the various combinations of settings needed (debug/release, VS7.1/VS9, static/dynamic), and manage the binaries along with the debugging information files in source control.

Of course, if everyone sharing your code is using the same tools on the same platform with the same options, then it's a different story.

离笑几人歌 2024-08-21 05:36:10

I never saw shared libraries as a way to reuse code from an old project into a new one. I always thought it was more about sharing a library between different applications that you're developing at about the same time, to minimize bloat.

As far as copy-paste syndrome goes, if I copy and paste it in more than a couple places, it needs to be its own function. That's independent of whether the library is shared or not.

When we reuse code from an old project, we always bring it in as source. There's always something that needs tweaking, and it's usually safer to tweak a project-specific version than to tweak a shared version that can wind up breaking the previous project. Going back and fixing the previous project is out of the question because 1) it worked (and shipped) already, 2) it's no longer funded, and 3) the test hardware needed may no longer be available.

For example, we had a communication library that had an API for sending a "message", a block of data with a message ID, over a socket, pipe, whatever:

void Foo::Send(unsigned messageID, const void* buffer, size_t bufSize);

But in a later project, we needed an optimization: the message needed to consist of several blocks of data in different parts of memory concatenated together, and we couldn't (and didn't want to, anyway) do the pointer math to create the data in its "assembled" form in the first place, and the process of copying the parts together into a unified buffer was taking too long. So we added a new API:

void Foo::SendMultiple(unsigned messageID, const void** buffer, size_t* bufSize);

Which would assemble the buffers into a message and send it. (The base class's method allocated a temporary buffer, copied the parts together, and called Foo::Send(); subclasses could use this as a default or override it with their own, e.g. the class that sent the message on a socket would just call send() for each buffer, eliminating a lot of copies.)

Now, by doing this, we have the option of backporting (copying, really) the changes to the older version, but we're not required to backport. This gives the managers flexibility, based on the time and funding constraints they have.

EDIT: After reading Neil's comment, I thought of something that we do that I need to clarify.

In our code, we do lots of "libraries". LOTS of them. One big program I wrote had something like 50 of them. Because, for us and with our build setup, they're easy.

We use a tool that auto-generates makefiles on the fly, taking care of dependencies and almost everything. If there's anything strange that needs to be done, we write a file with the exceptions, usually just a few lines.

It works like this: The tool finds everything in the directory that looks like a source file, generates dependencies if the file changed, and spits out the needed rules. Then it makes a rule to take everything and ar/ranlib it into a libxxx.a file, named after the directory. All the objects and the library are put in a subdirectory that is named after the target platform (this makes cross-compilation easy to support). This process is then repeated for every subdirectory (except the object file subdirs). Then the top-level directory gets linked with all the subdirs' libraries into the executable, and a symlink is created, again named after the top-level directory.

So directories are libraries. To use a library in a program, make a symbolic link to it. Painless. Ergo, everything's partitioned into libraries from the outset. If you want a shared lib, you put a ".so" suffix on the directory name.

To pull in a library from another project, I just use a Subversion external to fetch the needed directories. The symlinks are relative, so as long as I don't leave something behind it still works. When we ship, we lock the external reference to a specific revision of the parent.

If we need to add functionality to a library, we can do one of several things. We can revise the parent (if it's still an active project and thus testable), tell Subversion to use the newer revision and fix any bugs that pop up. Or we can just clone the code, replacing the external link, if messing with the parent is too risky. Either way, it still looks like a "library" to us, but I'm not sure that it matches the spirit of a library.

We're in the process of moving to Mercurial, which has no "externals" mechanism so we have to either clone the libraries in the first place, use rsync to keep the code synced between the different repositories, or force a common directory structure so you can have hg pull from multiple parents. The last option seems to be working pretty well.
