Avoiding duplicated SWIG boilerplate when using many SWIG-generated modules

Posted on 2024-12-16 20:26:14

When generating an interface module with SWIG, the generated C/C++ file contains a ton of static boilerplate functions. So if one wants to modularize the use of SWIG-generated interfaces by using many separately compiled small interfaces in the same application, there ends up being a lot of bloat due to these duplicate functions.

Using gcc's -ffunction-sections option and the GNU linker's --icf=safe option (passed to the compiler as -Wl,--icf=safe), one can remove some of the duplication, but by no means all of it (I think it won't coalesce anything that has a relocation in it, which many of these functions do).
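As a rough sketch, the two options could be applied to the small example from this question as follows. The function name build_with_icf and the SWIG, CXX and LUA_INC variables are illustrative, and -fuse-ld=gold is an assumption about which linker is installed (--icf is implemented by gold and lld rather than by the classic GNU ld). The link line mirrors the one in the question.

```shell
# Tool names are overridable (e.g. CXX=clang++); defaults match the question.
: "${SWIG:=swig}" "${CXX:=g++}" "${LUA_INC:=/usr/include/lua5.1}"

build_with_icf() {
  for m in oink barf; do
    "$SWIG" -lua -c++ "swt-$m.swg" || return 1
    # One section per function, so the linker is able to fold identical ones.
    "$CXX" -c -ffunction-sections -I"$LUA_INC" "swt-${m}_wrap.cxx" || return 1
  done
  "$CXX" -c -I"$LUA_INC" swt-main.cc || return 1
  # Assumption: gold is available; adjust -fuse-ld for your toolchain.
  "$CXX" -fuse-ld=gold -Wl,--icf=safe -o swt \
    swt-main.o swt-oink_wrap.o swt-barf_wrap.o
}
```

Calling build_with_icf in the directory holding the three source files runs the whole sequence and stops at the first failing step.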

My question: I'm wondering if there's a way to remove more of this duplicated boilerplate, ideally one that doesn't rely on GNU-specific compiler/linker options.

In particular, is there a SWIG option/flag/something that says "don't include boilerplate in each output file"? There actually is a SWIG option, -external-runtime that tells it to generate a "boilerplate-only" output file, but no apparent way of suppressing the copy included in each normal output file. [I think this sort of thing should be fairly simple to implement in SWIG, so I'm surprised that it doesn't seem to exist... but I can't seem to find anything documented.]
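For reference, the "boilerplate-only" file mentioned above can be produced roughly like this. This is a minimal sketch: the function name and the default header name swigluarun.h are illustrative choices, and nothing here suppresses the copy that still ends up in each normal wrapper file.

```shell
# SWIG is overridable; the default matches the commands in the question.
: "${SWIG:=swig}"

# Write only the Lua runtime support code to the given header file.
# The default name swigluarun.h is an illustrative choice.
gen_external_runtime() {
  "$SWIG" -lua -external-runtime "${1:-swigluarun.h}"
}
```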

Here's a small example:

Given the interface file swt-oink.swg for module swt_oink:

%module swt_oink
%{ extern int oinker (const char *x); %}
extern int oinker (const char *x);

... and a similar interface file swt-barf.swg for module swt_barf:

%module swt_barf
%{ extern int barfer (const char *x); %}
extern int barfer (const char *x);

... and a test main file, swt-main.cc:

extern "C"
{
#include "lua.h"
#include "lualib.h"
#include "lauxlib.h"

extern int luaopen_swt_oink (lua_State *);
extern int luaopen_swt_barf (lua_State *);
}

int main ()
{
  lua_State *L = lua_open();
  luaopen_swt_oink (L);
  luaopen_swt_barf (L);
}

int oinker (const char *) { return 7; }
int barfer (const char *) { return 2; }

and compiling them like:

swig -lua -c++ swt-oink.swg
g++ -c -I/usr/include/lua5.1 swt-oink_wrap.cxx
swig -lua -c++ swt-barf.swg
g++ -c -I/usr/include/lua5.1 swt-barf_wrap.cxx
g++ -c -I/usr/include/lua5.1 swt-main.cc
g++ -o swt swt-main.o swt-oink_wrap.o swt-barf_wrap.o

then each xxx_wrap.o file is about 16KB, of which 95% is boilerplate, and the final executable is roughly the sum of these, about 39KB. If one compiles each interface file with -ffunction-sections and links with -Wl,--icf=safe, the final executable shrinks to 34KB, but there is clearly still a lot of duplication. Running nm on the executable shows tons of functions defined multiple times, and looking at their source, it's clear that a single global definition would be fine for most of them.
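One way to make the nm check repeatable is a small filter like the sketch below. The name dup_symbols is made up for illustration; it reads nm output on standard input and assumes the usual three-column "address type name" layout for defined symbols.

```shell
# Print text-section symbols (types t and T) that occur more than once,
# each preceded by its count. Undefined symbols have only two columns,
# so they never match the type test and are skipped.
dup_symbols() {
  awk '$2 == "t" || $2 == "T" { count[$3]++ }
       END { for (s in count) if (count[s] > 1) print count[s], s }' |
    sort -k2
}

# Usage, assuming the executable from the example was built as ./swt:
#   nm swt | dup_symbols
```

Static functions show up with type t, which is why local symbols are included alongside global ones.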

Comments (2)

蘸点软妹酱 2024-12-23 20:26:15

I'm fairly sure SWIG doesn't have an option for doing this. I'm speculating now, but I think the reason might well be concern about the visibility of this shared code to modules built with different versions of SWIG. Imagine the following scenario:

Two libraries X and Y both provide an interface to their code using SWIG. They both opt to make the "SWIG glue" stuff visible across different translation units in order to reduce code size. This will all be well and good if both X and Y are using the same version of SWIG. What happens though if X uses SWIG 1.1 and Y uses SWIG 1.3? Both modules work fine on their own, but depending on how the platform treats shared objects and how the language itself loads them (RTLD_GLOBAL?) some potentially very bad things would happen from the combination of the two modules being used in the same VM.

I suspect the penalty of the code duplication is pretty low: the cost of switching between the VM and native code is typically quite high, which probably dwarfs the slightly reduced instruction cache hit rate, although it would be interesting to see real benchmarks. On the upside, this is code that users never need to worry about, since it's all auto-generated and always kept together with the interfaces written for the corresponding SWIG version.

比忠 2024-12-23 20:26:15

I might be a bit late, but here is a workaround:

  • In SWIG (<= 1.3) there is a -noruntime command-line option
  • Since SWIG 2.0, -noruntime has been deprecated, so one should now pass -DSWIG_NOINCLUDE to the C preprocessor, not to swig itself

I am not at all sure that this is correct, but it at least works for me. I am going to ask about this on the SWIG mailing list.
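Applied to the question's example, the second bullet would look something like the sketch below. The function name and the CXX and LUA_INC variables are illustrative, and whether the macro still has any effect in a given SWIG version is exactly the part I have not confirmed.

```shell
# Compiler and include path are overridable; defaults match the question.
: "${CXX:=g++}" "${LUA_INC:=/usr/include/lua5.1}"

# Compile one generated wrapper with SWIG_NOINCLUDE defined for the C/C++
# preprocessor (not passed to swig). Argument: module suffix, e.g. oink.
compile_noinclude() {
  "$CXX" -c -DSWIG_NOINCLUDE -I"$LUA_INC" "swt-${1}_wrap.cxx"
}

# Usage: compile_noinclude oink && compile_noinclude barf
```

Comparing the resulting object file sizes with those from the plain build shows whether the define changed anything for your SWIG version.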
