How to catch unintentional function interpositioning?

Posted on 2024-09-01 02:27:55 · 1062 words · 4 views · 0 comments

While reading through my copy of Expert C Programming, I came across the chapter on function interpositioning and how it can lead to serious, hard-to-find bugs if done unintentionally.

The example given in the book is the following:

my_source.c

mktemp() { ... }

main() {
  mktemp();
  getwd();
}

libc

mktemp(){ ... }
getwd(){ ...; mktemp(); ... }

According to the book, what happens in main() is that mktemp() (a standard C library function) is interposed by the implementation in my_source.c. Although having main() call my implementation of mktemp() is intended behavior, having getwd() (another C library function) also call my implementation of mktemp() is not.

Apparently, this example was a real-life bug that existed in SunOS 4.0.3's version of lpr. The book goes on to explain that the fix was to add the keyword static to the definition of mktemp() in my_source.c, although changing the name altogether should have fixed the problem as well.
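To make the fix concrete, here is a minimal sketch of what a static mktemp() in my_source.c might look like. The function body is purely illustrative (the book elides it), and <stdlib.h> is deliberately not included, on the assumption that it is the header declaring the library's mktemp(): that non-static declaration would conflict with a static definition of the same name.

```c
#include <assert.h>
#include <string.h>

/* Private helper that happens to share its name with the library's
 * mktemp(). `static` gives it internal linkage, so the linker never sees
 * this symbol and cannot use it to satisfy references made from inside
 * the C library (for example from getwd()).
 * The body is a placeholder: it replaces trailing 'X' characters with '0'. */
static char *mktemp(char *template)
{
    size_t len;

    assert(template != NULL);
    len = strlen(template);
    while (len > 0 && template[len - 1] == 'X')
        template[--len] = '0';
    return template;
}
```

Calls to mktemp() from within my_source.c still reach this version, while the library keeps using its own.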

This chapter leaves me with some unresolved questions that I hope you guys could answer:

  1. Does GCC have a way to warn about function interposition? We certainly don't ever intend on this happening and I'd like to know about it if it does.
  2. Should our software group adopt the practice of putting the keyword static in front of all functions that we don't want to be exposed?
  3. Can interposition happen with functions introduced by static libraries?

Thanks for the help.

EDIT

I should note that my question is not just aimed at interposing over standard C library functions, but also functions contained in other libraries, perhaps 3rd party, perhaps ones created in-house. Essentially, I want to catch any instance of interpositioning regardless of where the interposed function resides.

Comments (6)

小苏打饼 2024-09-08 02:27:55

This is really a linker issue.

When you compile a bunch of C source files, the compiler creates an object file for each one. Each .o file contains a list of the public functions in that module, plus a list of functions that are called by code in the module but not actually defined there, i.e. functions that the module expects some library to provide.

When you link a bunch of .o files together to make an executable, the linker must resolve all of these missing references. This is the point where interposing can happen. If there are unresolved references to a function called "mktemp" and several libraries provide a public function with that name, which version should it use? There's no easy answer to this, and yes, odd things can happen if the wrong one is chosen.

So yes, it's a good idea in C to "static" everything unless you really do need to use it from other source files. In fact in many other languages this is the default behavior and you have to mark things "public" if you want them accessible from outside.

拒绝两难 2024-09-08 02:27:55

It sounds like what you want is for the tools to detect name conflicts between functions, i.e. you want to prevent your externally accessible functions from accidentally having the same name as, and therefore 'overriding' or hiding, functions in a library.

There was a recent SO question related to this problem: Linking Libraries with Duplicate Class Names using GCC

Using the --whole-archive option on all the libraries you link against may help (but as I mentioned in the answer over there, I really don't know how well this works or how easy it is to convince builds to apply the option to all libraries).

看轻我的陪伴 2024-09-08 02:27:55

Purely formally, the interpositioning you describe is a straightforward violation of the C language definition rules (the ODR, in C++ parlance). Any decent compiler must either detect these situations or provide options for detecting them. It is simply illegal to define more than one function with the same name in C, regardless of where these functions are defined (standard library, other user library, etc.).

I understand that many platforms provide means to customize the [standard] library behavior by defining some standard functions as weak symbols. While this is indeed a useful feature, I believe the compilers must still provide the user with means to enforce the standard diagnostics (on per-function or per-library basis preferably).

So, again, you should not worry about interpositioning if you have no weak symbols in your libraries. If you do (or if you suspect that you do), you have to consult your compiler documentation to find out whether it offers you a means to inspect the weak symbol resolution.

In GCC, for example, you can disable the weak symbol functionality by using -fno-weak, but this basically kills everything related to weak symbols, which is not always desirable.
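For readers unfamiliar with weak symbols, here is a minimal sketch of what a weak definition looks like, assuming GCC or Clang and their `__attribute__((weak))` extension (this is not standard C). The names app_log_level and should_log are hypothetical and do not come from any real library.

```c
#include <assert.h>

/* Hypothetical library default, marked weak. If another object file in
 * the link provides an ordinary (strong) definition of app_log_level(),
 * the linker uses that one instead and reports no duplicate-definition
 * error. With no strong definition present, this one is used. */
__attribute__((weak)) int app_log_level(void)
{
    return 1;
}

/* Library code that calls through the possibly overridden symbol. */
int should_log(int severity)
{
    int level = app_log_level();

    assert(level >= 0);
    return severity >= level;
}
```

This silent replacement is what makes deliberate customization possible, and it is also why an accidental name clash with a weak library function goes unreported.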

泪是无色的血 2024-09-08 02:27:55

If the function does not need to be accessed outside of the C file it lives in then yes, I would recommend making the function static.

One thing you can do to help catch this is to use an editor that has configurable syntax highlighting. I personally use SciTE, and I have configured it to display all standard library function names in red. That way, it's easy to spot if I am re-using a name I shouldn't be using (nothing is enforced by the compiler, though).

坐在坟头思考人生 2024-09-08 02:27:55

It's relatively easy to write a script that runs nm -o on all your .o files and your libraries and checks whether an external name is defined both in your program and in a library. This is just one of the many sane, sensible services that the Unix linker doesn't provide because it's stuck in 1974, looking at one file at a time. (Try putting libraries in the wrong order and see if you get a useful error message!)
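The core of such a check can be sketched in C as follows. This is only an illustration, and it assumes the GNU nm line format, in which `nm -o` prefixes each line with the file name ("file:address TYPE name", or "archive:member:address TYPE name" for libraries), undefined symbols have no address, and an uppercase type letter other than U marks an external definition. The function names are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_NAME 256

/* Parses one line of `nm -o` output. Returns 1 and copies the symbol
 * into `name` if the line describes an external definition, else 0. */
static int external_definition(const char *line, char name[MAX_NAME])
{
    char type;
    const char *rest;

    assert(line != NULL && name != NULL);
    rest = strrchr(line, ':');      /* skip "file:" or "archive:member:" */
    rest = rest ? rest + 1 : line;
    /* Undefined symbols carry no address, so the scan fails for them. */
    if (sscanf(rest, " %*[0-9a-fA-F] %c %255s", &type, name) != 2)
        return 0;
    return isupper((unsigned char)type) && type != 'U';
}

/* Counts external definitions whose name was already defined by an
 * earlier line, printing each one that it finds. */
static size_t count_duplicate_definitions(const char *const lines[], size_t n)
{
    size_t duplicates = 0;

    for (size_t j = 0; j < n; j++) {
        char later[MAX_NAME];

        if (!external_definition(lines[j], later))
            continue;
        for (size_t i = 0; i < j; i++) {
            char earlier[MAX_NAME];

            if (external_definition(lines[i], earlier) &&
                strcmp(earlier, later) == 0) {
                printf("duplicate external definition: %s\n", later);
                duplicates++;
                break;
            }
        }
    }
    return duplicates;
}
```

A real tool would read the lines from nm's output (for instance through a pipe) rather than from an in-memory array, and would probably report which files are involved.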

迷爱 2024-09-08 02:27:55

Interpositioning occurs when the linker is trying to link separate modules.
It cannot occur within a module. If there are duplicate symbols in a module, the linker will report this as an error.

For *nix linkers, unintended interpositioning is a problem, and it is difficult for the linker to guard against it.
For the purposes of this answer, consider the two linking stages:

  1. The linker links translation units into modules (basically
    applications or libraries).
  2. The linker resolves any remaining unfound symbols by searching in other modules.

Consider the scenario described in 'Expert C Programming' and in SiegeX's question.
The linker first tries to build the application module.
It sees that the symbol mktemp() is external and tries to find a function definition for the symbol. The linker finds the definition in the object code of the application module and marks the symbol as found.
At this stage the symbol mktemp() is completely resolved. It is not considered tentative in any way that would allow for the possibility that another module might define the symbol.
In many ways this makes sense, since the linker should first try to resolve external symbols within the module it is currently linking. It searches other modules only for symbols that are still unfound.
Furthermore, since the symbol has been marked as resolved, the linker will use the application's mktemp() in any other case where it needs to resolve this symbol.
Thus the application's version of mktemp() will be used by the library.
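The resolution order described above can be illustrated with a toy model. This is not how a real linker is implemented; it only mimics the "first definition found wins" behaviour, and struct module, resolve and link_order are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A module and the external symbols it defines (NULL-terminated list). */
struct module {
    const char *name;
    const char *defines[4];
};

/* Returns the first module, in link order, that defines `symbol`,
 * or NULL if no module does. */
static const struct module *resolve(const struct module *mods, size_t n,
                                    const char *symbol)
{
    assert(mods != NULL && symbol != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; mods[i].defines[j] != NULL; j++)
            if (strcmp(mods[i].defines[j], symbol) == 0)
                return &mods[i];
    return NULL;
}

/* The application module is searched before the library. */
static const struct module link_order[] = {
    { "my_source.o", { "mktemp", "main", NULL } },
    { "libc",        { "mktemp", "getwd", NULL } },
};
```

In this model, resolve(link_order, 2, "mktemp") yields my_source.o no matter whether the reference comes from main() or from getwd() inside libc, which is exactly the unintended interposition.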

A simple way to guard against the problem is to try to make all external symbols in your application or library unique.
For modules that are only going to be shared on a limited basis, this can fairly easily be done by appending a unique identifier to every external symbol in your module.

For modules that are widely shared, making up unique names is a problem.
