Override a function call in C

Posted on 2024-07-14 10:18:11

I want to override certain function calls to various APIs for the sake of logging the calls, but I also might want to manipulate data before it is sent to the actual function.

For example, say I use a function called getObjectName thousands of times in my source code. I want to temporarily override this function sometimes because I want to change the behaviour of this function to see the different result.

I create a new source file like this:

#include <apiheader.h>    

const char *getObjectName (object *anObject)
{
    if (anObject == NULL)
        return "(null)";
    else
        return "name should be here";
}

I compile all my other source as I normally would, but I link it against this function first before linking with the API's library. This works fine except I can obviously not call the real function inside my overriding function.

Is there an easier way to "override" a function without getting linking/compiling errors/warnings? Ideally I want to be able to override the function by just compiling and linking an extra file or two rather than fiddle around with linking options or altering the actual source code of my program.

凉城凉梦凉人心 2024-07-21 10:18:12

With gcc, under Linux you can use the --wrap linker flag like this:

gcc program.c -Wl,-wrap,getObjectName -o program

and define your function as:

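#include <apiheader.h>

/* Resolved by the linker to the original getObjectName when --wrap is used. */
const char *__real_getObjectName (object *anObject);

const char *__wrap_getObjectName (object *anObject)
{
    if (anObject == NULL)
        return "(null)";
    else
        return __real_getObjectName( anObject ); // call the real function
}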

This will ensure that all calls to getObjectName() are rerouted to your wrapper function (at link time). Unfortunately, this very useful flag is not available in gcc under Mac OS X.

Remember to declare the wrapper function with extern "C" if you're compiling with g++ though.

心奴独伤 2024-07-21 10:18:12

If it's only for your source that you want to capture/modify the calls, the simplest solution is to put together a header file (intercept.h) with:

#ifdef INTERCEPT
    #define getObjectName(x) myGetObjectName(x)
#endif

Then you implement the function as follows (in intercept.c which doesn't include intercept.h):

const char *myGetObjectName (object *anObject) {
    if (anObject == NULL) return "(null)";
    return getObjectName(anObject);
}

Then make sure each source file where you want to intercept the call has the following at the top:

#include "intercept.h"

When you compile with "-DINTERCEPT", all files will call your function rather than the real one, whereas your function will still call the real one.

Compiling without the "-DINTERCEPT" will prevent interception from occurring.
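
The pieces above can be collapsed into a single file to see the mechanism. This is only a sketch: the object struct and the stand-in body of getObjectName are hypothetical placeholders for the real API, and describe() and checkInterception() are invented for illustration. The point it tries to show is ordering: the wrapper is compiled before the macro is defined, so its inner call still reaches the real function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the API's type and function. */
typedef struct object { const char *name; } object;

const char *getObjectName(object *anObject)
{
    return anObject->name;
}

/* Wrapper, written before the macro exists (as intercept.c would be,
   since it does not include intercept.h). */
const char *myGetObjectName(object *anObject)
{
    if (anObject == NULL) return "(null)";
    return getObjectName(anObject);   /* still the real function */
}

/* What intercept.h contributes when INTERCEPT is defined. */
#define getObjectName(x) myGetObjectName(x)

/* Hypothetical caller: this call is rewritten by the preprocessor. */
const char *describe(object *anObject)
{
    return getObjectName(anObject);
}

/* Small self-check for the sketch; returns 0 when both paths behave. */
int checkInterception(void)
{
    object widget = { "widget" };
    assert(strcmp(describe(&widget), "widget") == 0);
    assert(strcmp(describe(NULL), "(null)") == 0);
    return 0;
}
```

Calling checkInterception() from main exercises both the pass-through path and the NULL path. In a real project the wrapper and the callers would live in separate files, as described above.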

It's a bit trickier if you want to intercept all calls (not just those from your source). This can generally be done with dynamic loading and resolution of the real function (with dlopen- and dlsym-type calls), but I don't think it's necessary in your case.

拥抱影子 2024-07-21 10:18:12

You can override a function using the LD_PRELOAD trick (see man ld.so). You compile a shared library containing your function and start the binary (you don't even need to modify the binary!) like this: LD_PRELOAD=mylib.so myprog.

In the body of your function (in shared lib) you write like this:

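#define _GNU_SOURCE   /* needed for RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stdio.h>
#include <apiheader.h>

const char *getObjectName (object *anObject) {
  static const char *(*func)(object *);

  /* look up the next definition of getObjectName after this library */
  if(!func)
    func = (const char *(*)(object *)) dlsym(RTLD_NEXT, "getObjectName");
  printf("Overridden!\n");
  return(func(anObject));    // call original function
}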

You can override any function from a shared library, even from stdlib, without modifying or recompiling the program, so you can use the trick on programs you don't have the source for. Isn't that nice?

木森分化 2024-07-21 10:18:12

If you use GCC, you can make your function weak. Those can be overridden by non-weak functions:

test.c:

#include <stdio.h>

__attribute__((weak)) void test(void) { 
    printf("not overridden!\n"); 
}

int main() {
    test();
}

What does it do?

$ gcc test.c
$ ./a.out
not overridden!

test1.c:

#include <stdio.h>

void test(void) {
    printf("overridden!\n");
}

What does it do?

$ gcc test1.c test.c
$ ./a.out
overridden!

Sadly, that won't work for other compilers. But you can have the weak declarations that contain overridable functions in their own file, placing just an include into the API implementation files if you are compiling using GCC:

weakdecls.h:

__attribute__((weak)) void test(void);
... other weak function declarations ...

functions.c:

/* for GCC, these will become weak definitions */
#ifdef __GNUC__
#include "weakdecls.h"
#endif

void test(void) { 
    ...
}

... other functions ...

The downside is that it does not work without touching the API files at all (they need those three lines and weakdecls.h). But once you have made that change, functions can be overridden easily by writing a global definition in one file and linking that in.
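
The weakdecls.h idea relies on GCC carrying the weak attribute from an earlier declaration over to the later definition. A minimal single-file sketch, assuming GCC or a compatible compiler on an ELF target; api_version and report_version are made-up names used only for illustration:

```c
#include <assert.h>

/* What weakdecls.h would contain: a weak declaration only. */
__attribute__((weak)) int api_version(void);

/* What functions.c would contain: an ordinary-looking definition.
   Because of the declaration above, GCC emits it as a weak symbol. */
int api_version(void)
{
    return 1;
}

/* Hypothetical caller. With no strong definition linked in, the
   weak default above is the one that runs. */
int report_version(void)
{
    int v = api_version();
    assert(v > 0);
    return v;
}
```

Linking a second object file that contains a non-weak api_version should replace the default without a multiple-definition error, in the same way test1.c replaces test() above.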

睫毛溺水了 2024-07-21 10:18:12

You can define a function pointer as a global variable. The callers' syntax would not change. When your program starts, it could check whether some command-line flag or environment variable is set to enable logging, then save the function pointer's original value and replace it with your logging function. You would not need a special "logging enabled" build. Users could enable logging "in the field".

You will need to be able to modify the callers' source code, but not the callee (so this would work when calling third-party libraries).

foo.h:

typedef const char* (*GetObjectNameFuncPtr)(object *anObject);
extern GetObjectNameFuncPtr GetObjectName;

foo.cpp:

const char* GetObjectName_real(object *anObject)
{
    return "object name";
}

const char* GetObjectName_logging(object *anObject)
{
    if (anObject == NULL)
        return "(null)";
    else
        return GetObjectName_real(anObject);
}

GetObjectNameFuncPtr GetObjectName = GetObjectName_real;

int main()
{
    GetObjectName(NULL); // calls GetObjectName_real();

    if (isLoggingEnabled) // e.g. set from a command-line flag or environment variable
        GetObjectName = GetObjectName_logging;

    GetObjectName(NULL); // calls GetObjectName_logging();
    return 0;
}
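
A hedged sketch of the startup check described above, written in plain C. The environment variable name LOG_OBJECT_NAMES, the object struct, the call counter, and the initObjectNameHook helpers are all assumptions made for illustration; none of them come from a real API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct object { const char *name; } object;   /* hypothetical */

typedef const char *(*GetObjectNameFuncPtr)(object *anObject);

static const char *GetObjectName_real(object *anObject)
{
    return anObject->name;
}

static GetObjectNameFuncPtr savedGetObjectName;  /* original target */
static int loggedCalls;                          /* stands in for a log */

static const char *GetObjectName_logging(object *anObject)
{
    loggedCalls++;
    if (anObject == NULL)
        return "(null)";
    return savedGetObjectName(anObject);
}

/* Callers keep writing GetObjectName(obj). */
GetObjectNameFuncPtr GetObjectName = GetObjectName_real;

/* A non-NULL flag value swaps in the logging version (once). */
void initObjectNameHook(const char *flagValue)
{
    if (flagValue != NULL && GetObjectName != GetObjectName_logging) {
        savedGetObjectName = GetObjectName;
        GetObjectName = GetObjectName_logging;
    }
    assert(GetObjectName != NULL);
}

/* Call once at startup; reads the (assumed) environment variable. */
void initObjectNameHookFromEnv(void)
{
    initObjectNameHook(getenv("LOG_OBJECT_NAMES"));
}

int loggedCallCount(void) { return loggedCalls; }
```

Saving the previous pointer, rather than hard-coding GetObjectName_real inside the logging function, keeps the hook usable if the pointer initially targets a third-party function.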

动听の歌 2024-07-21 10:18:12

Building on @Johannes Schaub's answer with a solution suitable for code you don't own.

Alias the function you want to override to a weakly-defined function, and then reimplement it yourself.

override.h

#define foo(x) __attribute__((weak))foo(x)

foo.c

int foo(void) { return 1234; }

override.c

int foo(void) { return 5678; }

Use pattern-specific variable values in your Makefile to add the compiler flag -include override.h.

%foo.o: ALL_CFLAGS += -include override.h

Aside: Perhaps you could also use -D 'foo(x) __attribute__((weak))foo(x)' to define your macros.

Compile and link the file with your reimplementation (override.c).

This allows you to override a single function from any source file, without having to modify the code.
The downside is that you must use a separate header file for each file you want to override.
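
To see what the macro does to a definition, and one caveat, here is a small single-file sketch. It assumes GCC or a compatible compiler; call_foo is a hypothetical caller added for illustration. Because the macro rewrites every foo(...) it sees, a call to foo in a file compiled with -include override.h would be rewritten too, so the sketch removes the macro before calling.

```c
#include <assert.h>

/* Contents of override.h, normally injected with -include. */
#define foo(x) __attribute__((weak))foo(x)

/* The unmodified library source. After preprocessing this reads
   int __attribute__((weak))foo(void) { ... }, a weak definition. */
int foo(void) { return 1234; }

/* The macro would also rewrite calls, which is not valid C inside an
   expression, so drop it before calling foo in this file. */
#undef foo

/* Hypothetical caller. */
int call_foo(void)
{
    int result = foo();
    /* either the weak default or an override.c replacement */
    assert(result == 1234 || result == 5678);
    return result;
}
```

Built on its own, call_foo() returns the weak default. Linking in a file with a strong foo, as override.c provides, should make the same call return the replacement value.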

冬天的雪花 2024-07-21 10:18:12

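There is also a tricky way to do this in the linker, involving two stub libraries.

Library #1 is linked against the host library and exposes the symbol being redefined under another name.

Library #2 is linked against library #1; it intercepts the call and calls the redefined version in library #1.

Be very careful with the link order here, or it won't work.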

在你怀里撒娇 2024-07-21 10:18:12

Below are my experiments. There are four conclusions: three in the body and one at the end.

Short Version

Generally speaking, to successfully override a function, you have to consider:

weak attribute
translation unit arrangement

Long Version

I have these source files.

.
├── decl.h
├── func3.c
├── main.c
├── Makefile1
├── Makefile2
├── override.c
├── test_target.c
└── weak_decl.h

main.c

#include <stdio.h>

int main (void)
{
    func1();    /* declared via -include decl.h on the command line */
    return 0;
}

test_target.c

#include <stdio.h>

void func3(void);

void func2 (void)
{
    printf("in original func2()\n");
}

void func1 (void)
{
    printf("in original func1()\n");
    func2();
    func3();
}

func3.c

#include <stdio.h>

void func3 (void)
{
    printf("in original func3()\n");
}

decl.h

void func1 (void);
void func2 (void);
void func3 (void);

weak_decl.h

void func1 (void);

__attribute__((weak))
void func2 (void);

__attribute__((weak))
void func3 (void);

override.c

#include <stdio.h>

void func2 (void)
{
    printf("in mock func2()\n");
}

void func3 (void)
{
    printf("in mock func3()\n");
}

Makefile1:

ALL:
    rm -f *.o *.a
    gcc -c override.c -o override.o
    gcc -c func3.c -o func3.o
    gcc -c test_target.c -o test_target_weak.o -include weak_decl.h
    ar cr all_weak.a test_target_weak.o func3.o
    gcc main.c all_weak.a override.o -o main -include decl.h

Makefile2:

ALL:
    rm -f *.o *.a
    gcc -c override.c -o override.o
    gcc -c func3.c -o func3.o
    gcc -c test_target.c -o test_target_strong.o -include decl.h # HERE -include differs!!
    ar cr all_strong.a test_target_strong.o func3.o
    gcc main.c all_strong.a override.o -o main -include decl.h

Output for Makefile1 result:

in original func1()
in mock func2()
in mock func3()

Output for Makefile2:

rm *.o *.a
gcc -c override.c -o override.o
gcc -c func3.c -o func3.o
gcc -c test_target.c -o test_target_strong.o -include decl.h # -include differs!!
ar cr all_strong.a test_target_strong.o func3.o
gcc main.c all_strong.a override.o -o main -include decl.h 
override.o: In function `func2':
override.c:(.text+0x0): multiple definition of `func2'  <===== HERE!!!
all_strong.a(test_target_strong.o):test_target.c:(.text+0x0): first defined here
override.o: In function `func3':
override.c:(.text+0x13): multiple definition of `func3' <===== HERE!!!
all_strong.a(func3.o):func3.c:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
Makefile4:2: recipe for target 'ALL' failed
make: *** [ALL] Error 1

The symbol table:

all_weak.a:

test_target_weak.o:
0000000000000013 T func1  <=== 13 is the offset of func1 in test_target_weak.o, see below disassembly
0000000000000000 W func2  <=== func2 is [W]eak symbol with default value assigned
                 w func3  <=== func3 is [w]eak symbol without default value
                 U _GLOBAL_OFFSET_TABLE_
                 U puts

func3.o:
0000000000000000 T func3 <==== func3 is a strong symbol
                 U _GLOBAL_OFFSET_TABLE_
                 U puts

all_strong.a:

test_target_strong.o:
0000000000000013 T func1
0000000000000000 T func2 <=== func2 is strong symbol
                 U func3 <=== func3 is undefined symbol, there's no address value on the left-most column because func3 is not defined in test_target_strong.c
                 U _GLOBAL_OFFSET_TABLE_
                 U puts

func3.o:
0000000000000000 T func3  <=== func3 is strong symbol
                 U _GLOBAL_OFFSET_TABLE_
                 U puts

In both cases, the override.o symbols:

0000000000000000 T func2  <=== func2 is strong symbol
0000000000000013 T func3  <=== func3 is strong symbol
                 U _GLOBAL_OFFSET_TABLE_
                 U puts

disassembly:

test_target_weak.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <func2>: <===== HERE func2 offset is 0
   0:   55                      push   %rbp
   1:   48 89 e5                mov    %rsp,%rbp
   4:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # b <func2+0xb>
   b:   e8 00 00 00 00          callq  10 <func2+0x10>
  10:   90                      nop
  11:   5d                      pop    %rbp
  12:   c3                      retq   

0000000000000013 <func1>: <====== HERE func1 offset is 13
  13:   55                      push   %rbp
  14:   48 89 e5                mov    %rsp,%rbp
  17:   48 8d 3d 00 00 00 00    lea    0x0(%rip),%rdi        # 1e <func1+0xb>
  1e:   e8 00 00 00 00          callq  23 <func1+0x10>
  23:   e8 00 00 00 00          callq  28 <func1+0x15>
  28:   e8 00 00 00 00          callq  2d <func1+0x1a>
  2d:   90                      nop
  2e:   5d                      pop    %rbp
  2f:   c3                      retq

So the conclusion is:

A function defined in a .o file can override the same function defined in a .a file. In Makefile1 above, func2() and func3() in override.o override their counterparts in all_weak.a. I also tried with both sides as .o files, but that did not work.
With GCC, you don't need to split the functions into separate .o files, as is described elsewhere for the Visual Studio toolchain. The example above shows that both func2() (in the same file as func1()) and func3() (in a separate file) can be overridden.
To override a function, you need to mark that function as weak when compiling its consumer's translation unit. That records the function as weak in the consumer's .o file. In the example above, when compiling test_target.c, which consumes func2() and func3(), you need to add -include weak_decl.h, which declares func2() and func3() as weak. func2() is also defined in test_target.c, but that is OK.

Some further experiment

Still with the above source files. But change the override.c a bit:

override.c

#include <stdio.h>

void func2 (void)
{
    printf("in mock func2()\n");
}

// void func3 (void)
// {
//     printf("in mock func3()\n");
// }

Here I removed the override version of func3(). I did this because I want to fall back to the original func3() implementation in the func3.c.

I still use Makefile1 to build. The build is OK. But a runtime error happens as below:

xxx@xxx-host:~/source/override$ ./main
in original func1()
in mock func2()
Segmentation fault (core dumped)

So I checked the symbols of the final main:

0000000000000696 T func1
00000000000006b3 T func2
                 w func3

So we can see that func3 has no valid address. That's why the segmentation fault happens.

So why? Didn't I add the func3.o into the all_weak.a archive file?

ar cr all_weak.a func3.o test_target_weak.o

I tried the same thing with func2, removing the func2 implementation from override.c. This time there was no segmentation fault.

override.c

#include <stdio.h>

// void func2 (void)
// {
//     printf("in mock func2()\n");
// }

void func3 (void)
{
    printf("in mock func3()\n");
}

Output:

xxx@xxx-host:~/source/override$ ./main
in original func1()
in original func2()  <====== the original func2() is invoked as a fall back
in mock func3()

My guess is that this is because func2 is defined in the same file/translation unit as func1, so func2 is always brought in together with func1. The linker can therefore always resolve func2, be it from test_target.c or override.c.

func3, however, is defined in a separate file/translation unit (func3.c). If it is declared weak, the consumer test_target.o still records func3() as weak. Unfortunately, the GCC linker will not look through the other .o files in the same .a file for an implementation of func3(), even though one is there.

all_weak.a:

func3.o:
0000000000000000 T func3 <========= func3 is indeed here!
                 U _GLOBAL_OFFSET_TABLE_
                 U puts

test_target_weak.o:
0000000000000013 T func1
0000000000000000 W func2
                 w func3
                 U _GLOBAL_OFFSET_TABLE_
                 U puts

So I must provide an override version in override.c; otherwise func3() cannot be resolved.

But I still don't know why GCC behaves like this. If someone can explain, please do.

(Update 9:01 AM 8/8/2021:
this thread may explain this behavior, hopefully.)

So a further conclusion is:

If you declare some symbols as weak, you had better provide override versions of all the weak functions. Otherwise, the original version cannot be resolved unless it lives in the same file/translation unit as the caller/consumer.
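
One way to avoid the crash when a weak function ends up unresolved is to test its address before calling it. A minimal sketch, assuming GCC on an ELF platform such as Linux; optional_func3, call_func3_if_present, and require_func3 are invented names, and nothing in the file defines optional_func3 on purpose:

```c
#include <assert.h>
#include <stddef.h>

/* Weak declaration with no definition in this file: if nothing else
   provides one, the link still succeeds and the address is null. */
__attribute__((weak)) void optional_func3(void);

/* Returns 1 if the function was linked in and called, 0 otherwise. */
int call_func3_if_present(void)
{
    if (!optional_func3)
        return 0;            /* unresolved: calling it would crash */
    optional_func3();
    return 1;
}

/* Fail-fast variant: stops with an assertion message rather than
   jumping to address 0. */
void require_func3(void)
{
    assert(optional_func3 != NULL);
    optional_func3();
}
```

Built without any definition of optional_func3, call_func3_if_present() takes the early-return branch. This does not restore the original implementation from the archive; it only turns the segmentation fault into a condition the program can detect.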
