How do I get a call stack backtrace? (deeply embedded, no library support)

Posted on 2024-09-12 20:25:23

I want my exception handlers and debug functions to be able to print call stack backtraces, basically just like the backtrace() library function in glibc. Unfortunately, my C library (Newlib) doesn't provide such a call.

I've got something like this:

#include <stdio.h>
#include <unwind.h> // GCC's internal unwinder, part of libgcc

_Unwind_Reason_Code trace_fcn(struct _Unwind_Context *ctx, void *d)
{
    int *depth = (int*)d;
    // _Unwind_GetIP() may return a type wider than int, so cast it to match the format
    printf("\t#%d: program counter at %08lx\n", *depth, (unsigned long)_Unwind_GetIP(ctx));
    (*depth)++;
    return _URC_NO_REASON;
}

void print_backtrace_here(void)
{
    int depth = 0;
    _Unwind_Backtrace(&trace_fcn, &depth);
}

which basically works but the resulting traces aren't always complete. For example, if I do

int func3() { print_backtrace_here(); return 0; }
int func2() { return func3(); }
int func1() { return func2(); }
int main()  { return func1(); }

the backtrace only shows func3() and main(). (This is obviously a toy example, but I have checked the disassembly and confirmed that these functions are all present in full and have not been optimized out or inlined.)

Update: I tried this backtrace code on the old ARM7 system but with the same (or at least, as equivalent as possible) compiler options and linker script and it prints a correct, full backtrace (i.e. func1 and func2 aren't missing) and indeed it even backtraces up past main into the boot initialization code. So presumably the problem isn't with the linker script or compiler options. (Also, confirmed from disassembly that no frame pointer is used in this ARM7 test either).

The code is compiled with -fomit-frame-pointer, but my platform (bare metal ARM Cortex M3) defines an ABI that does not use a frame pointer anyway. (A previous version of this system used the old APCS ABI on ARM7 with forced stack frames and a frame pointer, and a backtrace like the one here, which worked perfectly.)

The whole system is compiled with -fexceptions, which ensures the necessary metadata that _Unwind uses is included in the ELF file. (_Unwind is designed for exception handling, I think.)
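For handlers where calling printf() during the walk is undesirable, the same unwinder can fill a caller-supplied buffer instead, much like glibc's backtrace(). This is a minimal sketch, not a drop-in replacement: the names capture_backtrace, collect_pc and trace_state are hypothetical, and it returns only as many frames as the unwinder can actually find, so it inherits the incomplete-trace problem described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unwind.h>

/* Hypothetical bookkeeping passed to the trace callback. */
struct trace_state {
    uintptr_t *pcs;   /* output buffer for program counters */
    size_t count;     /* entries written so far */
    size_t max;       /* capacity of the buffer */
};

static _Unwind_Reason_Code collect_pc(struct _Unwind_Context *ctx, void *arg)
{
    struct trace_state *st = arg;

    /* Any result other than _URC_NO_REASON stops the walk. */
    if (st->count >= st->max)
        return _URC_END_OF_STACK;

    st->pcs[st->count++] = (uintptr_t)_Unwind_GetIP(ctx);
    return _URC_NO_REASON;
}

/* Stores up to max program counters in pcs and returns how many were stored. */
size_t capture_backtrace(uintptr_t *pcs, size_t max)
{
    struct trace_state st = { pcs, 0, max };

    assert(pcs != NULL || max == 0);
    _Unwind_Backtrace(collect_pc, &st);
    return st.count;
}
```

The captured addresses can then be printed later, outside the exception handler, or resolved offline against the ELF file.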

So, my question is:
Is there a "standard", accepted way of getting reliable backtraces in embedded systems using GCC?

I don't mind having to mess around with the linker scripts and crt0 code if necessary, but don't want to have to make any changes to the toolchain itself.

Thanks!


月下客 2024-09-19 20:25:23

For this you need -funwind-tables or -fasynchronous-unwind-tables.
On some targets this is required for _Unwind_Backtrace to work properly!


你是暖光i 2024-09-19 20:25:23

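Since ARM platforms do not use a frame pointer, you never quite know how big a stack frame is, and you cannot simply unwind the stack beyond the single return value held in R14.

When investigating a crash for which we have no debug symbols, we simply dump the whole stack and look up the closest symbol for each item that falls within the instruction range. This does produce a lot of false positives, but it is still very useful for investigating crashes.

If you are running pure ELF executables, you can split the debug symbols out of your release executable. gdb can then help you work out what happened from a standard Unix core dump.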
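The stack-dump approach described above could be sketched roughly as follows. Everything here is an assumption for illustration: the struct symbol table (sorted by address, for example generated offline from the linker map), the function names, and the code range passed in by the caller. Any word that merely looks like a code address is reported, which is where the false positives come from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical symbol table entry; the table must be sorted by ascending address. */
struct symbol {
    uint32_t addr;
    const char *name;
};

/* Returns the symbol with the highest address <= addr, or NULL if there is none. */
const struct symbol *nearest_symbol(const struct symbol *table, size_t count,
                                    uint32_t addr)
{
    size_t lo = 0, hi = count;

    assert(table != NULL || count == 0);
    while (lo < hi) {                     /* binary search for the upper bound */
        size_t mid = lo + (hi - lo) / 2;
        if (table[mid].addr <= addr)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo > 0 ? &table[lo - 1] : NULL;
}

/* Copies every dumped stack word that lies in [code_start, code_end) into hits. */
size_t scan_stack(const uint32_t *words, size_t nwords,
                  uint32_t code_start, uint32_t code_end,
                  uint32_t *hits, size_t max_hits)
{
    size_t found = 0;

    for (size_t i = 0; i < nwords && found < max_hits; i++) {
        if (words[i] >= code_start && words[i] < code_end)
            hits[found++] = words[i];
    }
    return found;
}
```

Each hit is then passed to nearest_symbol() to name the function it most likely belongs to.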


浮生面具三千个 2024-09-19 20:25:23

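GCC optimizes the returns (tail-call optimization). In func1() and func2() it does not call func2()/func3(); instead it jumps to func2()/func3(), so func3() can return directly to main().

In your case func1() and func2() do not need to set up a stack frame, but even if they did (e.g. for local variables), GCC could still apply the optimization whenever the function call is the last instruction. It then cleans up the stack before jumping to func3().

Look at the generated assembly code to see it.

Edit/Update:

To verify that this is the cause, do something after the function call that the compiler cannot reorder (e.g. use the return value).
Or just try compiling with -O0.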
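A variant of the toy example along those lines might look like this. It is only a sketch: print_backtrace_here() is replaced by a hypothetical stub so the fragment stands alone, and the noinline attribute is an added assumption to stop the compiler from folding the chain into a single function.

```c
/* Hypothetical stand-in for the real print_backtrace_here(). */
static volatile int trace_calls;

static void print_backtrace_here(void)
{
    trace_calls++;
}

__attribute__((noinline)) int func3(void)
{
    print_backtrace_here();
    return 0;
}

/* The addition happens after the call returns, so the call is not in tail
 * position and the compiler has to emit a real call with a return address. */
__attribute__((noinline)) int func2(void)
{
    return func3() + 1;
}

__attribute__((noinline)) int func1(void)
{
    return func2() + 1;
}
```

If func1 and func2 show up in the backtrace with this variant, the missing frames in the original were most likely caused by the tail-call optimization.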


束缚ｍ 2024-09-19 20:25:23

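Some compilers, such as GCC, optimize function calls in the way you describe in your example. For the code fragment to work, the intermediate return pointers in the call chain do not need to be stored. Returning from func3() straight to main() is perfectly fine, because the intermediate functions do nothing besides calling another function.

This is not the same as code elimination (the intermediate functions could in fact be optimized out completely), and a separate compiler parameter controls this kind of optimization.

If you use GCC, try -fno-optimize-sibling-calls

Another handy GCC option is -mno-sched-prolog, which prevents instruction reordering in the function prologue. This is vital if you want to parse the code byte by byte, as is done here:
http://www.kegel.com/stackcheck/checkstack-pl.txt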
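If changing the flags for the whole build is not practical, GCC's optimize function attribute can apply the same option to individual functions. The sketch below assumes GCC; the NO_SIBCALL macro and the function names are hypothetical, and the GCC manual describes this attribute as intended for debugging rather than production use, which happens to fit this case.

```c
/* Hypothetical helper macro: disable sibling-call optimization per function.
 * Attribute strings that do not start with 'O' are treated as -f options. */
#if defined(__GNUC__) && !defined(__clang__)
#define NO_SIBCALL __attribute__((noinline, optimize("no-optimize-sibling-calls")))
#else
#define NO_SIBCALL /* assumption: other compilers need their own mechanism */
#endif

static volatile int deepest_arg;

NO_SIBCALL int leaf(int x)
{
    deepest_arg = x;
    return x * 2;
}

/* Without NO_SIBCALL, an optimizing build may turn this call into a jump. */
NO_SIBCALL int middle(int x)
{
    return leaf(x + 1);
}

NO_SIBCALL int outer(int x)
{
    return middle(x + 1);
}
```

With the attribute in place, middle() and outer() keep their own return addresses on the stack, so an unwinder has a chance of seeing them.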


夜唯美灬不弃 2024-09-19 20:25:23

This is hacky, but I've found it works well enough considering the amount of code/RAM space required:

Assuming you're using ARM THUMB mode, compile with the following options:

-mtpcs-frame -mtpcs-leaf-frame  -fno-omit-frame-pointer

The following function is used to retrieve the callstack. Refer to the comments for more info:

#include <stdint.h> /* uint32_t */
#include <string.h> /* memset */

/*
 * This should be compiled with:
 *  -mtpcs-frame -mtpcs-leaf-frame  -fno-omit-frame-pointer
 *
 *  With these options, the Stack pointer is automatically pushed to the stack
 *  at the beginning of each function.
 *
 *  This function basically iterates through the current stack finding the following combination of values:
 *  - <Frame Address>
 *  - <Link Address>
 *
 *  This combination will occur for each function in the call stack
 */
static void backtrace(uint32_t *caller_list, const uint32_t *caller_list_end, const uint32_t *stack_pointer)
{
    uint32_t previous_frame_address = (uint32_t)stack_pointer;
    uint32_t stack_entry_counter = 0;

    // be sure to clear the caller_list buffer (the size is given in bytes, not elements)
    memset(caller_list, 0, (size_t)(caller_list_end - caller_list) * sizeof(*caller_list));

    // loop until the buffer is full
    while(caller_list < caller_list_end)
    {
        // Attempt to obtain next stack pointer
        // The link address should come immediately after
        const uint32_t possible_frame_address = *stack_pointer;
        const uint32_t possible_link_address = *(stack_pointer+1);

        // Have we searched past the allowable size of a given stack?
        if(stack_entry_counter > PLATFORM_MAX_STACK_SIZE/4)
        {
            // yes, so just quit
            break;
        }
        // Next check that the frame address (i.e. stack pointer for the function)
        // and link address are within an acceptable range
        else if((possible_frame_address > previous_frame_address) &&
                ((possible_frame_address < previous_frame_address + PLATFORM_MAX_STACK_SIZE)) &&
               ((possible_link_address  & 0x01) != 0) && // in THUMB mode the address will be odd
                (possible_link_address > PLATFORM_CODE_SPACE_START_ADDRESS &&
                 possible_link_address < PLATFORM_CODE_SPACE_END_ADDRESS))
        {
            // We found two acceptable values

            // Store the link address
            *caller_list++ = possible_link_address;

            // Update the book-keeping registers for the next search
            previous_frame_address = possible_frame_address;
            stack_pointer = (uint32_t*)(possible_frame_address + 4);
            stack_entry_counter = 0;
        }
        else
        {
            // Keep iterating through the stack until we find an acceptable combination
            ++stack_pointer;
            ++stack_entry_counter;
        }
    }

}

You'll need to define PLATFORM_MAX_STACK_SIZE, PLATFORM_CODE_SPACE_START_ADDRESS and PLATFORM_CODE_SPACE_END_ADDRESS for your platform.

Then call the following to populate a buffer with the current call stack:

uint32_t callers[8];
uint32_t sp_reg;
__ASM volatile ("mov %0, sp" : "=r" (sp_reg) );
backtrace(callers, &callers[8], (uint32_t*)sp_reg);

Again, this is rather hacky, but I've found it to work quite well.
The buffer will be populated with link addresses of each function call in the call stack.
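To make use of the result, the link addresses can be printed with the Thumb bit cleared so they can be looked up in a map file or with a tool such as addr2line. This is a small sketch under two assumptions: the names link_to_code_address and print_callers are hypothetical, and unused slots are zero, which holds when the whole buffer has been cleared before the search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* In Thumb mode a return address has bit 0 set; clearing it gives the
 * address of the instruction that follows the call. */
static uint32_t link_to_code_address(uint32_t link)
{
    return link & ~(uint32_t)1;
}

/* Prints entries up to the first zero slot and returns how many were printed. */
size_t print_callers(const uint32_t *callers, size_t capacity)
{
    size_t n = 0;

    assert(callers != NULL);
    while (n < capacity && callers[n] != 0) {
        printf("#%u: return address 0x%08lx\n", (unsigned)n,
               (unsigned long)link_to_code_address(callers[n]));
        n++;
    }
    return n;
}
```

With the buffer from the call above, this would be invoked as print_callers(callers, 8).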
