How can I debug code that segfaults unless it is run through gdb?

Posted 2024-09-09 06:09:35

This is single-threaded code.

Specifically, it is the ahocorasick Python extension module (easy_install ahocorasick).

I isolated the problem to a trivial example:

import ahocorasick
t = ahocorasick.KeywordTree()
t.add("a")

When I run it under gdb, everything works, and the same is true when I type these statements into the interactive Python interpreter. However, when I run the script normally, I get a segfault.

To make it even weirder, the line that causes the segfault (identified by core dump analysis) is a plain int increment (see the bottom of the function body).

I'm completely stuck at this point. What can I do?

int
aho_corasick_addstring(aho_corasick_t *in, unsigned char *string, size_t n)
{
    aho_corasick_t* g = in;
    aho_corasick_state_t *state,*s = NULL;
    int j = 0;

    state = g->zerostate;

    // As long as we have transitions follow them
    while( j != n &&
           (s = aho_corasick_goto_get(state,*(string+j))) != FAIL )
    {
        state = s;
        ++j;
    }

    if ( j == n ) {
        /* dyoo: added so that if a keyword ends up in a prefix
           of another, we still mark that as a match.*/
        aho_corasick_output(s) = j;
        return 0;
    }

    while( j != n )
    {
        // Create new state
        if ( (s = xalloc(sizeof(aho_corasick_state_t))) == NULL )
            return -1;
        s->id = g->newstate++;
        debug(printf("allocating state %d\n", s->id)); /* debug */ 
        s->depth = state->depth + 1;

        /* FIXME: check the error return value of
           aho_corasick_goto_initialize. */
        aho_corasick_goto_initialize(s);

        // Create transition
        aho_corasick_goto_set(state,*(string+j), s);
        debug(printf("%u -> %c -> %u\n",state->id,*(string+j),s->id));
        state = s;
        aho_corasick_output(s) = 0;
        aho_corasick_fail(s) = NULL;
        ++j;                                 // <--- HERE!
    }

    aho_corasick_output(s) = n;

    return 0;
}
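
One way to get more information when the crash only happens outside gdb is Python's faulthandler module, which prints the Python-level traceback when the process receives SIGSEGV. This is a minimal sketch, assuming Python 3.3 or later (the interpreter in the question may well be older). It uses ctypes.string_at(0) as a hypothetical stand-in for the crashing t.add("a") call, so it does not need ahocorasick installed.

```python
import subprocess
import sys

# Hypothetical stand-in for the real repro: reading address 0 through ctypes
# crashes the interpreter with a segmentation fault.
CRASHING_SNIPPET = "import ctypes; ctypes.string_at(0)"


def run_with_faulthandler(code):
    """Run `code` in a child interpreter with faulthandler enabled via -X."""
    return subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )


result = run_with_faulthandler(CRASHING_SNIPPET)
print("exit status:", result.returncode)
print(result.stderr)  # faulthandler writes the traceback of the crash to stderr
```

For a real script, the equivalent would be `python -X faulthandler script.py` (script.py being a placeholder name). This shows where in the Python code the crash happened, not which C line is at fault, so it complements the core dump rather than replacing it.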

Comments (2)

风流物 2024-09-16 06:09:35

There are other tools you can use that will find faults that do not necessarily crash the program.
Valgrind, Electric Fence, Purify, Coverity, and lint-like tools may be able to help you.

You might need to build your own Python in some cases for this to be usable. Also, for memory corruption issues, there is (or was; I haven't built extensions in a while) a way to make Python use direct memory allocation instead of its own allocator.
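
As a sketch of the direct-allocation point: on Python 3.6 and later, setting the PYTHONMALLOC=malloc environment variable makes the interpreter use the C library's malloc instead of pymalloc, so a tool such as Valgrind can track each allocation (older versions needed a rebuild, for example with --without-pymalloc). The helper name run_with_system_malloc and its wrapper parameter are made up for illustration.

```python
import os
import subprocess
import sys


def run_with_system_malloc(python_args, wrapper=()):
    """Run a child Python with pymalloc disabled, optionally under a wrapper tool."""
    # PYTHONMALLOC=malloc is honored by Python 3.6+; older versions ignore it.
    env = dict(os.environ, PYTHONMALLOC="malloc")
    cmd = [*wrapper, sys.executable, *python_args]
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Sanity check: the child interpreter sees the allocator setting.
check = run_with_system_malloc(
    ["-c", "import os; print(os.environ['PYTHONMALLOC'])"]
)
print(check.stdout.strip())
```

Assuming Valgrind is installed and repro.py is the three-line script from the question, a call might look like `run_with_system_malloc(["repro.py"], wrapper=["valgrind", "--error-exitcode=1"])`.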

网白 2024-09-16 06:09:35

Have you tried translating that while loop to a for loop? Maybe there's some subtle misunderstanding with the ++j that will disappear if you use something more intuitive.
