An interesting strcmp implementation failure (C)

Posted on 2024-12-10 09:28:52

I am working on a small project in which I have no access to any C standard library. (I am building a microkernel for the ARM architecture from scratch; even printf had to be implemented.)

Under these circumstances I implemented strcmp, using Duff's machine methodology.

The following is the entire code:

int
strcmp ( const char *str1, const char *str2 )
{
   while ( *str1 || *str2 )
       if ( *(str1++) != *(str2++) ) return *str1 - *str2;
   return 0;
}

It made sense, and for a while it seemed to work on test cases, until a failure occurred in the end system. I traced it down and it led to this strcmp.

At first I thought that it incremented str1 first and then compared against str2 BEFORE str2 was incremented. 1. That turned out not to be the case, but could anyone confirm whether it can happen in some cases?

I then figured that the problem was in *str1 - *str2, so I changed it to return 1; that is, the resulting code is as follows:

   while ( *str1 || *str2 )
       if ( *(str1++) != *(str2++) ) return 1;
   return 0;

All I wanted was an 'equals' check, so changing it to return 1 caused no problem, but I still wonder why the original code failed. 2. Could someone shed some light on how it could have failed? I would rather have strcmp follow the standard C interface, in which it returns a non-zero value that tells more about str1 and str2.

The test cases were:

code_t // a function pointer type
program_find ( char *program )
{
if (strcmp( program, "exit" ) == 0) return ....
else if (strcmp( program, "k1" ) == 0) return ....
else if (strcmp( program, "k3" ) == 0) return ....
else if (strcmp( program, "perf" ) == 0) return ....
else if (strcmp( program, "test_libc" ) == 0) return ....
}

When program was "k3" it returned the entry for "k1", and "test_libc" returned the entry for "perf".

The original problem was solved by using "return 1", so this question is purely out of interest in C.
A suggestion or link to strcmp documentation is also welcome. I have seen the IEEE specification of the interface.

Comments (4)

孤者何惧 2024-12-17 09:28:52

You are using post-increment on str1 and str2 when doing the comparison. As a result, both pointers have already been incremented by the time the subtraction is done, so you are subtracting the wrong two characters.

A better implementation would be

int
strcmp ( const char *str1, const char *str2 )
{
   while ( *str1 || *str2 ) {
       if ( *str1 != *str2 ) return *str1 - *str2;
       ++str1;
       ++str2;
   }
   return 0;
}
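
To see why the version from the question treats "k3" as equal to "k1", it can help to trace which characters the subtraction actually reads. The following is only a sketch: the names strcmp_question, strcmp_compare_then_advance and check_examples are made up for illustration, and assert.h is assumed to be available for the demonstration even though the asker's environment has no C library. The inputs are chosen so that the first version never reads past a terminator, which it can do when the mismatch falls on the final '\0' of one string.

```c
#include <assert.h>

/* The version from the question. The comparison advances both pointers,
   so the subtraction reads the characters AFTER the mismatch. */
static int strcmp_question(const char *str1, const char *str2)
{
    while (*str1 || *str2)
        if (*(str1++) != *(str2++)) return *str1 - *str2;
    return 0;
}

/* Compare first, advance afterwards: the subtraction sees the same
   characters that the comparison saw. */
static int strcmp_compare_then_advance(const char *str1, const char *str2)
{
    while (*str1 || *str2) {
        if (*str1 != *str2) return *str1 - *str2;
        ++str1;
        ++str2;
    }
    return 0;
}

static void check_examples(void)
{
    /* "k3" vs "k1": mismatch at '3'/'1', then '\0' - '\0' is returned. */
    assert(strcmp_question("k3", "k1") == 0);
    /* "test_libc" vs "perf": mismatch at 't'/'p', then 'e' - 'e' is returned. */
    assert(strcmp_question("test_libc", "perf") == 0);
    /* "k3" vs "exit": mismatch at 'k'/'e', then '3' - 'x', which is non-zero. */
    assert(strcmp_question("k3", "exit") != 0);

    assert(strcmp_compare_then_advance("k3", "k1") == '3' - '1');
    assert(strcmp_compare_then_advance("test_libc", "perf") == 't' - 'p');
    assert(strcmp_compare_then_advance("k3", "k3") == 0);
}
```

Calling check_examples() from a test main reproduces both reported symptoms: in each failing pair the characters that follow the first mismatch happen to be equal, so the difference is 0 and the strings look identical.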

北斗星光 2024-12-17 09:28:52

You have two problems:

  • You increment the pointers before performing the subtraction for the return value, so the return value is not correct;
  • The standard's specification of strcmp() states that the elements of the strings are compared as unsigned char.

Fixing these problems:

int
strcmp ( const char *str1, const char *str2 )
{
    const unsigned char *s1 = (const unsigned char *)str1;
    const unsigned char *s2 = (const unsigned char *)str2;

    while (*s1 && *s1 == *s2) {
        s1++;
        s2++;
    }

    return *s1 - *s2;
}
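
The unsigned char point only shows up for bytes with the high bit set, so a small comparison may make it concrete. This is a sketch under assumptions: compare_as_signed, compare_as_unsigned and check_high_byte are hypothetical names, explicit signed char casts stand in for a target where plain char is signed (its signedness is implementation-defined), and assert.h is assumed to be available for the demonstration.

```c
#include <assert.h>

/* Models a target where plain char is signed: bytes >= 0x80 become negative. */
static int compare_as_signed(const char *a, const char *b)
{
    const signed char *s1 = (const signed char *)a;
    const signed char *s2 = (const signed char *)b;

    while (*s1 && *s1 == *s2) {
        s1++;
        s2++;
    }
    return *s1 - *s2;
}

/* Compares the bytes as unsigned char, as the strcmp() specification requires. */
static int compare_as_unsigned(const char *a, const char *b)
{
    const unsigned char *s1 = (const unsigned char *)a;
    const unsigned char *s2 = (const unsigned char *)b;

    while (*s1 && *s1 == *s2) {
        s1++;
        s2++;
    }
    return *s1 - *s2;
}

static void check_high_byte(void)
{
    /* 0xE9 is 233 as unsigned char, so it must sort after 'z' (122). */
    assert(compare_as_unsigned("\xE9", "z") > 0);
    /* Read as signed char, the same byte is negative and sorts before 'z'. */
    assert(compare_as_signed("\xE9", "z") < 0);
    /* For plain ASCII the two versions agree. */
    assert(compare_as_signed("k3", "k1") == compare_as_unsigned("k3", "k1"));
}
```

For the ASCII-only program names in the question the two versions behave the same; the difference matters only once strings can contain bytes of 0x80 and above.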

み青杉依旧 2024-12-17 09:28:52

Evaluating the expression:

*(str1++) != *(str2++)

will dereference the pointers str1 and str2 as they were before the increment, compare the results, and increment both pointers as a side effect. By the time the return expression is evaluated, they point to something different from what you compared.

Keep in mind that implementing strcmp to always return 1 or 0 will make it useless for sorting a list of strings! To be usable for that, it needs to return a negative, zero, or positive value (for example -1/0/+1) according to the order of the two strings.
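
As a rough illustration of the sorting point, the sketch below feeds a three-way comparison to qsort. Everything here is an assumption made for the example: compare_strings, compare_entries and sort_names are hypothetical names, and qsort from stdlib.h is used only for demonstration, since the asker's environment has no C library and would need its own sort routine.

```c
#include <assert.h>
#include <stdlib.h>

/* Three-way comparison: negative, zero, or positive, comparing as unsigned char. */
static int compare_strings(const char *a, const char *b)
{
    const unsigned char *s1 = (const unsigned char *)a;
    const unsigned char *s2 = (const unsigned char *)b;

    while (*s1 && *s1 == *s2) {
        s1++;
        s2++;
    }
    return (*s1 > *s2) - (*s1 < *s2);   /* yields -1, 0, or +1 */
}

/* qsort passes pointers to the array elements, i.e. pointers to char pointers. */
static int compare_entries(const void *lhs, const void *rhs)
{
    const char *const *a = lhs;
    const char *const *b = rhs;
    return compare_strings(*a, *b);
}

/* Sorts an array of string pointers in ascending order. */
static void sort_names(const char **names, size_t count)
{
    qsort(names, count, sizeof names[0], compare_entries);
}
```

With a comparison that only ever returns 0 or 1, the sort routine is never told that one string comes before another, so it cannot establish an order; the sign of the result is what carries that information.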


半衾梦 2024-12-17 09:28:52

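int strcmp(const char* a, const char* b){
    for(;;++a,++b){
        /* One string has ended: equal only if both ended, otherwise the longer one is greater. */
        if(*a == '\0' || *b == '\0')
            return (*a == *b)? 0 : *a != '\0' ? 1 : -1;
        /* First mismatch: order by unsigned character value. */
        if(*a != *b) return (unsigned char)(*a) > (unsigned char)(*b) ? 1 : -1;
    }
}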