Mimicking Python's strip() function in C

Posted on 2024-08-05 13:20:44

I recently started a little toy project in C and have been scratching my head over the best way to mimic the strip() functionality of Python string objects.

Reading around about fscanf and sscanf, I found that the string is processed only up to the first whitespace that is encountered.

fgets doesn't help either, as I still have newlines sticking around.
I did try using strchr() to search for a whitespace character and explicitly setting the returned pointer to '\0', but that doesn't seem to work.

Comments (5)

腻橙味 2024-08-12 13:20:44

Python strings' strip method removes both trailing and leading whitespace. The two halves of the problem are very different when working on a C "string" (array of char, \0 terminated).

For trailing whitespace: set a pointer (or equivalently an index) to the existing trailing \0. Keep decrementing the pointer until it hits the start of the string or any non-white character; then set the \0 right after the point where this backwards scan stopped.

For leading whitespace: set a pointer (or equivalently an index) to the start of the string; keep incrementing the pointer until it hits a non-white character (possibly the trailing \0); then memmove the rest of the string so that the first non-white character goes to the start of the string (and similarly for everything following).
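
The two steps above could be sketched roughly as follows. This is only a minimal illustration, not a reference implementation: the name strip_in_place is hypothetical, and "whitespace" is assumed to mean whatever isspace() reports in the current locale.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: strips leading and trailing whitespace in place
 * and returns s, so the caller's original pointer stays valid. */
char *strip_in_place(char *s)
{
    char *start = s;
    char *end;

    if (s == NULL)
        return NULL;

    /* Trailing side: start at the '\0' and walk back over whitespace. */
    end = s + strlen(s);
    while (end > s && isspace((unsigned char)end[-1]))
        --end;
    *end = '\0';

    /* Leading side: walk forward to the first non-whitespace character
     * (this stops at the '\0' if the string is now empty). */
    while (isspace((unsigned char)*start))
        ++start;

    /* Shift the remainder, including the new '\0', to the front. */
    if (start != s)
        memmove(s, start, (size_t)(end - start) + 1);

    return s;
}
```

The casts to unsigned char are there because passing a negative char value to isspace() is undefined. The buffer must be writable, so calling this on a string literal is not valid.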

黯然 2024-08-12 13:20:44

There is no standard C implementation for a strip() or trim() function. That said, here's the one included in the Linux kernel:

char *strstrip(char *s)
{
        size_t size;
        char *end;

        size = strlen(s);

        if (!size)
                return s;

        end = s + size - 1;
        while (end >= s && isspace(*end))
                end--;
        *(end + 1) = '\0';

        while (*s && isspace(*s))
                s++;

        return s;
}

李不 2024-08-12 13:20:44

If you want to remove, in place, the final newline on a line, you can use this snippet:

size_t s = strlen(buf);
if (s && (buf[s-1] == '\n')) buf[--s] = 0;

To faithfully mimic Python's str.strip([chars]) method (the way I interpret its workings), you need to allocate space for a new string, fill the new string, and return it. After that, when you no longer need the stripped string, you need to free the memory it used so that there are no memory leaks.
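
A minimal sketch of that allocating approach might look like the following. The name strip_copy is hypothetical, and unlike Python the chars argument is assumed to be mandatory here (pass " \t\n\r\f\v" to strip ordinary whitespace).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of s with any leading
 * and trailing characters found in chars removed. Returns NULL if an
 * argument is NULL or allocation fails. The caller must free() the result. */
char *strip_copy(const char *s, const char *chars)
{
    const char *start, *end;
    size_t len;
    char *out;

    if (s == NULL || chars == NULL)
        return NULL;

    /* strspn() counts the leading characters that all belong to chars. */
    start = s + strspn(s, chars);

    /* Walk back from the terminator while the previous char is in chars. */
    end = s + strlen(s);
    while (end > start && strchr(chars, end[-1]) != NULL)
        --end;

    len = (size_t)(end - start);
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, start, len);
    out[len] = '\0';
    return out;
}
```

The original string is left untouched, which is the part that matches Python's behaviour; the price is that every result has to be passed to free() once it is no longer needed.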

Or you can use C pointers, modify the initial string, and achieve a similar result.
Suppose your initial string is "____forty two____\n" and you want to strip all underscores and the '\n':

____forty two____\n
^ ptr

If you move ptr to the 'f' and replace the first '_' after "two" with a '\0', the result is the same as Python's "____forty two____\n".strip("_\n"):

____forty two\0___\n
    ^ ptr

Again, this is not the same as Python. The string is modified in place, there is no second string, and you cannot revert the changes (the original string is lost).
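
The pointer approach described above could be sketched like this. The name strip_chars_in_place is hypothetical; the function only moves a pointer and writes a single '\0', so nothing is allocated or copied.

```c
#include <string.h>

/* Hypothetical helper: trims characters in chars from both ends of buf by
 * writing one '\0' and returning a pointer into buf. */
char *strip_chars_in_place(char *buf, const char *chars)
{
    char *end;

    if (buf == NULL || chars == NULL)
        return buf;

    /* Move the start pointer past leading characters found in chars. */
    buf += strspn(buf, chars);

    /* Back up over trailing characters found in chars, then cut there. */
    end = buf + strlen(buf);
    while (end > buf && strchr(chars, end[-1]) != NULL)
        --end;
    *end = '\0';

    return buf;
}
```

With a writable buffer holding "____forty two____\n" and chars set to "_\n", the returned pointer refers to the 'f' and the string it points to is "forty two". If the buffer came from malloc(), it is the original pointer that must be passed to free(), not the returned one.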

总以为 2024-08-12 13:20:44

I wrote C code to implement this function. I also wrote a few trivial tests to make sure my function does sensible things.

This function writes to a buffer you provide, and should never write past the end of the buffer, so it should not be prone to buffer overflow security issues.

Note: only Test() uses stdio.h, so if you just need the function, you only need to include ctype.h (for isspace()) and string.h (for strlen()).

// strstrip.c -- implement white space stripping for a string in C
//
// This code is released into the public domain.
//
// You may use it for any purpose whatsoever, and you don't need to advertise
// where you got it, but you aren't allowed to sue me for giving you free
// code; all the risk of using this is yours.



#include <ctype.h>
#include <stdio.h>
#include <string.h>



// strstrip() -- strip leading and trailing white space from a string
//
// Copies from sIn to sOut, writing at most lenOut characters.
//
// Returns number of characters in returned string, or -1 on an error.
// If you get -1 back, then nothing was written to sOut at all.

int
strstrip(char *sOut, unsigned int lenOut, char const *sIn)
{
    char const *pStart, *pEnd;
    unsigned int len;
    char *pOut;

    // if there is no room for any output, or a null pointer, return error!
    if (0 == lenOut || !sIn || !sOut)
        return -1;

    pStart = sIn;
    pEnd = sIn + strlen(sIn);  // points at the terminating '\0'

    // skip any leading whitespace
    while (*pStart && isspace((unsigned char)*pStart))
        ++pStart;

    // skip any trailing whitespace; pEnd stops one past the last kept char,
    // so it never moves before the start of the string
    while (pEnd > pStart && isspace((unsigned char)pEnd[-1]))
        --pEnd;

    pOut = sOut;
    len = 0;

    // copy into output buffer, leaving room for the terminator
    while (pStart < pEnd && len < lenOut - 1)
    {
        *pOut++ = *pStart++;
        ++len;
    }


    // ensure output buffer is properly terminated
    *pOut = '\0';
    return (int)len;
}


void
Test(const char *s)
{
    int len;
    char buf[1024];

    len = strstrip(buf, sizeof(buf), s);

    if (!s)
        s = "**null**";  // don't ask printf to print a null string
    if (-1 == len)
        *buf = '\0';  // don't ask printf to print garbage from buf

    printf("Input: \"%s\"  Result: \"%s\" (%d chars)\n", s, buf, len);
}


int
main(void)
{
    Test(NULL);
    Test("");
    Test(" ");
    Test("    ");
    Test("x");
    Test("  x");
    Test("  x   ");
    Test("  x y z   ");
    Test("x y z");
    return 0;
}

短暂陪伴 2024-08-12 13:20:44

This potential 'solution' is by no means as complete or thorough as the ones others have presented. It comes from my own toy project in C, a text-based adventure game that I'm working on with my 14-year-old son. If you're using fgets(), then strcspn() may just work for you as well. The sample code below is the beginning of an interactive console-based loop.

#include <stdio.h>
#include <string.h> // for strcspn()

int main(void)
{
    char input[64];
    puts("Press <q> to exit..");
    do {

        printf("> ");
        if (!fgets(input, sizeof input, stdin)) break; // stop on EOF or read error
        input[strcspn(input, "\n")] = 0; // fgets() keeps the '\n'; replace it with 0
        if (input[0] == '\0') continue;
        printf("You entered '%s'\n", input);

    } while (strcmp(input,"q")!= 0); // strcmp() returns 0 when input is "q"

    puts("Goodbye!");
    return 0;
}