Tokenizing a string twice with strtok() in C

Posted 2024-10-09 13:19:45 · 523 words · 8 views · 0 comments

I'm using strtok() in C to parse a CSV string. First I tokenize it just to find out how many tokens there are, so I can allocate an array of the correct size. Then I tokenize the same buffer again, using the same variable as before. On that second pass, strtok(NULL, ",") returns NULL even though there are still more tokens to parse. Can somebody tell me what I'm doing wrong?

char* tok;
int count = 0;
tok = strtok(buffer, ",");
while(tok != NULL) {
    count++;
    tok = strtok(NULL, ",");
}

//allocate array

tok = strtok(buffer, ",");
while(tok != NULL) {
    //do other stuff
    tok = strtok(NULL, ",");
}

So the second while loop always ends after the first token is found, even though there are more tokens. Does anybody know what I'm doing wrong?

Comments (3)

风筝有风,海豚有海 2024-10-16 13:19:45

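strtok() modifies the string it operates on, replacing delimiter characters with null characters ('\0'). After the first pass, the buffer therefore appears to end right after the first token, which is why the second pass finds only one. If you want to tokenize it more than once, you'll have to make a copy first.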
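As a rough sketch of the copy approach, the counting pass can run on a scratch copy so the original buffer is untouched for the real pass. The helper name count_tokens_on_copy is made up for illustration and is not from the original post.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts tokens by running strtok() on a scratch copy,
 * so the caller's string is left unmodified for a later tokenizing pass.
 * Returns -1 if the scratch copy cannot be allocated. */
int count_tokens_on_copy(const char *input, const char *delims)
{
    size_t size = strlen(input) + 1;      /* include the terminating '\0' */
    char *scratch = malloc(size);
    if (scratch == NULL)
        return -1;
    memcpy(scratch, input, size);

    int count = 0;
    for (char *tok = strtok(scratch, delims); tok != NULL;
         tok = strtok(NULL, delims))
        count++;

    free(scratch);                        /* only the copy was chopped up */
    return count;
}
```

With this, the code in the question could call count_tokens_on_copy(buffer, ",") for the first pass, allocate the array, and then run its existing strtok(buffer, ",") loop once on the real buffer.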

倾其所爱 2024-10-16 13:19:45

There's not necessarily a need to make a copy - strtok() does modify the string it's tokenizing, but in most cases that simply means the string is already tokenized if you want to deal with the tokens again.

Here's your program modified a bit to process the tokens after your first pass:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    int i;
    char buffer[] = "some, string with  ,  tokens";

    char* tok;
    int count = 0;

    // first pass: count the tokens (strtok() writes a '\0' after each one)
    tok = strtok(buffer, ",");
    while (tok != NULL) {
        count++;
        tok = strtok(NULL, ",");
    }

    // walk through the tokenized buffer again
    tok = buffer + strspn(buffer, ",");  // skip any leading delimiters, as strtok() did

    for (i = 0; i < count; ++i) {
        printf("token %d: \"%s\"\n", i + 1, tok);
        if (i + 1 < count) {             // after the last token there is nothing left to read
            tok += strlen(tok) + 1;      // get the next token by skipping past the '\0'
            tok += strspn(tok, ",");     //   then skipping any starting delimiters
        }
    }

    return 0;
}

Note that this is unfortunately trickier than I first posted - the call to strspn() needs to be performed after skipping the '\0' placed by strtok() since strtok() will skip any leading delimiter characters for the token it returns (without replacing the delimiter character in the source).
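The same skipping rule can also be used to count tokens without modifying the buffer at all, which sidesteps the problem for the counting pass. This is a sketch of an alternative, not the answerer's code; the function name count_tokens is an assumption, and it relies only on the standard strspn() and strcspn().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the tokens strtok() would return for `s`
 * with the given delimiter set, without writing to `s`. */
size_t count_tokens(const char *s, const char *delims)
{
    size_t count = 0;

    s += strspn(s, delims);           /* skip leading delimiters */
    while (*s != '\0') {
        count++;
        s += strcspn(s, delims);      /* skip over the token itself */
        s += strspn(s, delims);       /* skip the delimiters that follow it */
    }
    return count;
}
```

Because runs of delimiters are skipped as a unit, empty fields are not counted, which matches how strtok() treats them.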

说谎友 2024-10-16 13:19:45

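Use strsep() instead: it takes the address of your string pointer and updates that pointer itself, so you pass the same argument on every call rather than the string on the first call and NULL afterwards, as strtok() requires. The only catch with strsep() is that if the string was allocated on the heap, you need to keep a pointer to the beginning so you can free it later. Note that strsep() is a BSD/glibc extension, not part of standard C.

char *strsep(char **stringp, const char *delim);

char buffer[] = "a,b,c";  /* sample input */
char *string = buffer;    /* strsep() advances this pointer; it becomes NULL after the last token */
char *token;
while ((token = strsep(&string, ",")) != NULL) {
    /* use token */
}

strtok() is what your usual intro-to-C course teaches; use strsep(), it's much better. :-)
No more getting confused by "I still have to pass in NULL because strtok() lost my position."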
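The heap caveat can be sketched as follows, assuming a platform that provides strsep() (glibc or a BSD libc). The helper name count_fields_strsep and the _DEFAULT_SOURCE define are assumptions for illustration. Two things worth knowing: strsep() also overwrites delimiters with '\0', so a second pass over the same buffer still needs a copy, and unlike strtok() it returns an empty field between adjacent delimiters.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes strsep(), which is not ISO C */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits a heap copy of `input` with strsep() and
 * returns the number of fields, counting empty ones.
 * Returns -1 if the copy cannot be allocated. */
int count_fields_strsep(const char *input, const char *delims)
{
    size_t size = strlen(input) + 1;
    char *copy = malloc(size);
    if (copy == NULL)
        return -1;
    memcpy(copy, input, size);

    char *cursor = copy;      /* strsep() moves this; it ends up NULL */
    char *field;
    int count = 0;
    while ((field = strsep(&cursor, delims)) != NULL)
        count++;

    free(copy);               /* free the saved start, not the cursor */
    return count;
}
```

Keeping `copy` separate from `cursor` is the point of the caveat: by the end of the loop `cursor` is NULL, so it can no longer be used to release the allocation.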
