Using strtok() in C to tokenize a string twice

Posted 2024-10-09 13:19:45

I'm using strtok() in C to parse a CSV string. First I tokenize it just to find out how many tokens there are, so I can allocate an array of the correct size. Then I tokenize it again, using the same variable as in the first pass. On the second pass, though, strtok(NULL, ",") returns NULL even though there are still more tokens to parse. Can somebody tell me what I'm doing wrong?

char* tok;
int count = 0;
tok = strtok(buffer, ",");
while(tok != NULL) {
    count++;
    tok = strtok(NULL, ",");
}

//allocate array

tok = strtok(buffer, ",");
while(tok != NULL) {
    //do other stuff
    tok = strtok(NULL, ",");
}

So the second while loop always ends after the first token is found, even though there are more tokens. Does anybody know what I'm doing wrong?

风筝有风，海豚有海 2024-10-16 13:19:45

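strtok() modifies the string it operates on, replacing delimiter characters with null characters ('\0'). So if you want to tokenize the string more than once, you have to make a copy.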
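
A minimal sketch of the copy approach, assuming the goal of the first pass is only to count tokens. The helper name count_tokens and its parameters are hypothetical, not part of the question's code. It tokenizes a temporary copy, so the caller's buffer is still intact for the second strtok() pass:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in s without modifying s,
 * by running strtok() over a temporary heap copy.
 * Returns -1 if the copy cannot be allocated. */
int count_tokens(const char *s, const char *delims)
{
    size_t len;
    char *copy;
    char *tok;
    int count = 0;

    assert(s != NULL && delims != NULL);

    len = strlen(s) + 1;            /* include the terminating '\0' */
    copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, s, len);

    /* strtok() writes '\0' into copy, never into s */
    for (tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;

    free(copy);
    return count;
}
```

With this, the question's first loop could be replaced by something like count = count_tokens(buffer, ","), leaving buffer untouched for the loop that actually processes the tokens. The cost is one extra allocation and copy of the input.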

倾其所爱 2024-10-16 13:19:45

There's not necessarily a need to make a copy - strtok() does modify the string it's tokenizing, but in most cases that simply means the string is already tokenized if you want to deal with the tokens again.

Here's your program modified a bit to process the tokens after your first pass:

#include <stdio.h>
#include <string.h>

int main(void)
{
    int i;
    char buffer[] = "some, string with  ,  tokens";

    char* tok;
    int count = 0;
    tok = strtok(buffer, ",");
    while(tok != NULL) {
        count++;
        tok = strtok(NULL, ",");
    }

    // walk through the tokenized buffer again;
    // strtok() skipped any leading delimiters without overwriting them
    tok = buffer + strspn(buffer, ",");

    for (i = 0; i < count; ++i) {
        printf("token %d: \"%s\"\n", i+1, tok);
        if (i + 1 < count) {         // after the last token, stepping further would leave the buffer
            tok += strlen(tok) + 1;  // get the next token by skipping past the '\0'
            tok += strspn(tok, ","); //   then skipping any starting delimiters
        }
    }

    return 0;
}

Note that this is unfortunately trickier than I first posted - the call to strspn() needs to be performed after skipping the '\0' placed by strtok() since strtok() will skip any leading delimiter characters for the token it returns (without replacing the delimiter character in the source).
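
To make the note above concrete, here is a small sketch that shows what a buffer looks like after strtok() has run over it. The helper name picture_after_strtok and the choice of '$' as the stand-in for '\0' are assumptions made for illustration only:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: tokenizes buf in place with strtok(), then writes a
 * printable picture of the buffer into out, with every '\0' shown as '$'.
 * n is the full array size of buf, including its terminator;
 * out must have room for n + 1 bytes. */
void picture_after_strtok(char *buf, size_t n, const char *delims, char *out)
{
    size_t i;

    assert(buf != NULL && out != NULL && delims != NULL && n > 0);

    /* Run the tokenizer to completion; only its side effects matter here. */
    if (strtok(buf, delims) != NULL) {
        while (strtok(NULL, delims) != NULL)
            ;
    }

    for (i = 0; i < n; i++)
        out[i] = (buf[i] == '\0') ? '$' : buf[i];
    out[n] = '\0';
}
```

For the input "a,,b" (array size 5) the picture comes out as "a$,b$": the comma that ended the first token was overwritten with '\0', but the second comma, which strtok() merely skipped as a leading delimiter of the next token, is still in the buffer. That leftover comma is what the strspn() call steps over.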
