Need to know when no data appears between two token separators using strtok()
I am trying to tokenize a string, but I need to know exactly when no data is seen between two tokens. For example, when tokenizing the string "a,b,c,,,d,e" I need to know about the two empty slots between 'c' and 'd', which I am unable to find out simply using strtok(). My attempt is shown below:
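char arr_fields[num_of_fields][MAX_FIELD_LEN]; /* one buffer per field; MAX_FIELD_LEN is an assumed size */
char delim[] = ",\n";
char *tok;
int i;

tok = strtok(line, delim); /* line contains the data */
for (i = 0; i < num_of_fields; i++, tok = strtok(NULL, delim))
{
    if (tok)
        snprintf(arr_fields[i], MAX_FIELD_LEN, "%s", tok);
    else
        snprintf(arr_fields[i], MAX_FIELD_LEN, "%s", "-");
}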
Executing the above code on the example string puts the characters a, b, c, d, e into the first five elements of arr_fields, which is not desirable. I need each character to land at a specific index of the array, i.e. if a field is missing between two characters, the gap should be recorded as is.
7.21.5.8 The strtok function
The standard says of strtok that each call starts by searching for the first character that is not contained in the separator string. From this we can read that you cannot use strtok as a solution to your specific problem, since it will treat any sequential characters found in delims as a single delimiter.
Am I doomed to weep in silence, or can somebody help me out?
You can easily implement your own version of strtok that does what you want; see the snippets at the end of this post.
strtok_single makes use of strpbrk(char const* src, const char* delims), which returns a pointer to the first occurrence of any character in delims that is found in the null-terminated string src. If no matching character is found, the function returns NULL.
strtok_single
sample use
output
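A minimal sketch of what such a strtok_single could look like, assuming the same calling convention as strtok (pass the string on the first call, NULL afterwards). The static state variable and the print_fields demo helper are illustrative assumptions, not necessarily the code this answer originally showed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Like strtok, but every delimiter ends a token, so two adjacent
   delimiters yield an empty string instead of being merged. */
char *strtok_single(char *str, const char *delims)
{
    static char *src = NULL;   /* where the next call resumes (not thread-safe) */
    char *p, *ret;

    assert(delims != NULL);
    if (str != NULL)
        src = str;             /* first call: start on a new string */
    if (src == NULL)
        return NULL;           /* previous call consumed the last token */

    ret = src;
    p = strpbrk(src, delims);  /* first delimiter at or after src, or NULL */
    if (p != NULL) {
        *p = '\0';             /* terminate the token in place */
        src = p + 1;           /* resume just past the delimiter */
    } else {
        src = NULL;            /* last token: nothing left afterwards */
    }
    return ret;
}

/* Hypothetical demo helper: prints each field, "-" for an empty one. */
void print_fields(char *line, const char *delims)
{
    char *tok;

    for (tok = strtok_single(line, delims); tok != NULL;
         tok = strtok_single(NULL, delims))
        printf("[%s]\n", *tok ? tok : "-");
}
```

Called on a writable copy of "a,b,c,,,d,e" with "," as the delimiter set, this yields seven tokens: a, b, c, two empty strings, d and e. Like strtok, it modifies the input and keeps hidden state between calls.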
You can't use strtok() if that's what you want. The man page notes that a sequence of two or more contiguous delimiter bytes in the parsed string is considered to be a single delimiter. Therefore it is just going to jump from c to d in your example.
You're going to have to parse the string manually or perhaps search for a CSV parsing library that would make your life easier.
Lately I was looking for a solution to the same problem and found this thread.
You can use strsep(). The manual describes it as a replacement for strtok(), introduced because the latter cannot handle empty fields.
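As a rough sketch of how strsep() might be applied to the example string: the helper below collects pointers to every field, including empty ones. split_line and its parameters are hypothetical names chosen for illustration. Note that strsep() is a BSD/glibc extension rather than part of ISO C, so the feature-test macro may or may not be needed on your platform.

```c
#define _DEFAULT_SOURCE 1  /* asks glibc to declare strsep(); assumed harmless elsewhere */
#include <assert.h>
#include <string.h>

/* Splits line in place on any character in delims and stores up to
   max_fields pointers in fields[]. Empty fields come back as "".
   Returns the number of fields stored. */
size_t split_line(char *line, const char *delims, char *fields[], size_t max_fields)
{
    size_t n = 0;
    char *rest = line;  /* strsep advances this pointer and sets it to NULL at the end */
    char *tok;

    assert(delims != NULL);
    while (n < max_fields && (tok = strsep(&rest, delims)) != NULL)
        fields[n++] = tok;
    return n;
}
```

Because strsep() keeps its position in a caller-supplied pointer instead of hidden static state, the same loop can be run on several strings at once.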
As mentioned in this answer, you'll want to implement something like strtok yourself. I prefer using strcspn (as opposed to strpbrk), as it allows for fewer if statements:
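One way the strcspn idea could be sketched is shown below. This is an illustration under assumptions, not the code originally posted: next_field and its cursor parameter are hypothetical names, and the caller keeps the position in its own pointer rather than in static state.

```c
#include <assert.h>
#include <string.h>

/* Returns the next field starting at *cursor and advances *cursor past the
   delimiter that ended it. Returns NULL once the input is exhausted. */
char *next_field(char **cursor, const char *delims)
{
    char *field = *cursor;
    size_t len;

    assert(delims != NULL);
    if (field == NULL)
        return NULL;

    len = strcspn(field, delims);   /* length of the run containing no delimiter */
    if (field[len] != '\0') {
        field[len] = '\0';          /* cut the string at the delimiter */
        *cursor = field + len + 1;  /* continue right after it */
    } else {
        *cursor = NULL;             /* reached the end: this is the last field */
    }
    return field;
}
```

Since strcspn returns a length rather than a pointer that may be NULL, the only branch left is whether the field ended at a delimiter or at the end of the string.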
You could try using strchr to find the locations of the , symbols. Manually tokenize your string up to the delimiter you found (using memcpy or strncpy) and then call strchr again. This way you can see whether two or more commas are next to each other (strchr will return pointers whose difference equals 1), and you can write an if statement to handle that case.
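The strchr approach could look roughly like the sketch below, which copies each field into its own buffer and leaves the input string untouched. copy_fields and MAX_FIELD_LEN are assumed names and sizes for illustration. Adjacent commas show up as a field of length 0, which is the case the if statement mentioned above would handle.

```c
#include <assert.h>
#include <string.h>

#define MAX_FIELD_LEN 32  /* assumed capacity of each field buffer */

/* Copies the comma-separated fields of line into out[]. An empty field is
   stored as "". Returns the number of fields copied (at most max_fields). */
size_t copy_fields(const char *line, char out[][MAX_FIELD_LEN], size_t max_fields)
{
    size_t n = 0;
    const char *start = line;

    assert(line != NULL);
    while (n < max_fields) {
        const char *comma = strchr(start, ',');
        /* The field runs up to the comma, or to the end of the string. */
        size_t len = comma ? (size_t)(comma - start) : strlen(start);

        if (len >= MAX_FIELD_LEN)
            len = MAX_FIELD_LEN - 1;   /* truncate overlong fields */
        memcpy(out[n], start, len);
        out[n][len] = '\0';
        n++;

        if (comma == NULL)
            break;                     /* no more commas: that was the last field */
        start = comma + 1;             /* adjacent commas give len == 0 next time */
    }
    return n;
}
```

To record a placeholder such as "-" for a missing field, as in the question's attempt, the copy could be replaced by a check for len == 0.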