C: creating an array of strings from a delimited source string
What would be an efficient way of converting a delimited string into an array of strings in C (not C++)? For example, I might have:
char *input = "valgrind --leak-check=yes --track-origins=yes ./a.out";
The source string will always use a single space as the delimiter, and I would like a malloc'ed array of malloc'ed strings, char *myarray[], such that:
myarray[0]=="valgrind"
myarray[1]=="--leak-check=yes"
...
Edit: I have to assume that there is an arbitrary number of tokens in the input string, so I can't just limit it to 10 or something.
I've attempted a messy solution with strtok and a linked list I implemented, but valgrind complained so much that I gave up.
(If you're wondering, this is for a basic Unix shell I'm trying to write.)
Comments (5)
What about something like:
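A minimal sketch of one way this could look, assuming the single-space delimiter from the question. It is not necessarily what this answer had in mind; the names split_on_space and free_tokens are made up for illustration. It counts the spaces first, allocates the pointer array once, then copies each token into its own malloc'ed string:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees a NULL-terminated token array. */
void free_tokens(char **tokens)
{
    if (tokens == NULL)
        return;
    for (size_t i = 0; tokens[i] != NULL; i++)
        free(tokens[i]);
    free(tokens);
}

/* Hypothetical helper: splits `input` on single spaces into a
 * NULL-terminated, malloc'ed array of malloc'ed strings.
 * Returns NULL on allocation failure. */
char **split_on_space(const char *input, size_t *count_out)
{
    assert(input != NULL);

    /* Pass 1: every space separates two tokens, so tokens = spaces + 1. */
    size_t count = 1;
    for (const char *p = input; *p != '\0'; p++)
        if (*p == ' ')
            count++;

    /* One extra slot for a NULL terminator (convenient for execvp). */
    char **tokens = malloc((count + 1) * sizeof *tokens);
    if (tokens == NULL)
        return NULL;
    for (size_t i = 0; i <= count; i++)
        tokens[i] = NULL;

    /* Pass 2: copy each token into its own allocation. */
    const char *start = input;
    for (size_t i = 0; i < count; i++) {
        const char *end = strchr(start, ' ');
        size_t len = end ? (size_t)(end - start) : strlen(start);

        tokens[i] = malloc(len + 1);      /* +1 for the terminating '\0' */
        if (tokens[i] == NULL) {
            free_tokens(tokens);
            return NULL;
        }
        memcpy(tokens[i], start, len);
        tokens[i][len] = '\0';
        start = end ? end + 1 : start + len;
    }

    if (count_out != NULL)
        *count_out = count;
    return tokens;
}
```

Because the array is NULL-terminated, it could be handed to execvp as argv in a shell; release it with free_tokens when done.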
If you have all of the input in input to begin with, then you can never have more than strlen(input) + 1 tokens. If you don't allow "" as a token, then you can never have more than (strlen(input) + 1) / 2 tokens. So unless input is huge, you can safely size the pointer array from that bound up front.
As a further optimization, you can keep input around, replace the spaces with \0, and put pointers into the input buffer into myarray[]. This requires input to be a writable buffer rather than a string literal. There is no need for a separate malloc for each token unless for some reason you need to free them individually.
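The in-place idea can be sketched roughly as follows. This is an illustration under assumptions, not a definitive implementation: split_in_place is a made-up name, the caller is assumed to pass a writable buffer, and the pointer array is deliberately over-allocated from the worst-case token count.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: tokenizes `buf` in place by overwriting each
 * space with '\0'. The returned array points INTO `buf`, so only the
 * array itself is malloc'ed. `buf` must be writable and must outlive
 * the returned array. Returns NULL on allocation failure. */
char **split_in_place(char *buf, size_t *count_out)
{
    assert(buf != NULL);

    /* Worst case: every character is a space, giving strlen + 1
     * (possibly empty) tokens. One more slot for the NULL terminator. */
    size_t max_tokens = strlen(buf) + 1;
    char **tokens = malloc((max_tokens + 1) * sizeof *tokens);
    if (tokens == NULL)
        return NULL;

    size_t n = 0;
    tokens[n++] = buf;                 /* first token starts at the buffer */
    for (char *p = buf; *p != '\0'; p++) {
        if (*p == ' ') {
            *p = '\0';                 /* terminate the current token */
            tokens[n++] = p + 1;       /* next token starts after the space */
        }
    }
    tokens[n] = NULL;

    if (count_out != NULL)
        *count_out = n;
    return tokens;
}
```

With this layout a single free(tokens) releases everything the function allocated; the tokens themselves live and die with the input buffer. For a string literal like the one in the question, copy it into a char array (or a malloc'ed buffer) first.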
Did you remember to malloc an extra byte for the terminating null that marks the end of each string?
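To make the off-by-one concrete, here is a small sketch of copying a token of known length; copy_token is a hypothetical name, not something from the question. A token of len characters needs len + 1 bytes, and the terminator has to be written explicitly:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'ed copy of the first `len`
 * characters of `start`, or NULL on allocation failure. */
char *copy_token(const char *start, size_t len)
{
    assert(start != NULL);

    char *copy = malloc(len + 1);   /* len characters + 1 for '\0' */
    if (copy == NULL)
        return NULL;

    memcpy(copy, start, len);
    copy[len] = '\0';               /* without this, the string is unterminated */
    return copy;
}
```

Allocating only len bytes and then writing the terminator (or calling strcpy) is a one-byte heap overflow, which is the kind of invalid write valgrind reports.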
From the strsep(3) manpage on OSX:
Edited for an arbitrary number of tokens:
Or something close to that. The above may not work, but if not, it's not far off. Building a linked list would be more efficient than continually calling realloc, but that's really beside the point - the point is how best to make use of strsep.
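For reference, a strsep loop that handles an arbitrary number of tokens might look like the sketch below. It is an assumption-laden illustration rather than the manpage example or this answer's code: split_with_strsep is a made-up name, the buffer is assumed writable, and the array capacity is doubled on demand so that realloc is called only occasionally. Note that strsep is a BSD/glibc extension, not ISO C.

```c
#define _DEFAULT_SOURCE   /* asks glibc to declare strsep(); BSD/macOS declare it by default */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits writable `buf` on spaces with strsep().
 * Returned pointers point into `buf`; only the array is malloc'ed.
 * Returns a NULL-terminated array, or NULL on allocation failure. */
char **split_with_strsep(char *buf, size_t *count_out)
{
    assert(buf != NULL);

    size_t cap = 4, n = 0;
    char **tokens = malloc(cap * sizeof *tokens);
    if (tokens == NULL)
        return NULL;

    char *rest = buf;   /* strsep advances this past each token */
    char *tok;
    while ((tok = strsep(&rest, " ")) != NULL) {
        if (n + 1 >= cap) {             /* keep one slot for the NULL terminator */
            cap *= 2;
            char **bigger = realloc(tokens, cap * sizeof *tokens);
            if (bigger == NULL) {
                free(tokens);           /* realloc failed: old block is still ours */
                return NULL;
            }
            tokens = bigger;
        }
        tokens[n++] = tok;
    }
    tokens[n] = NULL;

    if (count_out != NULL)
        *count_out = n;
    return tokens;
}
```

Unlike strtok, strsep returns an empty token for each pair of adjacent delimiters, which is harmless here because the input is guaranteed to use single spaces.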
Looking at the other answers, the code is compact enough that it may look complex to a beginner in C, so I thought I would add this one for beginners. It might be easier to actually parse the string yourself instead of using strtok, something like this:
I slightly modified the code to make it easier to follow. The only string function I used was strncpy. Sure, it is a bit long-winded, but it reallocates the array of strings dynamically with realloc instead of using a hard-coded MAX_ARGS, which would leave the double pointer hogging memory when only 3 or 4 entries would do. That keeps the memory usage small. The parsing itself is simple: it iterates over the input with a pointer and uses isspace to find token boundaries. When a space is encountered, it reallocs the double pointer and mallocs enough bytes, based on the offset, to hold the string.
Notice how the triple pointer is used in the resizeptr function. In fact, I thought this would serve as an excellent example of a simple C program covering pointers, realloc, malloc, passing by reference, and the basic elements of parsing a string.
Hope this helps,
Best regards,
Tom.
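A rough sketch of the approach described above, written from the description only, so the details are assumptions: resize_args stands in for the resizeptr function mentioned in the answer, and parse_args and free_args are made-up names. It walks the input with a pointer, uses isspace to find token boundaries, grows the array with realloc through a triple pointer, and copies each token with strncpy.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for resizeptr: grows *args to `new_count`
 * pointers. The triple pointer lets us update the caller's char **. */
int resize_args(char ***args, size_t new_count)
{
    char **bigger = realloc(*args, new_count * sizeof **args);
    if (bigger == NULL)
        return -1;          /* *args is untouched and still valid */
    *args = bigger;
    return 0;
}

/* Frees the first `n` strings and then the array itself. */
void free_args(char **args, size_t n)
{
    for (size_t i = 0; i < n; i++)
        free(args[i]);
    free(args);
}

/* Returns a NULL-terminated array of malloc'ed strings, or NULL on
 * allocation failure. */
char **parse_args(const char *input, size_t *count_out)
{
    assert(input != NULL);

    char **args = NULL;
    size_t n = 0;
    const char *p = input;

    while (*p != '\0') {
        while (isspace((unsigned char)*p))      /* skip separators */
            p++;
        if (*p == '\0')
            break;

        const char *start = p;                  /* token begins here */
        while (*p != '\0' && !isspace((unsigned char)*p))
            p++;
        size_t len = (size_t)(p - start);       /* offset = token length */

        /* Room for this token plus the trailing NULL entry. */
        if (resize_args(&args, n + 2) != 0) {
            free_args(args, n);
            return NULL;
        }
        args[n] = malloc(len + 1);
        if (args[n] == NULL) {
            free_args(args, n);
            return NULL;
        }
        strncpy(args[n], start, len);
        args[n][len] = '\0';    /* strncpy does not add the terminator here */
        args[++n] = NULL;
    }

    if (args == NULL) {         /* no tokens: return an empty, valid list */
        if (resize_args(&args, 1) != 0)
            return NULL;
        args[0] = NULL;
    }
    if (count_out != NULL)
        *count_out = n;
    return args;
}
```

Growing by one entry per token keeps the sketch simple; doubling the capacity instead would cut down on realloc calls for long command lines.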