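Removing trailing newline character from fgets() input
I am trying to get some data from the user and send it to another function in gcc. The code is something like this.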
printf("Enter your Name: ");
if (!(fgets(Name, sizeof Name, stdin) != NULL)) {
fprintf(stderr, "Error reading Name.\n");
exit(1);
}
However, I find that it has a newline \n character at the end. So if I enter John, it ends up sending John\n. How do I remove that \n and send a proper string?
Comments (15)
Perhaps the simplest solution uses one of my favorite little-known functions, strcspn(): buffer[strcspn(buffer, "\n")] = 0;

If you want it to also handle '\r' (say, if the stream is binary), pass "\r\n" as the second argument instead.

The function counts the number of characters until it hits a '\r' or a '\n' (in other words, it finds the first '\r' or '\n'). If it doesn't hit anything, it stops at the '\0' (returning the length of the string).

Note that this works fine even if there is no newline, because strcspn stops at a '\0'. In that case, the entire line simply replaces '\0' with '\0'.
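As a minimal sketch of the strcspn() approach described above, the one-liner can be wrapped in a small helper. The name strip_eol is hypothetical and not from the answer; it uses the "\r\n" variant so that both characters are handled.

```c
#include <string.h>

/* Hypothetical helper: cut the string at the first '\r' or '\n'.
 * If neither is present, strcspn() returns strlen(s), so the
 * assignment just overwrites the existing '\0' with '\0'. */
void strip_eol(char *s)
{
    s[strcspn(s, "\r\n")] = '\0';
}
```

Called right after a successful fgets(), this leaves the buffer unchanged when the line was too long to contain a newline.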
The elegant way:
The slightly ugly way:
The slightly strange way:
Note that the strtok function doesn't work as expected if the user enters an empty string (i.e. presses only Enter). It leaves the \n character intact.
There are others as well, of course.
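The snippets behind the three labels above are not shown here. As a hedged sketch, assuming the "slightly ugly" way is a strchr() search and the "slightly strange" way is the strtok() call the note refers to (both helper names are hypothetical), the difference on an empty line looks like this:

```c
#include <string.h>

/* Assumed strchr() variant: find the newline and overwrite it.
 * Returns 1 if a newline was removed, 0 if none was found
 * (for example, when the line was too long for the buffer). */
int strip_with_strchr(char *s)
{
    char *pos = strchr(s, '\n');
    if (pos == NULL)
        return 0;
    *pos = '\0';
    return 1;
}

/* Assumed strtok() variant: strtok() overwrites the first delimiter
 * that follows a token. For the input "\n" there is no token at all,
 * so nothing is written and the newline stays in place. */
void strip_with_strtok(char *s)
{
    strtok(s, "\n");
}
```

The strchr() version also reports whether a newline was present, which is one way to detect an over-long line.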
Below is a fast approach to remove a potential '\n' from a string saved by fgets().
It uses strlen(), with 2 tests.
Now use buffer and len as needed.
This method has the side benefit of a len value for subsequent code. It can easily be faster than strchr(Name, '\n'). Ref. YMMV, but both methods work.

buffer, from the original fgets(), will not contain "\n" under some circumstances:
A) The line was too long for buffer, so only the chars preceding the '\n' are saved in buffer. The unread characters remain in the stream.
B) The last line in the file did not end with a '\n'.

If the input has embedded null characters '\0' in it somewhere, the length reported by strlen() will not include the '\n' location.

Some other answers' issues:

strtok(buffer, "\n"); fails to remove the '\n' when buffer is "\n". From this answer - amended after this answer to warn of this limitation.

Code that tests buffer[len - 1] == '\n' without first checking len > 0 fails on rare occasions when the first char read by fgets() is '\0'. This happens when input begins with an embedded '\0'. Then buffer[len - 1] becomes buffer[SIZE_MAX], accessing memory certainly outside the legitimate range of buffer. This is something a hacker may try, or that may be found when foolishly reading UTF16 text files. This was the state of an answer when this answer was written. Later a non-OP edited it to include code like this answer's check for "".

sprintf(buffer,"%s",buffer); is undefined behavior: Ref. Further, it does not save any leading, separating or trailing whitespace. Now deleted.

[Edit due to good later answer] There are no problems with the one-liner buffer[strcspn(buffer, "\n")] = 0; other than performance as compared to the strlen() approach. Performance in trimming is usually not an issue, given that the code is doing I/O - a black hole of CPU time. Should the following code need the string's length, or be highly performance conscious, use this strlen() approach. Otherwise, strcspn() is a fine alternative.
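The strlen() approach with its 2 tests can be sketched as follows. The names buffer and len come from the answer; wrapping them in a function called trim_newline_len is an assumption made for illustration.

```c
#include <string.h>

/* Remove one trailing '\n', if present, and return the new length.
 * Test 1 (len > 0) guards against an empty string, which fgets() can
 * produce when the input starts with an embedded '\0'.
 * Test 2 checks that the last character really is a newline. */
size_t trim_newline_len(char *buffer)
{
    size_t len = strlen(buffer);
    if (len > 0 && buffer[len - 1] == '\n') {
        buffer[--len] = '\0';
    }
    return len;
}
```

Returning len gives the caller the string length without a second pass over the buffer.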
The direct way to remove the '\n' from the fgets output, if every line has a '\n':
Otherwise:
For single '\n' trimming,
for multiple '\n' trimming,
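The code for both cases is not shown here. As a hedged sketch of the "multiple" case, assuming it means stripping every newline at the end of the string (the function name is hypothetical):

```c
#include <string.h>

/* Strip every trailing '\n', not just the last one.
 * Newlines in the middle of the string are left alone. */
void trim_all_trailing_newlines(char *s)
{
    size_t len = strlen(s);
    while (len > 0 && s[len - 1] == '\n') {
        s[--len] = '\0';
    }
}
```

Replacing the while with an if gives the single-newline case.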
My newbie way ;-) Please let me know if that's correct. It seems to be working for all my cases:
The steps to remove the newline character in perhaps the most obvious way:

1. Determine the length of the string inside NAME by using strlen(), header string.h, and store it in sl. Note that strlen() does not count the terminating \0.
2. Check whether the string begins with, or only includes, one \0 character (empty string). In this case sl would be 0, since strlen(), as said above, doesn't count the \0 and stops at the first occurrence of it:
3. Check whether the last character of the string is a newline character '\n'. If this is the case, replace \n with a \0. Note that index counts start at 0, so we need to use NAME[sl - 1]:

Note that if you only pressed Enter at the fgets() string request (the string content consisted only of a newline character), the string in NAME will be an empty string thereafter.

4. Steps 2. and 3. can be combined into a single if-statement by using the logical operator &&:

If you would rather have a function that applies this technique to fgets output strings in general, without retyping it every time, here is fgets_newline_kill:
In your provided example, it would be:

Note that this method does not work if the input string has embedded \0s in it. In that case strlen() would only return the number of characters up to the first \0. But this isn't a common situation, since most string-reading functions usually stop at the first \0 and take the string up to that null character.

Aside from the question itself: try to avoid double negations that make your code less clear: if (!(fgets(Name, sizeof Name, stdin) != NULL)) {}. You can simply do if (fgets(Name, sizeof Name, stdin) == NULL) {}.
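The body of fgets_newline_kill is not shown here. The sketch below follows the numbered steps and keeps the names fgets_newline_kill and sl from the text; its exact signature and the read_line wrapper are assumptions added for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Steps 1 to 4 in one place (signature assumed). */
void fgets_newline_kill(char a[])
{
    size_t sl = strlen(a);               /* step 1: length without the '\0' */
    if (sl > 0 && a[sl - 1] == '\n') {   /* steps 2 and 3 combined with && */
        a[sl - 1] = '\0';
    }
}

/* Hypothetical wrapper: read one line from `in` and strip its newline.
 * Returns 0 on success, -1 if fgets() reports end of input or an error. */
int read_line(FILE *in, char *buf, int size)
{
    if (fgets(buf, size, in) == NULL)
        return -1;
    fgets_newline_kill(buf);
    return 0;
}
```

With the question's buffer this would be called as read_line(stdin, Name, sizeof Name), using the plain == NULL test instead of the double negation.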
In general, rather than trimming data that you don't want, avoid writing it in the first place. If you don't want the newline in the buffer, don't use fgets. Instead, use getc or fgetc or scanf. Perhaps something like:
Note that this particular approach will leave the newline unread, so you may want to use a format string like "%255[^\n]%*c" to discard it (e.g., sprintf(fmt, "%%%zu[^\n]%%*c", sizeof Name - 1);), or perhaps follow the scanf with a getchar().
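The snippet this answer refers to is not shown here. As a hedged sketch, assuming it used a %[^\n] scanset, the helper below (read_name is a hypothetical name, and it takes a FILE * rather than reading stdin directly) reads a line without ever storing the newline. One point worth knowing is that %[ fails to match on an empty line, so that case is handled separately.

```c
#include <stdio.h>

/* Hypothetical helper: read up to 255 non-newline characters into name
 * (which must hold at least 256 bytes), then discard the rest of the
 * line, including the newline. Returns 1 if a name was read, 0 for an
 * empty line, and EOF at end of input. */
int read_name(FILE *in, char name[256])
{
    int rc = fscanf(in, "%255[^\n]", name);
    if (rc == EOF)
        return EOF;
    if (rc == 0)
        name[0] = '\0';   /* empty line: the scanset matched nothing */

    /* Consume the remainder of the line so the next read starts fresh. */
    int c;
    while ((c = getc(in)) != '\n' && c != EOF)
        ;
    return rc;
}
```

Draining the line in a loop, rather than with a single getchar(), also covers input longer than 255 characters.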
If using POSIX getline() is an option - not neglecting its security issues, and if you are willing to deal with pointers - you can avoid string functions, since getline returns the number of characters. Something like below:
Note: The [security issues] with getline shouldn't be neglected, though.
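The example this answer refers to is not shown here. A minimal sketch, assuming a POSIX system where getline() is available; the helper name read_line_alloc is hypothetical. It uses the count returned by getline() to locate the newline, so no strlen() or strchr() call is needed.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: read one line with POSIX getline() and strip the
 * trailing '\n' using the returned length. Returns a malloc'ed string
 * that the caller must free(), or NULL at end of input or on error. */
char *read_line_alloc(FILE *in)
{
    char *line = NULL;
    size_t cap = 0;
    ssize_t n = getline(&line, &cap, in);
    if (n < 0) {
        free(line);   /* getline may have allocated even on failure */
        return NULL;
    }
    if (n > 0 && line[n - 1] == '\n')
        line[n - 1] = '\0';
    return line;
}
```

Because getline() grows the buffer to fit the line, input length is bounded only by available memory, which is one reason to be careful with untrusted input.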
To expand on the answers by @Jerry Coffin and @Tim Čas:

The strchr version is by design much faster than the strcspn version (and the strlen versions are likely the fastest of all).

The internals of strcspn have to iterate through the "\n" string and, if reasonably implemented, it only does that once and stores the string length somewhere. Then while searching, it also has to use a nested for loop going through the "\n" string.

Ignoring things like word size that a library-quality implementation of these functions would take into account, naive implementations may look like this:

In the case of strchr, there are two branches per character: one searching for the null terminator and the other comparing the current character with the one searched for.

In the case of strcspn, it either has to pre-calculate the size of s2 as in my example, or alternatively iterate through it while looking for null as well as the search key. The latter is essentially just what strchr does, so the inner loop could have been replaced with strchr. No matter how we implement it, there will be a lot of extra branching.

An attentive language lawyer might also spot the absence of restrict in the strcspn standard library definition, meaning that the compiler is not allowed to assume that s1 and s2 are different strings. This blocks some optimizations too.

The strlen version will be faster than both, since strlen only needs to check for null termination and nothing else. Though as mentioned in the answer by @chux - Reinstate Monica, there are some situations where it won't work, so it is slightly more brittle than the other versions.

The root of the problem is the bad API of the fgets function - if it had been implemented better back in the day, it would have returned a size corresponding to the number of characters actually read, which would have been great. Or alternatively a pointer to the last character read, like strchr. Instead the standard library wasted the return value by returning a pointer to the first character in the string passed, which is mildly useful.
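The naive implementations this answer refers to are not shown here. A sketch of what they might look like, using s1 and s2 as the text does; the names my_strchr and my_strcspn are placeholders, and neither function is meant to match a real library's code.

```c
#include <stddef.h>
#include <string.h>

/* Naive strchr: two branches per character. */
char *my_strchr(const char *s, int c)
{
    while (*s != '\0') {        /* branch 1: end of string? */
        if (*s == (char)c)      /* branch 2: is this the character? */
            return (char *)s;
        s++;
    }
    return NULL;  /* simplification: searching for '\0' itself is not handled */
}

/* Naive strcspn: pre-computes the length of s2, then runs a nested loop
 * over s2 for every character of s1. */
size_t my_strcspn(const char *s1, const char *s2)
{
    size_t s2_length = strlen(s2);
    size_t i;
    for (i = 0; s1[i] != '\0'; i++) {
        for (size_t j = 0; j < s2_length; j++) {
            if (s1[i] == s2[j])
                return i;
        }
    }
    return i;   /* no match: i equals strlen(s1) */
}
```

With s2 equal to "\n" the inner loop runs once per character, yet it still adds a loop test and an index comparison that the strchr sketch does not need.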
Tim Čas' one-liner is amazing for strings obtained by a call to fgets, because you know they contain at most one newline, at the end.
If you are in a different context and want to handle strings that may contain more than one newline, you might be looking for strrspn. It is not POSIX, meaning you will not find it on all Unices. I wrote one for my own needs.
For those looking for a Perl chomp equivalent in C, I think this is it (chomp only removes the trailing newline).
The strrcspn function:
The function below is part of a string processing library I am maintaining on Github. It removes unwanted characters from a string, exactly what you want
An example usage could be
You may want to check other available functions, or even contribute to the project :)
https://github.com/fnoyanisi/zString
You should give it a try. This code basically loops through the string until it finds the '\n'. When it's found, the '\n' is replaced by the null character terminator '\0'.
Note that you are comparing characters and not strings in this line, so there's no need to use strcmp():
since you will be using single quotes and not double quotes. Here's a link about single vs double quotes if you want to know more
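The loop this answer describes is not shown here. A minimal sketch of it, assuming a plain index-based walk over the string (the function name cut_at_newline is hypothetical):

```c
#include <stddef.h>

/* Walk the string one char at a time and stop at the first '\n'
 * or at the end of the string, whichever comes first. */
void cut_at_newline(char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] == '\n') {   /* single quotes: a char comparison, no strcmp() */
            s[i] = '\0';
            break;
        }
    }
}
```

Writing "\n" in double quotes instead would compare a char against a pointer to a string literal, which is not what is intended.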
This is my solution. Very simple.