Why has the beginning of my string disappeared?

Posted 2024-07-25 13:42:32

In the following C++ code, I realised that gcount() was returning a larger number than I wanted, because getline() consumes the final newline character from the input stream but doesn't store it in the buffer.

What I still don't understand is the program's output, though. For input "Test\n", why do I get " est\n"? How come my mistake affects the first character of the string rather than adding unwanted rubbish onto the end? And how come the program's output is at odds with the way the string looks in the debugger ("Test\n", as I'd expect)?

#include <fstream>
#include <vector>
#include <string>
#include <iostream>

using namespace std;

int main()
{
    const int bufferSize = 1024;
    ifstream input( "test.txt", ios::in | ios::binary );

    vector<char> vecBuffer( bufferSize );
    input.getline( &vecBuffer[0], bufferSize );
    string strResult( vecBuffer.begin(), vecBuffer.begin() + input.gcount() );
    cout << strResult << "\n";

    return 0;
}


Comments (4)

☆獨立☆ 2024-08-01 13:42:32

I've also duplicated this result, Windows Vista, Visual Studio 2005 SP2.

When I figure out what the heck is happening, I'll update this post.

edit: Okay, there we go. The problem (and the different results people are getting) comes from the \r. What happens is you call input.getline and put the result in vecBuffer. The getline function strips off the \n, but leaves the \r in place.

You then transfer vecBuffer to a string variable, but use the gcount function from input, meaning you get one char too many: gcount counts the \n that getline extracted from the stream, but vecBuffer holds a terminating null character in its place.

The resulting strResult is:

-       strResult   "Test"
        [0] 84 'T'  char
        [1] 101 'e' char
        [2] 115 's' char
        [3] 116 't' char
        [4] 13 '␍'  char
        [5] 0   char

So "Test" is printed, followed by a carriage return (which puts the cursor back at the start of the line), a null character (which the console draws as a blank, overwriting the T), and finally the \n, which correctly puts the cursor on the new line.

So you either have to strip out the \r, or write a function that gets the string length directly from vecBuffer, checking for null characters.
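
A minimal sketch of that second option, assuming the buffer was filled by istream::getline(char*, n), which null-terminates whatever it stores. The helper name lineFromBuffer is hypothetical and not part of the original code. It ignores gcount() entirely, and it also drops a trailing \r so the result is the same for \n and \r\n files.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): builds a string from a
// buffer that istream::getline(char*, n) has filled, without using gcount().
std::string lineFromBuffer(const std::vector<char>& buffer)
{
    // getline() writes a terminating '\0' after the stored characters,
    // so the first '\0' marks the end of the line.
    std::vector<char>::const_iterator end =
        std::find(buffer.begin(), buffer.end(), '\0');
    std::string line(buffer.begin(), end);

    // In binary mode a Windows "\r\n" ending leaves a trailing '\r' behind.
    if (!line.empty() && line[line.size() - 1] == '\r')
        line.erase(line.size() - 1);

    return line;
}
```

In the question's program this would replace the strResult constructor call: string strResult = lineFromBuffer( vecBuffer );. One caveat: a line that legitimately contains a null byte would be cut short at that byte.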

-小熊_ 2024-08-01 13:42:32

I've duplicated Tommy's problem on a Windows XP Pro Service Pack 2 system with the code compiled using Visual Studio 2005 SP2 (actually, it says "Version 8.0.50727.879"), built as a console project.

If my test.txt file contains just "Test" and a CR, the program spits out " est" (note the leading space) when run.

If I had to take a wild guess, I'd say that this version of the implementation has a bug where it is treating the Windows newline character like it should be treated in Unix (as a "go to the front of the same line" character), and then it wipes out the first character to hold part of the next prompt or something.


Update:
After playing with it a bit, I'm positive that is what is going on. If you look at strResult in the debugger, you will see that it copied over a decimal 13 value near the end. That's CR ('\r'), the first half of the Windows "\r\n" line ending, which a console treats as "return to the beginning of the line". If I instead change your constructor to read:

string strResult( vecBuffer.begin(), vecBuffer.begin() + input.gcount() - 1 );

...(so that the last counted character, the null terminator that sits right after the CR, isn't copied) then it prints out "Test" like you'd expect.
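
One caution about subtracting 1 unconditionally: gcount() only includes the delimiter when getline() actually extracted one. If the last line of the file has no trailing newline, or the buffer fills up first, the subtraction would drop a real character. A hedged sketch of a guarded version follows; readLine is a hypothetical helper name, and bufferSize is assumed to be greater than zero.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>   // only needed for the usage example below
#include <string>
#include <vector>

// Hypothetical helper: reads one line with istream::getline() and returns
// exactly the characters that were stored in the buffer.
std::string readLine(std::istream& input, std::size_t bufferSize = 1024)
{
    std::vector<char> buffer(bufferSize);
    input.getline(&buffer[0], static_cast<std::streamsize>(bufferSize));

    std::streamsize count = input.gcount();

    // The delimiter is counted only if getline() stopped because it found
    // one. Stopping at end-of-file sets eofbit; stopping on a full buffer
    // (or extracting nothing) sets failbit. In those cases keep the count.
    if (count > 0 && !input.eof() && !input.fail())
        --count;

    return std::string(buffer.begin(), buffer.begin() + count);
}
```

For example, with std::istringstream in("Test\r\n"); the call readLine(in) returns the five characters "Test\r". The CR is still there, so it has to be stripped separately if it is unwanted.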

苏别ゝ 2024-08-01 13:42:32

I am pretty sure that the T is actually getting written and then overwritten. Running the same program in an rxvt window (cygwin) produces the expected output. You can do a couple things. If you get rid of the ios::binary in your open it will autoconvert \r\n to \n and things will work like you expect.

You can also open up your text file in the binary editor by clicking on the little down arrow on the open file dialog's open button and selecting open with...->Binary Editor. This will let you look at your file and confirm that it does indeed have \r\n and not just \n.

Edit:
I redirected the output to a file and it is writing out:

Test\r\0\r\n

The reason you are getting the \0 is that gcount returns 6 (6 characters were removed from the stream), but the final delimiter is not copied to the buffer; a '\0' is stored in its place. When you construct the string, you are actually telling it to include the '\0'. std::string has no problem with the embedded 0 and outputs it as asked. Some shells apparently output a blank character, overwriting the T, while others don't do anything and the output looks okay, but it is still probably wrong because it contains the embedded '\0'.

cout << strResult.c_str() << "\n";

Changing the last line to this will stop on the \0 and also get the output expected.
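
The explanation above can be checked without a file or a console. The sketch below is an illustration under stated assumptions: buildLikeQuestion is a hypothetical name, and a std::istringstream holding "Test\r\n" stands in for test.txt opened in binary mode. The function builds the string exactly the way the question does.

```cpp
#include <cstring>
#include <istream>
#include <sstream>   // only needed for the usage example below
#include <string>
#include <vector>

// Hypothetical helper: repeats the question's construction on any stream.
std::string buildLikeQuestion(std::istream& input)
{
    const int bufferSize = 1024;
    std::vector<char> vecBuffer(bufferSize);
    input.getline(&vecBuffer[0], bufferSize);

    // gcount() counts the extracted '\n', so this range also takes in the
    // '\0' that getline() stored after the line's characters.
    return std::string(vecBuffer.begin(), vecBuffer.begin() + input.gcount());
}
```

Given std::istringstream in("Test\r\n"); and std::string s = buildLikeQuestion(in);, s.size() is 6 with s[4] == '\r' and s[5] == '\0', while std::strlen(s.c_str()) is 5 because c_str() is read only up to the first null character.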

百合的盛世恋 2024-08-01 13:42:32

I tested your code using Visual Studio 2005 SP2 on Windows XP Pro SP3 (32-bit), and everything works fine.
