C++ - string.compare issues when output to a text file differs from console output?
I'm trying to find out if two strings I have are the same, for the purpose of unit testing. The first is a predefined string, hard-coded into the program. The second is read in from a text file through an ifstream using std::getline(), and then taken as a substring. Both values are stored as C++ strings.
When I output both of the strings to the console using cout for testing, they both appear to be identical:
ThisIsATestStringOutputtedToAFile
ThisIsATestStringOutputtedToAFile
However, string.compare reports that they are not equal. When written to a text file, the two strings appear as follows:
ThisIsATestStringOutputtedToAFile
T^@h^@i^@s^@I^@s^@A^@T^@e^@s^@t^@S^@t^@r^@i^@n^@g^@O^@u^@t^@p^@u^@t^@t^@e^@d^@T^@o^@A^@F^@i^@l^@e
I'm guessing this is some kind of encoding problem, and if I were working in my native language (good old C#), I wouldn't have too many problems. As it is, I'm using C/C++ and Vi, and frankly don't really know where to go from here! I've tried looking at converting to/from ANSI/Unicode, and also at removing the odd characters, but I'm not even sure whether they really exist or not.
Thanks in advance for any suggestions.
EDIT
Apologies, this is my first time posting here. The code below is how I'm going through the process:
ifstream myInput;
ofstream myOutput;
myInput.open(fileLocation.c_str());
myOutput.open("test.txt");
TEST_ASSERT(myInput.is_open() == 1);
string compare1 = "ThisIsATestStringOutputtedToAFile";
string fileBuffer;
std::getline(myInput, fileBuffer);
string compare2 = fileBuffer.substr(400,100);
cout << compare1 + "\n";
cout << compare2 + "\n";
myOutput << compare1 + "\n";
myOutput << compare2 + "\n";
cin.get();
myInput.close();
myOutput.close();
TEST_ASSERT(compare1.compare(compare2) == 0);
How did you create the content of myInput? I would guess that this file was created in a two-byte encoding. You can use a hex dump to verify this theory, or use a different editor to create the file. The simplest way would be to launch cmd.exe and type
UPDATE:
If you cannot change the encoding of the myInput file, you can try to use wide characters in your program, i.e. use wstring instead of string, wifstream instead of ifstream, wofstream, wcout, etc.
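To check the two-byte-encoding theory from inside the program rather than with an external hex-dump tool, something like the following could be used. This is only a minimal sketch: hexDump is a hypothetical helper name, not part of the original code or of any library.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (an assumption for illustration): renders every byte
// of `s` as two lowercase hex digits separated by single spaces, so that
// embedded NUL bytes show up as "00" instead of being invisible.
std::string hexDump(const std::string& s) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char byte = static_cast<unsigned char>(s[i]);
        if (i != 0) {
            out += ' ';
        }
        out += digits[byte >> 4];    // high nibble
        out += digits[byte & 0x0F];  // low nibble
    }
    return out;
}
```

Printing hexDump(compare2) next to hexDump(compare1) should make any extra 00 bytes between the characters obvious, which would point to a UTF-16 style input file.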
The following works for me and writes the text pasted below into the file. Note the '\0' character embedded into the string.
Output:
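The effect of embedded '\0' characters can be illustrated with a small sketch. This is not the code the answer refers to; it is a separate, hedged example, and widenAsciiToUtf16LeBytes is a hypothetical helper that only mimics the byte layout seen in the question for plain ASCII text.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration): produces the byte
// layout an ASCII string has when stored as UTF-16LE, i.e. every character
// followed by a 0x00 byte. Only meaningful for ASCII input.
std::string widenAsciiToUtf16LeBytes(const std::string& ascii) {
    std::string out;
    out.reserve(ascii.size() * 2);
    for (char c : ascii) {
        out += c;
        out += '\0';  // std::string stores embedded NULs like any other byte
    }
    return out;
}
```

A std::string built this way is twice as long as the original, and compare() against the plain ASCII string returns a non-zero value. Many terminals simply do not render NUL bytes, which would explain why both strings look identical on the console while the text file shows ^@ between the letters.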
It turns out that the problem was that the file encoding of myInput was UTF-16, whereas the comparison string was UTF-8. The way to convert them with the OS limitations I had for this project (Linux, C/C++ code), was to use the iconv() functions. To keep the compatibility of the C++ strings I'd been using, I ended up saving the string to a new text file, then running iconv through the system() command.
Reading the outputted string back in then gave me the string in the format I needed for the comparison to work properly.
NOTE
I'm aware that this is not the most efficient way to do this. If I'd had the luxury of a Windows environment and the windows.h libraries, things would have been a lot easier. In this case though, the code was in some rarely used unit tests and didn't need to be highly optimized, so the creation, destruction and I/O operations of some text files weren't an issue.
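For comparison, the conversion can also be done in-process with the POSIX iconv API instead of going through temporary files and system(). The sketch below is not the approach described above, just one possible alternative; utf16leToUtf8 is a hypothetical helper name, and it assumes the input is little-endian UTF-16 without a byte-order mark.

```cpp
#include <iconv.h>

#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption for illustration): converts a buffer of
// UTF-16LE bytes to a UTF-8 std::string using iconv_open/iconv/iconv_close.
std::string utf16leToUtf8(const std::string& input) {
    iconv_t cd = iconv_open("UTF-8", "UTF-16LE");  // (to, from)
    if (cd == (iconv_t)-1) {
        throw std::runtime_error("iconv_open failed");
    }

    std::string inputCopy = input;  // iconv wants a non-const char**
    char* inPtr = &inputCopy[0];
    std::size_t inLeft = inputCopy.size();

    std::string output;
    char buffer[256];
    while (inLeft > 0) {
        char* outPtr = buffer;
        std::size_t outLeft = sizeof(buffer);
        const std::size_t rc = iconv(cd, &inPtr, &inLeft, &outPtr, &outLeft);
        const int err = errno;  // save before any other call can change it
        output.append(buffer, sizeof(buffer) - outLeft);
        // E2BIG only means the output buffer filled up; loop and continue.
        if (rc == (std::size_t)-1 && err != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv conversion failed");
        }
    }

    iconv_close(cd);
    return output;
}
```

With a helper like this, the unit test could compare compare1 against utf16leToUtf8(compare2) directly. Note that the substring offsets would then have to be counted in bytes of the UTF-16 data, two per ASCII character, before the conversion is applied.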