C++ - string.compare issues when output to a text file differs from console output?

Posted on 2024-08-11 06:47:43

I'm trying to find out whether two strings I have are the same, for the purpose of unit testing. The first is a predefined string, hard-coded into the program. The second is read in from a text file with an ifstream using std::getline(), and then taken as a substring. Both values are stored as C++ strings.

When I output both of the strings to the console using cout for testing, they both appear to be identical:

ThisIsATestStringOutputtedToAFile
ThisIsATestStringOutputtedToAFile

However, string.compare reports that they are not equal. When output to a text file, the two strings appear as follows:

ThisIsATestStringOutputtedToAFile
T^@h^@i^@s^@I^@s^@A^@T^@e^@s^@t^@S^@t^@r^@i^@n^@g^@O^@u^@t^@p^@u^@t^@t^@e^@d^@T^@o^@A^@F^@i^@l^@e

I'm guessing this is some kind of encoding problem, and if I were working in my native language (good old C#), I wouldn't have too many problems. As it is, I'm working with C/C++ and Vi, and frankly don't really know where to go from here! I've tried looking at converting to/from ANSI/Unicode, and also at removing the odd characters, but I'm not even sure whether they really exist.

Thanks in advance for any suggestions.

EDIT
Apologies, this is my first time posting here. The code below is how I'm going through the process:

ifstream myInput;
ofstream myOutput;

myInput.open(fileLocation.c_str()); 
myOutput.open("test.txt");

TEST_ASSERT(myInput.is_open() == 1);

string compare1 = "ThisIsATestStringOutputtedToAFile";
string fileBuffer;

std::getline(myInput, fileBuffer);
string compare2 = fileBuffer.substr(400,100);

cout << compare1 + "\n";
cout << compare2 + "\n";
myOutput << compare1 + "\n";
myOutput << compare2 + "\n";
cin.get();

myInput.close();
myOutput.close();

TEST_ASSERT(compare1.compare(compare2) == 0);

Comments (3)

皇甫轩 2024-08-18 06:47:43

How did you create the content of myInput? I would guess that this file was created in a two-byte encoding. You can use a hex dump to verify this theory, or use a different editor to create the file.

The simplest way would be to launch cmd.exe and type

echo "ThisIsATestStringOutputtedToAFile" > test.txt

UPDATE:

If you cannot change the encoding of the myInput file, you can try to use wide-chars in your program. I.e. use wstring instead of string, wifstream instead of ifstream, wofstream, wcout, etc.
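
To check the two-byte-encoding theory from inside the test itself, a small byte dump of the string read from the file can help. This is only a sketch: hexDump and containsNul are hypothetical helper names, not part of the original code. If the dump of compare2 shows a 00 after every letter, the file is most likely UTF-16.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: render every byte of s as two hex digits,
// separated by single spaces, so invisible bytes such as NUL show up.
std::string hexDump(const std::string& s) {
    std::string out;
    char buf[4];
    for (std::size_t i = 0; i < s.size(); ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned char>(s[i]));
        if (i != 0) out += ' ';
        out += buf;
    }
    return out;
}

// Hypothetical helper: true if s holds at least one NUL byte, which a
// console usually does not display but std::string::compare does see.
bool containsNul(const std::string& s) {
    return s.find('\0') != std::string::npos;
}
```

Printing hexDump(compare1) and hexDump(compare2) next to each other should make any difference visible even when cout shows two identical-looking lines.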

狂之美人 2024-08-18 06:47:43

The following works for me and writes the text pasted below into the file. Note the '\0' character embedded into the string.

#include <iostream>
#include <fstream>
#include <sstream>

int main()
{
    std::istringstream myInput("0123456789ThisIsATestStringOutputtedToAFile\x0 12ou 9 21 3r8f8 reohb jfbhv jshdbv coerbgf vibdfjchbv jdfhbv jdfhbvg jhbdfejh vbfjdsb vjdfvb jfvfdhjs jfhbsd jkefhsv gjhvbdfsjh jdsfhb vjhdfbs vjhdsfg kbhjsadlj bckslASB VBAK VKLFB VLHBFDSL VHBDFSLHVGFDJSHBVG LFS1BDV LH1BJDFLV HBDSH VBLDFSHB VGLDFKHB KAPBLKFBSV LFHBV YBlkjb dflkvb sfvbsljbv sldb fvlfs1hbd vljkh1ykcvb skdfbv nkldsbf vsgdb lkjhbsgd lkdcfb vlkbsdc xlkvbxkclbklxcbv");
    std::ofstream myOutput("test.txt");
    //std::ostringstream myOutput;

    std::string str1 = "ThisIsATestStringOutputtedToAFile";
    std::string fileBuffer;

    std::getline(myInput, fileBuffer);
    std::string str2 = fileBuffer.substr(10,100);

    std::cout << str1 + "\n";
    std::cout << str2 + "\n";
    myOutput << str1 + "\n";
    myOutput << str2 + "\n";

    std::cout << str1.compare(str2) << '\n';

    //std::cout << myOutput.str() << '\n';
    return 0;
}

Output:

ThisIsATestStringOutputtedToAFile
ThisIsATestStringOutputtedToAFile
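
One caveat worth checking in the snippet above: when a std::string is built from a plain string literal, the const char* constructor stops at the first '\0', so the text after \x0 probably never reaches myInput, which may be why the comparison succeeds there. A minimal sketch that keeps embedded NULs by passing the length explicitly (fromLiteral is a hypothetical helper, not something from the original post):

```cpp
#include <string>

// Hypothetical helper: build a std::string from a string literal without
// stopping at embedded '\0' bytes. N is the array size of the literal,
// which includes the terminating NUL, hence N - 1.
template <std::size_t N>
std::string fromLiteral(const char (&lit)[N]) {
    return std::string(lit, N - 1);
}
```

With a string built this way, a substring that still carries NUL bytes prints like the plain text on most consoles but no longer compares equal to it, which matches the behaviour described in the question.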
酒绊 2024-08-18 06:47:43

It turns out that the problem was that the file encoding of myInput was UTF-16, whereas the comparison string was UTF-8. Given the OS limitations I had for this project (Linux, C/C++ code), the way to convert between them was to use the iconv() functions. To stay compatible with the C++ strings I'd been using, I ended up saving the string to a new text file, then running iconv through the system() command.

system("iconv -f UTF-16 -t UTF-8 subStr.txt -o convertedSubStr.txt");

Reading the outputted string back in then gave me the string in the format I needed for the comparison to work properly.

NOTE
I'm aware that this is not the most efficient way to do this. If I'd had the luxury of a Windows environment and the windows.h libraries, things would have been a lot easier. In this case, though, the code was in some rarely used unit tests and didn't need to be highly optimized, so the creation, destruction and I/O of a few text files weren't an issue.
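
For anyone who would rather skip the temporary files, the same conversion can probably be done in-process with the POSIX iconv(3) API from <iconv.h>. The following is a rough sketch, not the code used in the project: convertEncoding is a hypothetical helper, and the encoding names passed to it (for example "UTF-16LE") are assumptions that have to match the actual file.

```cpp
#include <cerrno>
#include <iconv.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: convert `input` from encoding `from` to encoding `to`
// with POSIX iconv(3), avoiding temporary files and system().
std::string convertEncoding(const std::string& input, const char* from, const char* to) {
    iconv_t cd = iconv_open(to, from);  // note the order: target encoding first
    if (cd == reinterpret_cast<iconv_t>(-1)) {
        throw std::runtime_error("iconv_open failed: unsupported encoding pair");
    }

    std::string output;
    char buffer[256];
    char* in = const_cast<char*>(input.data());  // iconv only reads the input
    std::size_t inLeft = input.size();

    while (inLeft > 0) {
        char* out = buffer;
        std::size_t outLeft = sizeof buffer;
        std::size_t rc = iconv(cd, &in, &inLeft, &out, &outLeft);
        int err = errno;  // save before any other call can change it
        output.append(buffer, sizeof buffer - outLeft);
        // E2BIG only means the chunk buffer filled up, so loop again;
        // anything else is invalid or truncated input.
        if (rc == static_cast<std::size_t>(-1) && err != E2BIG) {
            iconv_close(cd);
            throw std::runtime_error("iconv failed: invalid or truncated input");
        }
    }

    iconv_close(cd);
    return output;
}
```

Something like convertEncoding(compare2, "UTF-16LE", "UTF-8") could then be compared against compare1 directly. Note that substr() on the raw bytes has to cut on an even offset and length, otherwise a two-byte code unit is split and the conversion will fail.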
