C++: reading files that contain accented characters
Good day. I am working on a small project in which I need to read .txt files. Some of the files are in English and others in Spanish, so some of the text contains accented characters, and I must show it on the console with the accents intact.
I have no problem displaying accents on the console with setlocale(LC_CTYPE, "C");
My problem is that when I read the .txt file, the accents are not detected and strange characters are read instead.
My practice code is:
#include <iostream>
#include <locale.h>
#include <fstream>
#include <string>
using namespace std;

int main() {
    setlocale(LC_CTYPE, "C");
    ifstream file;
    string text;
    file.open("entryDisciplineESP.txt", ios::in);
    if (file.fail()) {
        cout << "The file could not be opened." << endl;
        exit(1);
    }
    while (!file.eof()) {
        getline(file, text);
        cout << text << endl;
    }
    cout << endl;
    system("Pause");
    return 0;
}
The .txt file in question contains:
Inicio
D1
Biatlón
S1
255
E1
Esprint 7,5 km (M); 100; 200
E2
Persecucion 10 km (M); 100; 200
ff
Obviously I'm having problems with 'ó', but I also have other .txt files containing other accented characters, so I need a solution that works for all of them.
While researching, I read about wstring and wifstream and tried to use them, but I could not get them to work.
I'm trying to achieve this on Windows, but I also need the solution to work on Linux. At the moment I'm using Dev-C++ 5.11.
Thank you very much in advance for your time and help.
Comments (1)
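Your error is in how you control your read loop. See: Why !.eof() inside a loop condition is always wrong. eof() only becomes true after a read has already failed, so the loop body runs one extra time on a line that was never read. Instead, control the loop with the stream state returned by your read function, that is, use the std::getline() call itself as the loop condition.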
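The character in question is nothing exotic: in a UTF-8 file, 'ó' is just the two bytes c3 b3 (both above the 7-bit ASCII range). Those bytes are easily held in a std::string and written with std::cout, which pass them through unchanged. Your full example, also fixing the issue described in Why is “using namespace std;” considered bad practice?, then needs only the corrected read loop and explicit std:: prefixes. On a terminal that already uses UTF-8, it prints the file exactly as saved, accents included.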
Windows 10 Using UTF-8 Codepage
The problem you experience when running the above code in the Windows 10 console (which I presume is where DevC++ launches its output) is that the default codepage (437 - OEM United States) does not support UTF-8 characters. To change the codepage to UTF-8, use 65001 - Unicode (UTF-8). See Code Page Identifiers.
To get the proper output after compiling under VS with the C++17 language standard, all that was needed was to change the codepage using chcp 65001 in the console. (You must also use a console font that can display the characters; mine is set to Lucida Console.) After setting the codepage, the Windows console (Command Prompt) displays the accents correctly.
Because DevC++ launches the console automatically, you additionally need to set the codepage programmatically. You can do that by calling SetConsoleOutputCP(65001), declared in <windows.h>, before writing any output. See SetConsoleOutputCP function. The analogous function for setting the input codepage is SetConsoleCP(UINT wCodePageID).
Setting the console to the default 437 codepage and then calling SetConsoleOutputCP(65001) to set the output codepage to UTF-8 gives the same result as chcp 65001.
Also, check the DevC++ project (or program) settings to see whether you can set the output codepage there. (I don't use it, so I don't know whether that is possible.)