Character encoding for Thai characters

Posted 2024-07-30 05:55:37 · 125 words · 7 views · 0 comments

I need to read an RTF file that contains Thai characters and write its contents to a text file. I tried the TIS-620, MS874, and ISO-8859-11 encodings, but the Thai characters do not display properly when I open the resulting output file in Notepad or TextPad. The same file displays fine in WordPad. Please guide me.


Comments (2)

垂暮老矣 2024-08-06 05:55:37

Code that solved the problem (originally posted in a comment, added here to make it readable):

import java.io.FileInputStream;
import java.io.InputStream;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.rtf.RTFEditorKit;

// Open the RTF file; try-with-resources closes the stream when done
try (InputStream in = new FileInputStream(fileName)) {
    // Create a blank styled document
    DefaultStyledDocument styledDoc = new DefaultStyledDocument();
    // Create an RTF editor kit
    RTFEditorKit rtfKit = new RTFEditorKit();
    // Populate the blank styled document with the contents of the RTF file
    rtfKit.read(in, styledDoc, 0);
    // Print the contents of the RTF document as plain text
    System.out.println(styledDoc.getText(0, styledDoc.getLength()));
}
夜雨飘雪 2024-08-06 05:55:37

From a little Googling, I don't think Notepad handles every character encoding. Could you try re-encoding the characters as UTF-8 (or some other Unicode format), since Notepad does handle that correctly? You'll want to write a BOM (byte order mark) at the start of the file so that Notepad detects the encoding.

I also stumbled across a tool for converting Thai files into various other encodings.

Finally, is it actually a requirement that the files can be opened in Notepad? It's not as if Notepad is the last word in text editing.
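
To make the UTF-8 with BOM suggestion concrete, here is a minimal sketch of writing a string to a file that way. The class name `Utf8BomWriter`, the method `writeWithBom`, and the default file name `thai-output.txt` are made up for illustration and are not from the original posts; the text passed in would be whatever plain text was extracted from the RTF document.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, named here only for illustration.
public class Utf8BomWriter {
    // The UTF-8 byte order mark is the three bytes EF BB BF.
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Writes the text to the target path as UTF-8, preceded by a BOM. */
    public static void writeWithBom(Path target, String text) throws IOException {
        try (OutputStream out = Files.newOutputStream(target);
             Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            // Write the raw BOM bytes first, before any characters pass through the writer.
            out.write(UTF8_BOM);
            // The writer encodes the characters as UTF-8 and is flushed when it is closed.
            writer.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed output file name; pass a different path as the first argument if needed.
        Path target = Paths.get(args.length > 0 ? args[0] : "thai-output.txt");
        // Sample Thai text (the greeting "sawasdee"), given as Unicode escapes so the
        // source file itself stays plain ASCII.
        writeWithBom(target, "\u0E2A\u0E27\u0E31\u0E2A\u0E14\u0E35");
    }
}
```

The encoding is passed explicitly rather than left to the platform default, since the default on a Windows machine may well be a legacy code page. Whether a given editor needs the BOM to detect UTF-8 depends on the editor and its version, so treat the BOM as a compatibility aid rather than a requirement.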
