XMLReader breaks on strange characters

Posted on 2024-12-11 01:04:13

Whenever XMLReader tries to parse the XML file I'm feeding it, it breaks on "½" and on a period that looks like this: "."

With both characters, whenever I try to delete them from the XML feed, the editor deletes the character in front of them first. So they behave like foreign characters, or characters in a different encoding.

What are my options for fixing this? I can't edit the XML file by hand every time. Thanks a lot.

Comments (1)

痴骨ら 2024-12-18 01:04:13

You have to fix the program or process that creates the "XML" file. (I put "XML" in quotes, because actually, you would like it to be an XML file, but it isn't one.) You might be able to patch or repair or recover the data, but that's not a long-term solution.

The anecdotal evidence suggests that the "½" character is stored as two bytes, which points to UTF-8, while the "é" character is stored as one byte, which points to ISO 8859-1. That means two different processes have written to the file, each using a different encoding. (Perhaps it was originally created in one encoding, and then modified using an editor that didn't know what the original encoding was.) That isn't going to work: an XML file has to use a single encoding throughout.
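If you need a stopgap while the producer gets fixed, something along these lines may help. It is only a sketch, and it rests on assumptions the question doesn't confirm: that the file really is a mix of UTF-8 and ISO 8859-1, and that any byte sequence that is valid UTF-8 was meant as UTF-8. The function name `decode_mixed` is made up for illustration; it is not part of any XML library.

```python
def decode_mixed(raw: bytes) -> str:
    """Decode bytes that mix UTF-8 and ISO 8859-1, preferring UTF-8.

    Assumption: a valid multi-byte UTF-8 sequence was intended as UTF-8;
    any other non-ASCII byte is treated as a single ISO 8859-1 character.
    """
    out = []
    i = 0
    while i < len(raw):
        b = raw[i]
        if b < 0x80:
            # Plain ASCII is identical in both encodings.
            out.append(chr(b))
            i += 1
            continue

        # Work out how long a UTF-8 sequence starting with this byte would be.
        if 0xC2 <= b <= 0xDF:
            length = 2
        elif 0xE0 <= b <= 0xEF:
            length = 3
        elif 0xF0 <= b <= 0xF4:
            length = 4
        else:
            length = 0  # cannot start a UTF-8 sequence

        if length:
            try:
                # Strict decoding rejects truncated or malformed sequences.
                out.append(raw[i:i + length].decode("utf-8"))
                i += length
                continue
            except UnicodeDecodeError:
                pass

        # Fall back: interpret this one byte as ISO 8859-1.
        out.append(bytes([b]).decode("latin-1"))
        i += 1
    return "".join(out)


# "½" as two UTF-8 bytes, "é" as one ISO 8859-1 byte, in the same document.
sample = b"<item>1\xc2\xbd caf\xe9</item>"
repaired = decode_mixed(sample)
utf8_bytes = repaired.encode("utf-8")  # consistent UTF-8, ready for a parser
```

The repaired bytes are only safe to parse if the XML declaration either names UTF-8 or omits the encoding. Also note the guess can be wrong: two ISO 8859-1 characters in a row can happen to form a valid UTF-8 sequence, so this recovers data on a best-effort basis and does not replace fixing whatever writes the file.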
