XML parsing error: undefined entity - special characters

Posted on 2024-10-14 09:20:39

Why does XML report an error for certain special characters while others are fine?

For instance, the following produces an error:

<?xml version="1.0" standalone="yes"?>
<Customers>
    <Customer>
        <Name>L&ouml;ic</Name>
    </Customer>
</Customers>

but this is fine:

<?xml version="1.0" standalone="yes"?>
<Customers>
    <Customer>
        <Name>&amp;</Name>
    </Customer>
</Customers>

By the way, I convert the special characters in PHP with htmlentities('Löic',ENT_QUOTES).

How can I get around this?

Thanks.

EDIT:

I found that it works fine if I use a numeric character reference such as L&#243;ic

Now I have to find out how to use PHP to convert special characters into numeric character references!
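The difference described in the question can be reproduced with PHP's DOM extension. The following is only a minimal sketch: parses() and $wrap are hypothetical helper names introduced here, and the sketch assumes the dom extension (libxml) is available.

```php
<?php
// Hypothetical helper: returns true when $xml is well-formed for PHP's DOM parser.
function parses(string $xml): bool
{
    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $ok = $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $ok !== false;
}

// Hypothetical helper: wraps a name in the document structure from the question.
$wrap = fn(string $name): string =>
    '<?xml version="1.0" standalone="yes"?><Customers><Customer><Name>'
    . $name . '</Name></Customer></Customers>';

var_dump(parses($wrap('L&ouml;ic')));  // HTML-only named entity, never declared
var_dump(parses($wrap('&amp;')));      // one of XML's predefined entities
var_dump(parses($wrap('L&#246;ic')));  // numeric character reference for U+00F6
```

Only the first document should be rejected, because ouml is the only name the parser has no declaration for.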


Comments (3)

°如果伤别离去 2024-10-21 09:20:39

There are five entities predefined in the XML specification: &amp;, &lt;, &gt;, &apos; and &quot;.

There are lots of entities defined in the HTML DTD.

You can't use the ones from HTML in generic XML.

You could use numeric references, but you would probably be better off just getting your character encodings straight, which basically boils down to:

  • Set your editor to save the data in UTF-8
  • If you process the data with a programming language, make sure it is UTF-8 aware
  • If you store the data in a database, make sure it is configured for UTF-8
  • When you serve up your document, make sure the HTTP headers specify that it is UTF-8 (in the case of XML, UTF-8 is the default, so not specifying anything is almost as good)
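Both options mentioned in this answer can be sketched in PHP. This is an illustration rather than the answerer's code: xml_escape_utf8() and xml_escape_numeric() are hypothetical names, the source file is assumed to be saved as UTF-8, and the second function assumes the mbstring extension is loaded.

```php
<?php
// Option 1: keep the text as UTF-8 and escape only XML's markup characters.
function xml_escape_utf8(string $text): string
{
    return htmlspecialchars($text, ENT_XML1 | ENT_QUOTES, 'UTF-8');
}

// Option 2: additionally turn every non-ASCII character into a numeric reference.
function xml_escape_numeric(string $text): string
{
    // Conversion map: [first code point, last code point, offset, mask].
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity(xml_escape_utf8($text), $map, 'UTF-8');
}

echo xml_escape_utf8('Löic & Co'), "\n";     // ö stays a literal UTF-8 character
echo xml_escape_numeric('Löic & Co'), "\n";  // ö becomes a decimal character reference
```

Unlike htmlentities(), neither function produces HTML-only names such as ouml, so the result can be placed in generic XML without a DTD.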

一梦等七年七年为一梦 2024-10-21 09:20:39

Because it is not a built-in entity; it is instead an entity that needs to be declared in a DTD.
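To illustrate what such a declaration might look like, the sketch below adds an internal DTD subset that declares ouml and then parses the document with PHP's DOMDocument. The DOCTYPE block is an assumption added for illustration and is not part of the asker's file.

```php
<?php
// The question's document, plus an internal DTD subset declaring the entity.
$xml = <<<'XML'
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE Customers [
    <!ENTITY ouml "&#246;">
]>
<Customers>
    <Customer>
        <Name>L&ouml;ic</Name>
    </Customer>
</Customers>
XML;

$doc = new DOMDocument();
// LIBXML_NOENT substitutes entity references while parsing.
// Be careful with this flag on untrusted input.
$doc->loadXML($xml, LIBXML_NOENT);
$name = $doc->getElementsByTagName('Name')->item(0)->textContent;
echo $name, "\n";
```

Declaring every HTML entity this way gets tedious quickly, which is one reason numeric references or plain UTF-8 are usually simpler.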

忆伤 2024-10-21 09:20:39

TLDR Solution

You can solve this problem with html_entity_decode() (Source: PHP.net), like so...

$xml_line = '<description>' . html_entity_decode($description) . '</description>';

Full, Working Demo Online

In this demo, I use &rsquo; and a line from the Tao Teh Ching to demonstrate the above use of html_entity_decode()...

$title = 'The name you can say isn&rsquo;t the real name.';
// Decode HTML-only entities such as &rsquo; into real characters.
$xml_title = html_entity_decode($title);
// Re-escape the characters XML treats as markup; & must come first.
$xml_title = str_replace(['&', '<', '>'], ['&amp;', '&lt;', '&gt;'], $xml_title);
$xml_line = '<title>' . $xml_title . '</title>';
print($xml_line);

Don't forget to escape the &, < and > characters again afterwards, though! Otherwise a decoded &amp;, &lt; or &gt; would break the XML.

Working Demo Sandbox

How Do You Know It Worked?

Want to verify that it worked? Head over to the W3C RSS Feed Validator and check that the output of the above code is accepted.
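As an alternative to re-escaping by hand, the decoded text can be given to PHP's DOM extension, which escapes text nodes when it serializes them. This is a sketch of a different technique from the one in the answer; xml_element_from_html_text() is a hypothetical helper name and the input string is made up for illustration.

```php
<?php
// Hypothetical helper: decode HTML entities, then let DOM handle XML escaping.
function xml_element_from_html_text(string $tag, string $htmlText): string
{
    // Turn named HTML entities such as &rsquo; into real UTF-8 characters.
    $text = html_entity_decode($htmlText, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    $doc = new DOMDocument('1.0', 'UTF-8');
    $node = $doc->createElement($tag);
    // Text nodes are escaped automatically when the node is serialized.
    $node->appendChild($doc->createTextNode($text));

    return $doc->saveXML($node);
}

$xml_line = xml_element_from_html_text('title', 'Tom &amp; Jerry isn&rsquo;t &lt;new&gt;');
print($xml_line);
```

Because the serializer decides what needs escaping, a decoded & or < cannot end up unescaped in the output.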
