How do I fix the simplexml_load_file() "parser error: Entity not defined" problem?
I use PHP to generate XML files. I use the code below to escape special characters and avoid errors:
$str = str_ireplace(array('<','>','&','\'','"'),array('&lt;','&gt;','&amp;','&apos;','&quot;'),$str);
but parsing still fails:
simplexml_load_file() [function.simplexml-load-file] *[file name]* parser error : Entity 'nbsp' not defined in *[file name] [line]*
The text that triggers the error:
Dallas Dallas (&nbsp;) is the third-largest city in Texas and the ninth-largest in the United States.
In IE8, it seems to fail at (&nbsp;). So which symbols do I need to watch out for?
HTML-specific entities (in this case &nbsp;) are not valid XML entities, and that is what SimpleXML complains about: it reads the file as XML (not HTML) and finds entities that are not defined. You need to convert HTML entities back to their character representation first (you can use html_entity_decode() to do that).
Note that if you call htmlentities() on your string before saving it in the XML, that is the source of your problem: you are converting characters to their HTML entities, which SimpleXML does not recognize as XML entities.
If you have trouble understanding this, think of them as two different languages, like Spanish (HTML) and English (XML): a valid word in Spanish (&nbsp;) is not necessarily valid in English, no matter how similar the two languages are.
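A minimal sketch of the decode-then-escape approach described above might look like the following. The sample string (with an extra bare ampersand), the <city> element name, and the variable names are assumptions made for illustration; only html_entity_decode(), htmlspecialchars() and simplexml_load_string() are standard PHP functions.

```php
<?php
// Hypothetical input: text that still contains an HTML-only entity
// and a bare ampersand.
$str = 'Dallas (&nbsp;) is the third-largest city in Texas & the ninth-largest in the United States.';

// 1. Turn HTML entities (&nbsp;, &eacute;, ...) back into real UTF-8 characters.
$decoded = html_entity_decode($str, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// 2. Escape only the characters that are special in XML (<, >, &, ", ').
$escaped = htmlspecialchars($decoded, ENT_XML1 | ENT_QUOTES, 'UTF-8');

// 3. Build and parse the XML; no undefined entity is left for the parser to reject.
$xml = simplexml_load_string('<city>' . $escaped . '</city>');
```

Decoding first matters: escaping the raw string directly would turn &nbsp; into &amp;nbsp;, which parses but shows up as the literal text "&nbsp;" instead of a non-breaking space.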
Either get rid of it (you don't say where it comes from, so it's hard to give more specific advice), or wrap your HTML data in CDATA blocks so the parser ignores it.
You may also use htmlentities($str, ENT_XML1 | ENT_QUOTES), which produces only XML entities, not HTML ones (such as &ndash;, &laquo;, &raquo;, etc.).
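Both suggestions can be sketched roughly as follows. The sample strings and the <item> element name are hypothetical; ENT_XML1 requires PHP 5.4 or later, and the LIBXML_NOCDATA option is used here only so that the CDATA content comes back as plain text when the element is cast to a string.

```php
<?php
// Hypothetical HTML fragment that contains HTML-only entities.
$html = '<p>Dallas&nbsp;&ndash; Texas</p>';

// Option 1: wrap the fragment in CDATA so the parser does not interpret it.
// A literal "]]>" would end the section early, so split it across two sections.
$cdata = '<![CDATA[' . str_replace(']]>', ']]]]><![CDATA[>', $html) . ']]>';
$doc1  = simplexml_load_string('<item>' . $cdata . '</item>', 'SimpleXMLElement', LIBXML_NOCDATA);
// (string) $doc1 is the original fragment; &nbsp; stays as literal text.

// Option 2: htmlentities() with ENT_XML1 escapes only <, >, &, " and ',
// leaving characters such as the en dash and guillemets untouched.
$text    = "Dallas – «Texas» & 'more'";
$escaped = htmlentities($text, ENT_XML1 | ENT_QUOTES, 'UTF-8');
$doc2    = simplexml_load_string('<item>' . $escaped . '</item>');
```

Note that CDATA does not decode anything: a consumer of the XML receives the string "&nbsp;" verbatim and has to treat the content as HTML itself.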