Loading invalid XML in PHP DOM
I have an input XML file that is not well-formed (i.e. it has '&' instead of '&amp;').
When I try to load this XML using PHP DOM, $doc->load("file.xml"), it throws an error and stops parsing.
Is there any way to load this malformed XML? And no, I can't edit the source XML file.
I did try using $doc->loadHTML(), but it throws errors all over the place.
I wanted to know if there is a proper way to do this (like loading the file contents and changing them with a regex or something similar).
Comments (3)
Try setting $doc->validateOnParse = false; before loading your XML via $doc->loadHTML(...).
First, check that it's the & that's causing the error and not something else.
One way or another, you'll have to modify the XML to get it parsed. The HTML in loadHTML is loaded from a string, so can't you just replace the invalid characters with the correct ones?
If your installation supports the PHP Tidy extension (http://php.net/manual/en/book.tidy.php), you could try to clean it up with that, though in my experience it's far from foolproof.
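One way to do the check suggested above is to have libxml collect its errors instead of emitting warnings, then read them back. This is a minimal sketch only: the helper name xml_parse_errors and the sample XML string are made up for illustration, and the exact wording of the messages depends on the installed libxml2 version.

```php
<?php
// Hypothetical helper (not from the thread): returns the messages libxml
// reports when parsing the given XML string.
function xml_parse_errors(string $xml): array
{
    // Collect errors internally instead of raising PHP warnings.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $messages = [];
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Restore the previous error-handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// Illustrative input with a bare ampersand; a real script would read the file.
$errors = xml_parse_errors('<root><name>Tom & Jerry</name></root>');
print_r($errors);
```

If every reported message points at an entity or ampersand problem, escaping the stray & characters is probably enough; anything else in the list means the file has other issues as well.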
If you are sure that's the only thing making it not validate, then you could try loading the file into a string with the file_get_contents() function, then search and replace through the string to change the &'s into &amp;'s, then pass that string to SimpleXML, like $xml = simplexml_load_string($cleaned_string);
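The search-and-replace step could be sketched roughly as follows. The helper name escape_bare_ampersands and the sample string are assumptions for illustration; the regex only escapes an & that does not already start something shaped like an entity reference, so an existing &amp; or &#38; is left alone.

```php
<?php
// Hypothetical helper (not from the thread): escape '&' characters that do
// not already begin an entity or character reference.
function escape_bare_ampersands(string $xml): string
{
    // Negative lookahead skips &name;, &#123; and &#x1F; forms.
    $pattern = '/&(?![A-Za-z_][A-Za-z0-9._-]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)/';

    // preg_replace returns null on failure; fall back to the input unchanged.
    return preg_replace($pattern, '&amp;', $xml) ?? $xml;
}

// In practice the string would come from file_get_contents('file.xml').
$raw = '<root><name>Tom & Jerry</name><note>already &amp; escaped &#38; ok</note></root>';
$cleaned = escape_bare_ampersands($raw);

$doc = new DOMDocument();
$doc->loadXML($cleaned);

echo $doc->getElementsByTagName('name')->item(0)->textContent, "\n";
```

The same cleaned string can be passed to simplexml_load_string() instead of DOMDocument. Note that this sketch would also rewrite an & inside a CDATA section, and it does not help with named entities that XML does not define (for example &nbsp;), so it only fits files where stray ampersands are the sole problem.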