Getting the text portion of a node using PHP SimpleXML
Given the PHP code:
$xml = <<<EOF
<articles>
<article>
This is a link
<link>Title</link>
with some text following it.
</article>
</articles>
EOF;
function traverse($xml) {
    $result = "";
    foreach ($xml->children() as $x) {
        if ($x->count()) {
            $result .= traverse($x);
        }
        else {
            $result .= $x;
        }
    }
    return $result;
}
$parser = new SimpleXMLElement($xml);
traverse($parser);
I expected the function traverse() to return:
This is a link Title with some text following it.
However, it returns only:
Title
Is there a way to get the expected result using SimpleXML (obviously for the purpose of consuming the data, rather than just returning it as in this simple example)?
Comments (7)
There might be ways to achieve what you want using only SimpleXML, but in this case the simplest way is to use DOM. The good news is that if you're already using SimpleXML, you don't have to change anything, because DOM and SimpleXML are basically interchangeable.
Assuming your task is to iterate over each <article/> and get its content, your code will look like:
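The snippet that originally followed this answer is not preserved here. As a minimal sketch of the approach it describes, assuming the sample document from the question, each <article/> can be handed to dom_import_simplexml() and read through the DOM textContent property. The whitespace collapsing is an illustrative choice, not something the answer specifies.

```php
<?php
// Same sample document as in the question.
$xml = <<<EOF
<articles>
<article>
This is a link
<link>Title</link>
with some text following it.
</article>
</articles>
EOF;

$articles = new SimpleXMLElement($xml);

$texts = [];
foreach ($articles->article as $article) {
    // dom_import_simplexml() returns the DOMElement that backs this SimpleXML node.
    $node = dom_import_simplexml($article);

    // textContent concatenates all descendant text nodes in document order.
    // Collapsing runs of whitespace is an assumption made for readability.
    $texts[] = trim(preg_replace('/\s+/', ' ', $node->textContent));
}

print_r($texts);
```

Because textContent flattens everything into one string, this suits the case where only the combined text is needed, not the position of each child element.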
So, the simple answer to my question was: SimpleXML can't process this kind of XML. Use DOMDocument instead.
This example shows how to traverse the entire XML. It seems that DOMDocument will work with any XML, whereas SimpleXML requires the XML to be simple.
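The example referred to above did not survive on this page. The following is only a sketch of what a full DOMDocument traversal could look like, using the question's sample XML. The helper name collectText() is hypothetical, and skipping whitespace-only text nodes is an assumption.

```php
<?php
$xml = <<<EOF
<articles>
<article>
This is a link
<link>Title</link>
with some text following it.
</article>
</articles>
EOF;

// Hypothetical helper: walks the child nodes in document order and
// collects every non-empty piece of text, descending into elements.
function collectText(DOMNode $node): array
{
    $parts = [];
    foreach ($node->childNodes as $child) {
        if ($child instanceof DOMText) {
            // Text (and CDATA) nodes: keep them unless they are only whitespace.
            $text = trim($child->nodeValue);
            if ($text !== '') {
                $parts[] = $text;
            }
        } elseif ($child instanceof DOMElement) {
            // Element nodes: recurse so nested text keeps its position.
            $parts = array_merge($parts, collectText($child));
        }
    }
    return $parts;
}

$doc = new DOMDocument();
$doc->loadXML($xml);

$parts = collectText($doc->documentElement);
echo implode(' ', $parts), "\n";
```

Unlike SimpleXML's children(), childNodes includes text nodes, which is what lets the text before and after <link> appear in the right place.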
You could also load the XML using SimpleXML and then convert it to DOM using dom_import_simplexml(), as Josh said. This would be useful if you are using SimpleXML to filter nodes for parsing, e.g. using XPath.
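A sketch of that combination, under the assumption that the nodes of interest are the <article> elements from the question: SimpleXML's xpath() selects them, and dom_import_simplexml() exposes their mixed content. The '#text' label and the $segments structure are illustrative choices only.

```php
<?php
$xml = <<<EOF
<articles>
<article>
This is a link
<link>Title</link>
with some text following it.
</article>
</articles>
EOF;

$sxe = simplexml_load_string($xml);

$segments = [];
// Filter with SimpleXML's XPath support first...
foreach ($sxe->xpath('//article') as $article) {
    // ...then switch to DOM to see text nodes and elements in order.
    foreach (dom_import_simplexml($article)->childNodes as $child) {
        $text = trim($child->textContent);
        if ($text === '') {
            continue; // ignore whitespace-only nodes (an assumption)
        }
        // Label each segment so the consumer knows where it came from.
        $label = $child instanceof DOMElement ? $child->nodeName : '#text';
        $segments[] = [$label, $text];
    }
}

print_r($segments);
```

Keeping a label next to each piece of text makes it possible to treat the <link> content differently from the surrounding prose when consuming the data.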
However, I don't actually use SimpleXML, so for me that would be taking the long way around.
Thank you for all the help!
You can get the text node of a DOM element with SimpleXML just by treating it like a string:
However, this prints out:
...because the text node is treated as one block, and there is no way to tell where the child fits inside the text node. The child node is also added twice because of the other else {}, but you can just take that out.
Sorry if I didn't help much, but I don't think there's any way to find out where the child node fits in the text node unless the XML is consistent (but then, why not use tags?). If you know which element you want to strip the text out of, strip_tags() will work great.
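The code and output from this answer are missing here, so the following is a separate sketch of the two behaviors it describes, run against the question's sample XML. Casting an element to string gives only the text directly inside it, as one block, while strip_tags() applied to the serialized element keeps the child text in place. Whitespace collapsing is an illustrative assumption, and strip_tags() does not decode XML entities.

```php
<?php
$xml = <<<EOF
<articles>
<article>
This is a link
<link>Title</link>
with some text following it.
</article>
</articles>
EOF;

$articles = new SimpleXMLElement($xml);
$article  = $articles->article[0];

// String cast: only the element's own text, with the child's text left out
// and no marker showing where the child used to be.
$ownText = trim(preg_replace('/\s+/', ' ', (string) $article));

// The child's text has to be read separately.
$linkText = (string) $article->link;

// strip_tags() on the serialized element keeps everything in document order.
$stripped = trim(preg_replace('/\s+/', ' ', strip_tags($article->asXML())));

echo $ownText, "\n";
echo $linkText, "\n";
echo $stripped, "\n";
```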
This has already been answered, but casting to string (i.e. $sString = (string) $oSimpleXMLNode->TagName) has always worked for me.
Try this:
That's pretty much equivalent to:
Like @tandu said, it's not possible, but if you can modify your XML, this will work:
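The modified XML from this answer is not preserved. One possible restructuring, shown purely as an assumption, is to wrap each run of text in its own element (the <text> tag name is hypothetical), after which a recursive walk over children() sees every piece in order. The function traverseLeaves() is a variant of the question's traverse() that returns an array instead of a string.

```php
<?php
// Hypothetical restructuring: no mixed content, every run of text has a tag.
$xml = <<<EOF
<articles>
<article>
<text>This is a link</text>
<link>Title</link>
<text>with some text following it.</text>
</article>
</articles>
EOF;

// Variant of the question's traverse(): collects the text of every leaf element.
function traverseLeaves(SimpleXMLElement $xml): array
{
    $result = [];
    foreach ($xml->children() as $x) {
        if ($x->count()) {
            // Element with children: descend.
            $result = array_merge($result, traverseLeaves($x));
        } else {
            // Leaf element: its string value is its text.
            $result[] = (string) $x;
        }
    }
    return $result;
}

$parser = new SimpleXMLElement($xml);
$joined = implode(' ', traverseLeaves($parser));
echo $joined, "\n";
```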