Dom4j detach node, Jython
I am using Dom4j to detach a node, like below:
<div name="divName">
    Some Text Here
    <span>Some Text Here</span>
</div>
I am selecting the div node by name and then using the detach method to remove it:
xpathValue = "//*[contains(@name, 'divName')]"
xpath = dom.createXPath(xpathValue)
if xpath is not None:
    nodes = xpath.selectNodes(dom)
    if len(nodes) > 0:
        for node in nodes:
            # detach() removes the node together with everything inside it
            node.detach()
This removes the div fine, but I noticed that it also removes the elements and text within that div. What I am looking to achieve is removing the div without removing the elements and text inside it, resulting in this:
Some Text Here
<span>Some Text Here</span>
Is it possible to achieve this with dom4j? If not, any suggestions on how to go about this?
Cheers
Eef
Update:
@alamar
I have achieved what I wanted by taking your code and editing it a little, and this is what I have come up with:
xpathValue = "//*[contains(@name, 'divName')]"
xpath = dom.createXPath(xpathValue)
if xpath is not None:
    nodes = xpath.selectNodes(dom)
    if len(nodes) > 0:
        for node in nodes:
            parent = node.getParent()
            nodeContents = node.content()
            if len(nodeContents) > 0:
                for subNode in nodeContents:
                    # add() appends the copy at the end of the parent's content
                    parent.add(subNode.clone().detach())
            node.detach()
This seems to work, but it adds the nodes to the end of the parent node in the situation below:
<div name="parent">
    <div name="divName">
        Some Text Here
        <span>Some Text Here</span>
    </div>
    <div name="keep"></div>
</div>
The result is this:
<div name="parent">
    <div name="keep"></div>
    Some Text Here
    <span>Some Text Here</span>
</div>
I am trying to figure out how to get the contents of the removed node to stay in their original position, before the div named "keep", instead of being added after it. I have tried a few things but cannot seem to achieve this. Could anyone help?
Eef
Comments (3)
If you want to keep the order of elements, you should really ask parent for its content(). In that content collection (which is a List backed by the parent element), you should find your div and replace it with that div's content(). I don't remember the idiomatic way to do that in Python, frankly.
probably
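A minimal sketch of the "find it and replace it in the list" idea, shown on a plain Python list so it runs anywhere. The helper name splice_in_place and the string stand-ins for nodes are made up for illustration. With dom4j under Jython, parent.content() is a java.util.List, so the equivalent would presumably use indexOf() and add(index, node) on detached children; that variant is not tested here.

```python
def splice_in_place(content, wrapper, replacement):
    """Replace `wrapper` in `content` with the items of `replacement`, keeping order."""
    index = content.index(wrapper)                  # position of the node to drop
    content[index:index + 1] = list(replacement)    # slice assignment swaps one item for many
    return content


# Strings stand in for DOM nodes (hypothetical placeholders, not dom4j objects).
parent_content = ["div[divName]", "div[keep]"]
div_content = ["Some Text Here", "span"]

splice_in_place(parent_content, "div[divName]", div_content)
print(parent_content)
```

Because the replacement goes in at the wrapper's own index, everything that followed the wrapper (here the "keep" div) still follows its former contents.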
Try:
I believe it would do the trick.
I.e., after detaching each div, you should reattach each of that div's children to the div's parent.
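A rough sketch of the detach-then-reattach idea. It uses Python's standard-library xml.dom.minidom rather than dom4j (an assumption, made so the snippet runs without a JVM), and the function name detach_and_reattach is hypothetical. The point it illustrates is that appending children to the parent puts them at the end, which matches the ordering described in the question's update.

```python
from xml.dom import minidom


def detach_and_reattach(wrapper):
    """Remove `wrapper` from its parent, then append its children to that parent."""
    parent = wrapper.parentNode
    parent.removeChild(wrapper)
    # Copy the child list first: appendChild moves each node out of `wrapper`.
    for child in list(wrapper.childNodes):
        parent.appendChild(child)  # always lands at the END of the parent


reattach_doc = minidom.parseString(
    '<div name="parent">'
    '<div name="divName">Some Text Here<span>Some Text Here</span></div>'
    '<div name="keep"></div>'
    '</div>'
)
detach_and_reattach(reattach_doc.documentElement.firstChild)
print(reattach_doc.documentElement.toxml())
```

The children survive, but they end up after the "keep" div, so this approach only preserves order when the removed div was the last child.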
I had a similar problem and solved it with the following function (works fine for me).
What it does: it simply removes the parent tag and moves every element and node inside it into the parent element at that position.
Enjoy, cnsntrk
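A sketch of an "unwrap in place" function along the lines described above. It is written against Python's standard-library xml.dom.minidom instead of dom4j (an assumption, so the example runs without Jython), and the names unwrap and find_by_name are made up for illustration; this is not the answerer's original function.

```python
from xml.dom import minidom


def find_by_name(doc, fragment):
    """Return elements whose name attribute contains `fragment` (like contains(@name, ...))."""
    return [el for el in doc.getElementsByTagName("*")
            if fragment in el.getAttribute("name")]


def unwrap(element):
    """Remove `element` but keep its children at the same position in the parent."""
    parent = element.parentNode
    # Copy the child list first: insertBefore moves each node out of `element`.
    for child in list(element.childNodes):
        parent.insertBefore(child, element)  # goes right in front of the wrapper
    parent.removeChild(element)              # the now-empty wrapper is dropped


unwrap_doc = minidom.parseString(
    '<div name="parent">'
    '<div name="divName">Some Text Here<span>Some Text Here</span></div>'
    '<div name="keep"></div>'
    '</div>'
)
for match in find_by_name(unwrap_doc, "divName"):
    unwrap(match)
print(unwrap_doc.documentElement.toxml())
```

Inserting each child before the wrapper, and only then removing the wrapper, keeps the text and the span ahead of the "keep" div, which is the ordering the question asks for.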