Get the text of the second element with XPath?
<span class='python'>
<a>google</a>
<a>chrome</a>
</span>
I want to get chrome, and I already have it working like this:
q = item.findall('.//span[@class="python"]//a')
t = q[1].text # first element = 0
I'd like to combine this into a single XPath expression and get just one item instead of a list.
I tried the following, but it doesn't work:
t = item.findtext('.//span[@class="python"]//a[2]') # first element = 1
The actual HTML, as opposed to the simplified version above, looks like this:
<span class='python'>
<span>
<span>
<img></img>
<a>google</a>
</span>
<a>chrome</a>
</span>
</span>
Comments (3)
This is a FAQ about the // abbreviation.
.//a[2] means: select all a descendants of the current node that are the second a child of their parent. So it may select more than one element, or none at all, depending on the concrete XML document.
To put it more simply, the [] operator has higher precedence than //.
If you want just one (the second) of all the nodes returned, you have to use parentheses to force the precedence you want:
(.//a)[2]
This really selects the second a descendant of the current node.
For the actual expression used in the question, the same parenthesization gives:
(.//span[@class="python"]//a)[2]
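The precedence difference can be checked with a minimal sketch using only the standard library. The three-parent document below is hypothetical, made up for illustration. ElementTree implements only a subset of XPath and, as far as I know, does not accept the parenthesized form, so indexing the result list stands in for (.//a)[2] here; with lxml, the parenthesized expression could instead be passed to the element's xpath() method.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one parent with a single <a>, two parents with two.
doc = ET.fromstring(
    "<root>"
    "<p><a>a1</a></p>"
    "<p><a>b1</a><a>b2</a></p>"
    "<p><a>c1</a><a>c2</a></p>"
    "</root>"
)

# .//a[2]: every <a> descendant that is the second <a> child of its parent.
per_parent = [a.text for a in doc.findall(".//a[2]")]

# Stand-in for (.//a)[2]: collect all <a> descendants first, then index.
all_a = doc.findall(".//a")
second_overall = all_a[1].text if len(all_a) > 1 else None

print(per_parent)      # ['b2', 'c2']
print(second_overall)  # b1
```

The predicate form returns one element per parent that has at least two a children, while the list-then-index form returns the second a in document order.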
I'm not sure what the problem is...
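From the comments:
You are right. What does .//span[@class="python"]//a[2] mean? It expands to:
self::node()/descendant-or-self::node()/child::span[attribute::class="python"]/descendant-or-self::node()/child::a[position()=2]
So it ends up selecting the second a child of each parent (fn:position() is evaluated along the child axis). Nothing is selected for a document like the actual HTML in the question, where no element has two a children. If you want the second of all descendants, the position test has to apply to the descendants as a whole rather than to each parent's children.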
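As a rough check against the question's actual HTML, here is a sketch using the standard library's xml.etree.ElementTree. The enclosing div and the variable names are assumptions added for illustration, and since ElementTree has no descendant:: axis, itertools.islice over iterfind() is used to pick the second match in document order.

```python
import xml.etree.ElementTree as ET
from itertools import islice

# The question's actual HTML, wrapped in a hypothetical <div> root so that
# the span is a descendant of the element we search from.
item = ET.fromstring("""
<div>
<span class='python'>
<span>
<span>
<img></img>
<a>google</a>
</span>
<a>chrome</a>
</span>
</span>
</div>
""")

path = './/span[@class="python"]//a'

# a[2] asks for the second <a> child of a parent; no parent here has two.
per_parent = item.findtext(path + "[2]")

# Second <a> among all matching descendants, in document order.
second = next(islice(item.iterfind(path), 1, 2), None)
second_text = second.text if second is not None else None

print(per_parent)   # None
print(second_text)  # chrome
```

This returns a single element (or None) rather than a list, which is what the question asked for, at the cost of doing the positional step in Python instead of inside the path expression.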