Get text() in XPath
I have the following DOM structure / HTML, and I want to get (just practicing...) the marked data: the text under the h2 element. The div[@class="coordsAgence"] element has some more div children below it and some more h2 elements, so doing:
div[@class="coordsAgence"]
will get that value, but with additional unneeded text.
UPDATE: The value (from this example) that I basically want is the "GALLIER Dennis" text.
Comments (3)
It seems you want the first text node in that div; selecting the first text() child of the div should do it.
Note that this assumes that there is actually no whitespace between those comments inside <div class="coordsAgence">; otherwise that whitespace will constitute additional text nodes that you'll have to account for.
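The whitespace caveat above can be illustrated with a minimal sketch. It assumes hypothetical markup (the question's real HTML is not shown on this page) and uses Python's standard xml.etree.ElementTree. That module's XPath subset has no text() node test, so the helper child_text_nodes, a name made up for this example, rebuilds the div's direct text children from .text and .tail. In a full XPath 1.0 engine the corresponding expression would plausibly be //div[@class="coordsAgence"]/text()[1]. ElementTree also drops comments by default, so the markup here contains none.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup; only the class name and "GALLIER Dennis" come from the question.
compact = (
    "<body><div class='coordsAgence'><h2>Agence</h2>"
    "GALLIER Dennis<div>more</div></div></body>"
)
indented = """<body>
<div class='coordsAgence'>
  <h2>Agence</h2>
  GALLIER Dennis
  <div>more</div>
</div>
</body>"""


def child_text_nodes(elem):
    """Return the direct text children of elem in document order."""
    pieces = [elem.text] + [child.tail for child in elem]
    return [p for p in pieces if p]  # drop None and empty strings


def first_text(markup, skip_blank=False):
    """Return the first text child of the coordsAgence div, or None."""
    div = ET.fromstring(markup).find(".//div[@class='coordsAgence']")
    nodes = child_text_nodes(div)
    if skip_blank:
        # Roughly what text()[normalize-space()][1] would select in XPath.
        nodes = [n.strip() for n in nodes if n.strip()]
    return nodes[0] if nodes else None


print(repr(first_text(compact)))                    # 'GALLIER Dennis'
print(repr(first_text(indented)))                   # whitespace only
print(repr(first_text(indented, skip_blank=True)))  # 'GALLIER Dennis'
```

With the indented markup the first text node is just the line break and indentation before the h2, which is exactly the extra node the note warns about. Skipping blank nodes is one way to account for it.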
Get the first text node following the first h2 in the div with class "coordsAgence".
Note that such an expression returns the first text node after the first h2 even when some other node appears between the two. If you want to return the text only when it's the node that immediately follows the first h2, then try something that takes the first following sibling node and checks that it is a text node.
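The difference between the two variants can be sketched as follows. The answer's own expressions are missing from this page, so the XPath strings in the comments are plausible XPath 1.0 forms rather than the author's text, and the markup is hypothetical. The sketch uses the standard xml.etree.ElementTree, which cannot evaluate the following-sibling axis, so the function first_text_after (a name made up for this example) emulates it with .tail, the text that directly follows an element's closing tag.

```python
import xml.etree.ElementTree as ET

# Plausible XPath 1.0 equivalents (assumptions, not taken from the original answer):
#   any distance:  //div[@class="coordsAgence"]/h2[1]/following-sibling::text()[1]
#   immediate:     //div[@class="coordsAgence"]/h2[1]/following-sibling::node()[1][self::text()]


def first_text_after(div, tag="h2", immediate=False):
    """Return the first text node after the first <tag> child of div, or None."""
    children = list(div)
    first = div.find(tag)  # first direct child with that tag
    if first is None:
        return None
    if immediate:
        # The tail is empty when another element follows right away.
        return first.tail or None
    # Otherwise walk the first h2 and its later siblings until one has trailing text.
    for sibling in children[children.index(first):]:
        if sibling.tail:
            return sibling.tail
    return None


# Hypothetical markup: text directly after the h2, and text separated by a <br/>.
direct = ET.fromstring(
    "<div class='coordsAgence'><h2>Agence</h2>GALLIER Dennis<h2>Other</h2>x</div>"
)
gap = ET.fromstring(
    "<div class='coordsAgence'><h2>Agence</h2><br/>GALLIER Dennis</div>"
)

print(first_text_after(direct))                  # GALLIER Dennis
print(first_text_after(gap))                     # GALLIER Dennis
print(first_text_after(gap, immediate=True))     # None
```

In the second fragment a br element sits between the h2 and the text, so only the non-immediate variant returns it. Whitespace-only text still counts as a text node here, as it would in XPath.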
Using Python/Scrapy, the text of a tag such as h1 can be retrieved the same way, by selecting its text() node.