Selenium XPath scraping of an HTML span with mixed content

Posted 2024-12-04 05:10:39 · 644 words · 2 views · 0 comments

I'm trying to scrape a span element that has mixed content:

<span id="span-id">
  <!--starts with some whitespace-->
  <b>bold title</b>
  <br/>
  text here that I want to grab....
</span>

And here's a code snippet that locates the span. It picks it up without a problem, but the text field of the web element is blank.

using OpenQA.Selenium;
using OpenQA.Selenium.Firefox;

IWebDriver driver = new FirefoxDriver();
driver.Navigate().GoToUrl("http://page-to-examine.com");
var query = driver.FindElement(By.XPath("//span[@id='span-id']"));

I've tried adding /text() to the expression, which also returns nothing. If I add /b, I do get the text content of the bolded text, which happens to be a title that I'm not interested in.

I'm sure this should be easy with a bit of XPath magic, but I haven't found it so far! Or is there a better way? Any comments gratefully received.

西瓜 2024-12-11 05:10:39

I've tried adding /text() to the expression which also returns nothing

This selects all the text-node children of the context node, and there are three of them (counting the markup without the comment).

What you refer to as "nothing" is most probably the first of these, which is a whitespace-only text node (thus you see "nothing" in it).
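
To make the "three text nodes" point concrete, here is a minimal sketch that lists the text-node children of the span. It uses Python's standard-library `xml.dom.minidom` rather than Selenium (the node structure does not depend on the client language), and the helper name `text_children` is made up for illustration. The markup is the question's snippet with the comment removed.

```python
from xml.dom import minidom

# Markup from the question, with the comment removed.
MARKUP = """<span id="span-id">
  <b>bold title</b>
  <br/>
  text here that I want to grab....
</span>"""


def text_children(markup):
    """Return the data of every text node that is a direct child of the root."""
    root = minidom.parseString(markup).documentElement
    return [n.data for n in root.childNodes if n.nodeType == n.TEXT_NODE]


nodes = text_children(MARKUP)
for position, data in enumerate(nodes, start=1):
    # repr() makes whitespace-only nodes visible
    print(position, repr(data))
```

The first two entries contain only a newline and indentation, which is why a plain /text() appears to give back "nothing"; only the third holds the wanted text.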

What you need is:

//span[@id='span-id']/text()[3]

Of course, there are other variations possible:

//span[@id='span-id']/text()[last()]

Or:

//span[@id='span-id']/br/following-sibling::text()[1]
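
The three expressions differ in how much they depend on node positions. The sketch below is not an XPath evaluation; it walks the DOM with Python's standard-library `xml.dom.minidom` to mimic what each expression selects, and it assumes the comment from the question really is part of the markup. In that case the comment splits the leading whitespace into two text nodes, so the positional variant shifts while the other two do not.

```python
from xml.dom import minidom

# The question's markup, this time with the comment left in place.
MARKUP = """<span id="span-id">
  <!--starts with some whitespace-->
  <b>bold title</b>
  <br/>
  text here that I want to grab....
</span>"""

span = minidom.parseString(MARKUP).documentElement
texts = [n for n in span.childNodes if n.nodeType == n.TEXT_NODE]

# Counterpart of text()[3]: the third text-node child (position dependent).
third = texts[2].data

# Counterpart of text()[last()]: the last text-node child.
last = texts[-1].data

# Counterpart of br/following-sibling::text()[1]:
# the first text node that follows the <br/> element.
br = span.getElementsByTagName("br")[0]
node = br.nextSibling
while node is not None and node.nodeType != node.TEXT_NODE:
    node = node.nextSibling
after_br = node.data if node is not None else ""

print(repr(third))
print(repr(last.strip()))
print(repr(after_br.strip()))
```

With the comment present, the third text child is whitespace-only, while the last() and following-sibling counterparts still return the wanted text, so those two are the safer choice when the exact markup is uncertain. Note also that Selenium's FindElement expects the XPath to resolve to an element, so an expression ending in text() generally has to be evaluated another way (for example through the JavaScript executor); that part is not shown here.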

XSLT-based verification:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="node()|@*">
     "<xsl:copy-of select="//span[@id='span-id']/text()[3]"/>"
 </xsl:template>

</xsl:stylesheet>

This transformation simply outputs whatever the XPath expression selects. When applied to the provided XML document (comment removed):

<span id="span-id">
    <b>bold title</b>
    <br/>
    text here that I want to grab....   
</span>

the wanted result is produced:

     "
    text here that I want to grab....   
"
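
As the output shows, the selected text node keeps its surrounding newline and indentation. In XPath 1.0 this can be trimmed by wrapping the expression, e.g. normalize-space(//span[@id='span-id']/text()[last()]). If the raw string has already been fetched, the same cleanup can be done client-side; the sketch below is a Python approximation of normalize-space(), with the function name chosen here for illustration.

```python
import re


def normalize_space(value: str) -> str:
    """Mimic XPath 1.0 normalize-space(): collapse runs of XML whitespace
    (space, tab, CR, LF) to one space and trim both ends."""
    collapsed = re.sub(r"[ \t\r\n]+", " ", value)
    return collapsed.strip(" ")


# The text node as it appears in the verification output above.
selected = "\n    text here that I want to grab....   \n"
print(normalize_space(selected))
```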

深爱不及久伴 2024-12-11 05:10:39

I believe the following XPath query should work for your case. The following-sibling axis is useful for what you're trying to do.

//span[@id='span-id']/br/following-sibling::text()
