How to get all text inside a div with XPath

Posted on 2025-02-08 18:32:50

I want to get all the text inside a div using XPath.

Here is the HTML code:

<div class="JobDescriptionsc__DescriptionContainer-sc-1jylha1-2 dGyoDf">
 <div class="DraftEditorContainersc__DraftEditorContainer-sc-1x4uima-0 cGUaQf">
  <div class="DraftEditor-root">
   <div class="DraftEditor-editorContainer">
    <div class="public-DraftEditor-content" contenteditable="false" spellcheck="false" style="outline:none;user-select:text;-webkit-user-select:text;white-space:pre-wrap;word-wrap:break-word">
     <div data-contents="true">
        <!-- all the text I want is in here -->
        <div class="" data-block="true" data-editor="d54la" data-offset-key="bhkoa-0-0">
         <div data-offset-key="bhkoa-0-0" class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr">
          <span data-offset-key="bhkoa-0-0" style="font-weight:bold">
           <span data-text="true">Job Description:</span>
          </span>
         </div>
        </div>
        <div class="" data-block="true" data-editor="d54la" data-offset-key="51e5u-0-0">
         <div data-offset-key="51e5u-0-0" class="public-DraftStyleDefault-block public-DraftStyleDefault-ltr">
          <span data-offset-key="51e5u-0-0">
           <span data-text="true">· Identify &amp; developed application base on predefined business requirements.</span>
          </span>
         </div>
        </div>
        ...
        <!-- there's more; I'm only showing a few blocks -->
     </div>
    </div>
   </div>
  </div>
 </div>
</div>

This is my XPath code:

dom_job.xpath('//*[@class="DraftEditorContainersc__DraftEditorContainer-sc-1x4uima-0 cGUaQf"]//text()') 

I need all the text inside the parent div using XPath. Is that possible?

Comments (1)

倒数 2025-02-15 18:32:50

I'm assuming the Python module that provides your XPath interpreter supports XPath 1.0. Your XPath expression, repeated below, returns the set of all text nodes that are descendants of the div element:

//*[@class="DraftEditorContainersc__DraftEditorContainer-sc-1x4uima-0 cGUaQf"]//text()

You should be able to iterate over that collection of text nodes in Python and concatenate them into a single string.
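A minimal sketch of that iterate-and-join approach, assuming `dom_job` is an lxml element (the question does not say which library it comes from, so lxml is a guess). The `SAMPLE` markup is a trimmed-down stand-in for the HTML in the question, and the names `CONTAINER`, `lines` and `job_text` are only illustrative:

```python
from lxml import html  # assumption: lxml is the library behind dom_job.xpath

# Trimmed-down stand-in for the markup in the question.
SAMPLE = """
<div class="DraftEditorContainersc__DraftEditorContainer-sc-1x4uima-0 cGUaQf">
  <div data-contents="true">
    <div data-block="true"><span><span data-text="true">Job Description:</span></span></div>
    <div data-block="true"><span><span data-text="true">· Identify &amp; developed application base on predefined business requirements.</span></span></div>
  </div>
</div>
"""

dom_job = html.fromstring(SAMPLE)
CONTAINER = '//*[@class="DraftEditorContainersc__DraftEditorContainer-sc-1x4uima-0 cGUaQf"]'

# //text() returns every descendant text node, including whitespace-only
# nodes that come from the indentation between tags.
text_nodes = dom_job.xpath(CONTAINER + '//text()')

# Drop the whitespace-only nodes and join the rest, one block per line.
lines = [node.strip() for node in text_nodes if node.strip()]
job_text = "\n".join(lines)
print(job_text)
```

Joining in Python rather than in XPath keeps control over the separator, which helps when each block should end up on its own line.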

It's simpler, though, if you want the concatenated value of all the text nodes within a particular div, to apply the XPath string() function to the div itself, e.g.:

string(//*[@class="DraftEditorContainersc__DraftEditorContainer-sc-1x4uima-0 cGUaQf"])

See https://www.w3.org/TR/1999/REC-xpath-19991116/#function-string
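As a rough illustration of the `string()` approach, again assuming lxml: the class name `box` and the sample markup are placeholders, not taken from the question. Because `string()` keeps the whitespace between tags, the sketch also shows the standard XPath 1.0 `normalize-space()` function, which collapses it:

```python
from lxml import html  # assumption: lxml provides the XPath 1.0 engine

# Placeholder markup; "box" is an illustrative class name.
dom_job = html.fromstring(
    '<div class="box">\n  <p>Job Description:</p>\n  <p>Build things.</p>\n</div>'
)

# string() returns one string: every descendant text node of the div,
# concatenated in document order, whitespace included.
raw = dom_job.xpath('string(//*[@class="box"])')

# normalize-space() trims the ends and collapses whitespace runs.
tidy = dom_job.xpath('normalize-space(//*[@class="box"])')

print(repr(raw))
print(tidy)  # Job Description: Build things.
```

Here the result is a single string rather than a list, so no joining is needed on the Python side.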

Note that, in XPath 1.0, if you apply the string() function to a node-set containing more than one node (such as the set of text nodes returned by your first query), the function returns the string value of only the first node in document order.
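The difference is easy to see in a small sketch, assuming lxml once more; the markup and the class name `box` are made up for the example:

```python
from lxml import html  # assumption: lxml provides the XPath 1.0 engine

# Placeholder markup; "box" is an illustrative class name.
dom = html.fromstring('<div class="box"><p>first</p><p>second</p></div>')

# string() over a node-set of text nodes: only the first node is used.
first_only = dom.xpath('string(//*[@class="box"]//text())')

# string() over the div element: all descendant text is concatenated.
whole = dom.xpath('string(//*[@class="box"])')

print(first_only)  # first
print(whole)       # firstsecond
```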
