How do I extract multiple texts from span elements using Python Selenium?

Posted 2025-01-26 16:14:39

I am trying to use Selenium WebDriver to extract the text of every inner span in the HTML below into a list like this:

['1a', '1b', '1c', '2a', ' ', ' ', '3a', '3b', '3c', '4a', ' ', ' ']

Does anyone know how to do this?

HTML:

<tr style="background-color:#999">
    <td><b style="white-space: nowrap;">table_num</b></td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>1a</span>
                <span>1b</span>
                <span>1c</span>
            </span>
        </td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>2a</span>
                <span>　　　　　</span>
                <span>　　　　　</span>
            </span>
        </td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>3a</span>
                <span>3b</span>
                <span>3c</span>
            </span>
        </td>
        <td style="text-align:center;">
            <span style="flex: 1;display: flex;flex-direction: column;">
                <span>4a</span>
                <span>　　　　　</span>
                <span>　　　　　</span>
            </span>
        </td>
</tr>

混吃等死 2025-02-02 16:14:39

Use the XPath below, which matches all the required spans:

//span[contains(@style,"column")]/span

Once you have all the spans, extract the text from each one.

If a span's text is empty, skip it; otherwise add it to the list.
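The steps above can be sketched roughly as follows. This is a minimal sketch, not the answerer's own code: `texts_from_spans` and `extract_span_texts` are hypothetical helper names, Chrome is assumed as the browser, and the URL is whatever page contains the table. The `keep_blank` flag is an added convenience so blank spans can be either skipped or kept.

```python
SPAN_XPATH = '//span[contains(@style,"column")]/span'


def texts_from_spans(spans, keep_blank=False):
    """Return the .text of each element, optionally skipping blank ones."""
    texts = []
    for span in spans:
        text = span.text
        # str.strip() also removes full-width spaces (U+3000),
        # so whitespace-only spans count as blank here.
        if keep_blank or text.strip():
            texts.append(text)
    return texts


def extract_span_texts(url, keep_blank=False):
    """Open `url` in Chrome (an assumption) and collect the inner span texts."""
    # Imported here so texts_from_spans can be tried without a browser installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        spans = driver.find_elements(By.XPATH, SPAN_XPATH)
        return texts_from_spans(spans, keep_blank=keep_blank)
    finally:
        driver.quit()
```

Calling `extract_span_texts(url, keep_blank=True)` keeps one list entry per span, which is closer to the list shown in the question.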

独闯女儿国 2025-02-02 16:14:39

Based on the HTML, to extract the text of all the <span> elements into a list you need to use WebDriverWait (https://stackoverflow.com/a/59130336/7429447) with visibility_of_all_elements_located(), and then build the list with a list comprehension. You can use either of the following locator strategies:

Using CSS_SELECTOR and text attribute:

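# Needs WebDriverWait, expected_conditions (as EC) and By imported from selenium.
driver.get("application url")
# The data cells are siblings of the label cell, so the inner spans are matched with td > span > span.
print([my_elem.text for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "tr[style^='background'] > td > span > span")))])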

Using XPATH and get_attribute("innerHTML"):

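# Needs WebDriverWait, expected_conditions (as EC) and By imported from selenium.
driver.get("application url")
# The data cells are siblings of the label cell, so the inner spans are matched with /td/span/span.
print([my_elem.get_attribute("innerHTML") for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//tr[starts-with(@style, 'background')]/td/span/span")))])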
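For completeness, here is a hedged sketch of the wait-based approach with its imports spelled out. `normalize_blank` and `wait_for_span_texts` are hypothetical names, not part of Selenium. The XPath uses /td/span/span on the assumption that the data cells in the question's row are siblings of the label cell. The sketch also assumes that innerHTML returns the blank spans as runs of full-width spaces, so it collapses them to a single space to resemble the list in the question.

```python
def normalize_blank(text):
    """Collapse whitespace-only text (including full-width U+3000) to one space."""
    return text if text.strip() else " "


def wait_for_span_texts(driver, timeout=20):
    """Wait until the inner spans are visible, then return their normalized innerHTML."""
    # Imported here so normalize_blank can be tried without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, "//tr[starts-with(@style, 'background')]/td/span/span")
    spans = WebDriverWait(driver, timeout).until(
        EC.visibility_of_all_elements_located(locator)
    )
    return [normalize_blank(span.get_attribute("innerHTML")) for span in spans]
```

`wait_for_span_texts` expects a driver that has already loaded the page with `driver.get(...)`. If no matching elements become visible within `timeout` seconds, `WebDriverWait.until` raises a `TimeoutException`.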

沫尐诺 2025-02-02 16:14:39

Just remove the predicate [1] from your XPath, so that it becomes:

//td[contains(.,'table_num')]/following-sibling::td

And to be more precise, you could use:

//td[contains(.,'table_num')]/following-sibling::td/span/span
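As an illustration of how this XPath might be used from Python, the sketch below builds it for a given row label and collects the span texts. `row_span_xpath` and `row_span_texts` are hypothetical helpers, and the sketch assumes the label contains no single quote, since the label is placed inside a single-quoted XPath string literal.

```python
def row_span_xpath(label):
    """Build the XPath for the inner spans of the row whose first cell contains `label`."""
    if "'" in label:
        # The label is embedded in a single-quoted XPath literal.
        raise ValueError("label must not contain a single quote")
    return f"//td[contains(.,'{label}')]/following-sibling::td/span/span"


def row_span_texts(driver, label):
    """Return the text of every inner span in the row identified by `label`."""
    # "xpath" is the value of Selenium's By.XPATH constant; the plain string is
    # used so this sketch can also be exercised with a stand-in driver object.
    elements = driver.find_elements("xpath", row_span_xpath(label))
    return [element.text for element in elements]
```

With a real WebDriver instance, `row_span_texts(driver, "table_num")` would query the row from the question.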

Question posted by 鱼忆七猫命九
