Selenium: getting the text of an element that has nested tags inside it

Posted on 2025-02-13 02:12:22

Let's say I have an element:

<div class="ProductVariants__PriceContainer-sc-1unev4j-9 jjiIua">
    ₹199 
    <span class="ProductVariants__MRPText-sc-1unev4j-10 jEinXG">
        ₹690
    </span>
    <div class="Product__Dicount">
        No discount available for this product
    </div>
</div>

When I fetch the element by class name:

div_containing_radio = driver.find_element(by=By.XPATH, value="//div[starts-with(@class, 'ProductVariants__RadioButtonInner')]//ancestor::div[starts-with(@class, 'ProductVariants__VariantCard')]")
div_containing_radio.find_element(by=By.CSS_SELECTOR, value=".ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua").text

This gives me

'₹199 ₹690 No discount available for this product'

What I wanted was just ₹199.

Note that I can't just format the text and take the first item after splitting on spaces, because the structure of the page keeps changing.

Comments (4)

梦幻之岛 2025-02-20 02:12:22

Using a little bit of JS:

js_query = """
            var x = document.querySelector('.ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua').childNodes;
            var l = "";
    
            x.forEach(i => {
                if (i.nodeName === '#text') {
                    l += ' ' + i.textContent;
                }
            });
            return l;
"""

price = driver.execute_script(js_query).strip()
print(price)

Output:

₹199

The JS fetches all child nodes of the target div element, iterates over them, and appends the textContent of the text nodes only to the string variable l. We return l from JS and strip the surrounding whitespace in Python. That's it.
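The same idea (keep only the text that sits directly under the target element) can also be sketched in plain Python, without running JavaScript in the browser. This is only an illustration under assumptions: `DirectTextParser` and `direct_text` are hypothetical names, and the HTML string is assumed to come from something like `element.get_attribute("outerHTML")` on the price container.

```python
from html.parser import HTMLParser


class DirectTextParser(HTMLParser):
    """Collect text that sits directly under the outermost element."""

    # Void elements have no closing tag, so they must not change the depth.
    VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
            "link", "meta", "source", "track", "wbr"}

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.VOID:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in self.VOID:
            self.depth -= 1

    def handle_data(self, data):
        # depth == 1: inside the root element but not inside any child tag.
        if self.depth == 1 and data.strip():
            self.chunks.append(data.strip())


def direct_text(outer_html):
    """Return the root element's own text, ignoring text in nested tags."""
    parser = DirectTextParser()
    parser.feed(outer_html)
    parser.close()
    return " ".join(parser.chunks)


sample = """
<div class="ProductVariants__PriceContainer-sc-1unev4j-9 jjiIua">
    ₹199
    <span class="ProductVariants__MRPText-sc-1unev4j-10 jEinXG">
        ₹690
    </span>
    <div class="Product__Dicount">
        No discount available for this product
    </div>
</div>
"""

print(direct_text(sample))  # ₹199
```

Because the filter is based on nesting depth rather than on position in the string, it should keep working if the nested span or div is reordered or removed.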

小草泠泠 2025-02-20 02:12:22

@Firelord's answer (+1) can be simplified to:

div_containing_radio = driver.find_element(by=By.XPATH, value="//div[starts-with(@class, 'ProductVariants__RadioButtonInner')]//ancestor::div[starts-with(@class, 'ProductVariants__VariantCard')]")
price = div_containing_radio.find_element(by=By.CSS_SELECTOR, value=".ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua")

print(driver.execute_script("return arguments[0].firstChild.textContent;", price).strip())
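The two answers above can be combined into a small helper: pass the already-located element as arguments[0] and join only its direct text nodes, so the result does not depend on the price being the very first child node. A minimal sketch; `own_text` and `OWN_TEXT_JS` are hypothetical names, and `driver` and `element` are assumed to be a Selenium WebDriver and a WebElement located as in the snippets above.

```python
# JavaScript executed in the browser: keep only the element's direct text nodes.
OWN_TEXT_JS = """
return Array.from(arguments[0].childNodes)
    .filter(node => node.nodeType === Node.TEXT_NODE)
    .map(node => node.textContent.trim())
    .filter(text => text.length > 0)
    .join(' ');
"""


def own_text(driver, element):
    """Return the text that belongs to `element` itself, excluding child tags."""
    return driver.execute_script(OWN_TEXT_JS, element).strip()
```

With the variables from the answer above, usage would look like `print(own_text(driver, price))`.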

萧瑟寒风 2025-02-20 02:12:22

To print only 199 from the string ₹199 ₹690 No discount available for this product, you just need to split the entire string on the ₹ character and print the second element (stripping the trailing space), as follows:

print(div_containing_radio.find_element(by=By.CSS_SELECTOR, value=".ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua").text.split("₹")[1].strip())

As an alternative, you can split the string on the space character and print the first element, as follows:

print(div_containing_radio.find_element(by=By.CSS_SELECTOR, value=".ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua").text.split(" ")[0])
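If the surrounding text varies, a regular expression is a slightly more tolerant variant of the split approach. The sketch below is an illustration, not part of the original answer: `first_price` and `PRICE_RE` are hypothetical names, and it still assumes that the selling price is the first rupee amount in the element's text.

```python
import re

# Matches a rupee amount such as "₹199", "₹ 1,299" or "₹99.50".
PRICE_RE = re.compile(r"₹\s*(\d[\d,]*(?:\.\d+)?)")


def first_price(text):
    """Return the first rupee amount in `text` as a string, or None."""
    match = PRICE_RE.search(text)
    return match.group(1) if match else None


print(first_price("₹199 ₹690 No discount available for this product"))  # 199
```

Returning None when nothing matches lets the caller notice a changed page layout instead of failing with an IndexError.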

夢归不見 2025-02-20 02:12:22

Try this:

div_containing_radio = driver.find_element(by=By.XPATH, value="//div[starts-with(@class, 'ProductVariants__PriceContainer-sc-1unev4j-9 jjiIua')]/following-sibling::text()")