How to get multiple messages?

Posted on 2025-01-25 17:58:22

I am trying to scrape multiple WhatsApp messages from the same date with the following code. However, it only returns the first message for that date (4/21/2022). For instance:

The required output is:

Hey there (message 1)

How are you? (message 2)

WBU? (message 3)

The actual output is:

Hey there (message 1)

Hey there (message 1)

Hey there (message 1)

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

day = input("Enter date: ")
month = input("Enter month: ")
year = input("Enter year: ")
date = month + "/" + day + "/" + year

driver = webdriver.Chrome()
driver.get("https://web.whatsapp.com/")

WebDriverWait(driver, 60).until(
    EC.text_to_be_present_in_element(
        (By.CLASS_NAME, '_1vjYt'), 'WhatsApp Web'
    )
)


listContact = []
with open('cont.txt', 'r') as f:
    for line in f:
        line = line.replace('\n', '')
        listContact.append(line)

for contact in listContact:
    driver.implicitly_wait(10)
    hotel = driver.find_element(By.XPATH, '//span[@title="{}"]'.format(contact))
    hotel.click()
    driver.implicitly_wait(10)

    while (driver.find_element(
            By.CSS_SELECTOR, 'div[data-pre-plain-text*="{}"]'.format(date))):
        messages = driver.find_element(
            By.CSS_SELECTOR, 'div[data-pre-plain-text*="{}"]'.format(date))
        print(messages.text)

The HTML is as follows:


<div class="_2jGOb copyable-text" data-pre-plain-text="[2:39 PM, 5/1/2022] Joseph: ">
    <div class="_1Gy50">
        <span dir="ltr" class="i0jNr selectable-text copyable-text">
            <span>
                Hey, there
            </span>
        </span>
    </div>
</div>

<div class="_2jGOb copyable-text" data-pre-plain-text="[2:40 PM, 5/1/2022] Joseph: ">
    <div class="_1Gy50">
        <span dir="ltr" class="i0jNr selectable-text copyable-text">
            <span>
                How are you?
            </span>
        </span>
    </div>
</div>

<div class="_2jGOb copyable-text" data-pre-plain-text="[2:39 PM, 5/1/2022] Joseph: ">
    <div class="_1Gy50">
        <span dir="ltr" class="i0jNr selectable-text copyable-text">
            <span>
                WBU?
            </span>
        </span>
    </div>
</div>


Comments (2)

拧巴小姐 2025-02-01 17:58:22

The last while loop would be better rewritten as:

elements = driver.find_elements(
           By.CSS_SELECTOR, 'div[data-pre-plain-text*="{}"]'.format(date))
for e in elements:
    print(e.text)

You are getting the same output because each pass through the while loop runs a new, independent search, and that search returns the first matching element every time.

紫罗兰の梦幻 2025-02-01 17:58:22

find_element (without s at the end) always finds only the first matching element on the page, no matter how many times you call it.

You have to use find_elements (with s at the end) to get all matching elements, and then iterate over them with a for loop:

css = 'div[data-pre-plain-text*="{}"]'.format(date)

elements = driver.find_elements(By.CSS_SELECTOR, css)

for e in elements:
    print(e.text)
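
One further point about the selector: `*=` is a substring match, so a date such as 1/2/2022 would also match messages stamped 11/2/2022. A minimal sketch of an exact-date check follows. It assumes the attribute always has the layout shown in the question's HTML ("[time, date] sender: "), which may differ by locale; the names `PRE_PLAIN_TEXT`, `parse_pre_plain_text`, and `is_from_date` are hypothetical helpers, not part of Selenium.

```python
import re

# Assumed layout, taken from the sample HTML: "[2:39 PM, 5/1/2022] Joseph: "
PRE_PLAIN_TEXT = re.compile(
    r"^\[(?P<time>[^,\]]+),\s*(?P<date>[^\]]+)\]\s*(?P<sender>.*?):\s*$"
)


def parse_pre_plain_text(value):
    """Split a data-pre-plain-text value into (time, date, sender), or None."""
    match = PRE_PLAIN_TEXT.match(value)
    if match is None:
        return None
    return (
        match.group("time").strip(),
        match.group("date").strip(),
        match.group("sender").strip(),
    )


def is_from_date(value, date):
    """True only if the date field equals `date` exactly, not as a substring."""
    parsed = parse_pre_plain_text(value)
    return parsed is not None and parsed[1] == date
```

With Selenium, this could be applied to each element returned by find_elements, for example by printing `e.text` only when `is_from_date(e.get_attribute("data-pre-plain-text"), date)` is true.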