How to get multiple messages?
I am trying to scrape multiple WhatsApp messages from the same date with the following code. However, it only returns the first message from that date (4/21/2022). For instance, the required output should be:
Hey there (message 1)
How are you? (message 2)
WBU? (message 3)
The resulting output is:
Hey there (message 1)
Hey there (message 1)
Hey there (message 1)
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

day = input("Enter date: ")
month = input("Enter month: ")
year = input("Enter year: ")
date = month + "/" + day + "/" + year

driver = webdriver.Chrome()
driver.get("https://web.whatsapp.com/")
WebDriverWait(driver, 60).until(
    EC.text_to_be_present_in_element(
        (By.CLASS_NAME, '_1vjYt'), 'WhatsApp Web'
    )
)

listContact = []
with open('cont.txt', 'r') as f:
    for line in f:
        line = line.replace('\n', '')
        listContact.append(line)

for contact in listContact:
    driver.implicitly_wait(10)
    hotel = driver.find_element(By.XPATH, '//span[@title="{}"]'.format(contact))
    hotel.click()
    driver.implicitly_wait(10)
    while (driver.find_element(
            By.CSS_SELECTOR, 'div[data-pre-plain-text*="{}"]'.format(date))):
        messages = driver.find_element(
            By.CSS_SELECTOR, 'div[data-pre-plain-text*="{}"]'.format(date))
        print(messages.text)
The HTML is as follows:
<div class="_2jGOb copyable-text" data-pre-plain-text="[2:39 PM, 5/1/2022] Joseph: ">
    <div class="_1Gy50">
        <span dir="ltr" class="i0jNr selectable-text copyable-text">
            <span>
                Hey, there
            </span>
        </span>
    </div>
</div>
<div class="_2jGOb copyable-text" data-pre-plain-text="[2:40 PM, 5/1/2022] Joseph: ">
    <div class="_1Gy50">
        <span dir="ltr" class="i0jNr selectable-text copyable-text">
            <span>
                How are you?
            </span>
        </span>
    </div>
</div>
<div class="_2jGOb copyable-text" data-pre-plain-text="[2:39 PM, 5/1/2022] Joseph: ">
    <div class="_1Gy50">
        <span dir="ltr" class="i0jNr selectable-text copyable-text">
            <span>
                WBU?
            </span>
        </span>
    </div>
</div>
Comments (2)
You are getting the same output because each pass through the body of the while loop starts a new, independent search, which finds the same first element again. The last while() loop would be better rewritten to iterate over all matching elements.
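A minimal sketch of such a rewrite, assuming the selector from the question still matches the page. The helper name messages_for_date is hypothetical, and the string "css selector" is the value of Selenium's By.CSS_SELECTOR, written out here so the snippet needs no import; in the question's script you would pass the existing driver and could use By.CSS_SELECTOR directly.

```python
CSS_SELECTOR = "css selector"  # same string as selenium's By.CSS_SELECTOR


def messages_for_date(driver, date):
    """Return the text of every message whose metadata attribute contains `date`.

    `driver` is expected to behave like a Selenium WebDriver, i.e. to offer
    find_elements(by, value) returning a list of elements with a .text attribute.
    """
    selector = 'div[data-pre-plain-text*="{}"]'.format(date)
    # find_elements returns a list of all matches (empty if there are none),
    # so the loop visits each message exactly once and then ends.
    return [element.text for element in driver.find_elements(CSS_SELECTOR, selector)]
```

In the question's script, the while block would then become `for text in messages_for_date(driver, date): print(text)`. Unlike the original while condition, which stays truthy as long as one match exists, this loop terminates on its own.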
find_element (without the s at the end) always finds only the first matching element on the page, no matter how many times you call it. You have to use find_elements (with the s at the end) to get all the elements, and then use a for loop to go through them.
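The difference between "first match" and "all matches" can be checked offline against the HTML from the question. The sketch below is only an illustration: it uses Python's standard html.parser rather than Selenium, and the class name MessageCollector and the SAMPLE_HTML constant are made up for this example.

```python
from html.parser import HTMLParser

# The three message bubbles from the question, condensed onto fewer lines.
SAMPLE_HTML = """
<div class="_2jGOb copyable-text" data-pre-plain-text="[2:39 PM, 5/1/2022] Joseph: ">
  <div class="_1Gy50"><span><span>Hey, there</span></span></div>
</div>
<div class="_2jGOb copyable-text" data-pre-plain-text="[2:40 PM, 5/1/2022] Joseph: ">
  <div class="_1Gy50"><span><span>How are you?</span></span></div>
</div>
<div class="_2jGOb copyable-text" data-pre-plain-text="[2:39 PM, 5/1/2022] Joseph: ">
  <div class="_1Gy50"><span><span>WBU?</span></span></div>
</div>
"""


class MessageCollector(HTMLParser):
    """Collect the text of every div whose data-pre-plain-text contains `date`."""

    def __init__(self, date):
        super().__init__()
        self.date = date
        self.depth = 0       # > 0 while the parser is inside a matching div
        self.messages = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            self.depth += 1  # nested div inside a matching message
        elif self.date in (dict(attrs).get("data-pre-plain-text") or ""):
            self.depth = 1
            self.messages.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.messages[-1] += data.strip()


collector = MessageCollector("5/1/2022")
collector.feed(SAMPLE_HTML)
all_matches = collector.messages   # analogous to what find_elements returns
first_match = all_matches[0]       # analogous to what find_element returns each time

for text in all_matches:
    print(text)
```

Printing first_match three times reproduces the repeated output from the question, while looping over all_matches prints each message once.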