How to scrape Google "People also ask" questions and answers with Selenium and Python, beyond the number Google shows by default?

Posted on 2025-01-28 23:07:28 · Word count 194 · Views 1 · Comments 0

I found a good solution, but it only handles the number of questions and answers that Google shows by default, and I need more.

I am a novice Python developer. How do I get more questions and answers? Do I have to click first to expand the required number and then parse them?

Comments (1)

掩于岁月 2025-02-04 23:07:28

The following code parses the questions that appear on screen, then asks whether you want to parse more. If you enter y, it clicks on the last question so that more questions are loaded into the page. The questions are stored in the list questions and the answers in the list answers.

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.service import Service

your_path = '...'
driver = webdriver.Chrome(service=Service(your_path))

driver.get('https://www.google.com/search?q=How%20to%20make%20bakery%3F&source=hp&ei=j0aZYYjRAvja2roPrcWcyAU&iflsig=ALs-wAMAAAAAYZlUn4NMUPjfIpQmrXSmjIDnaWjJXWIJ&ved=0ahUKEwjI1JDn0Kf0AhV4rVYBHa0iB1kQ4dUDCAc&uact=5&oq=How%20to%20make%20bakery%3F&gs_lcp=Cgdnd3Mtd2l6EAMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBMyBAgAEBNQAFgAYJMDaABwAHgAgAF-iAF-kgEDMC4xmAEAoAECoAEB&sclient=gws-wiz')

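questions, answers = [], []
while True:
    # Re-query on every pass: clicking a question makes Google append more of them.
    elements = driver.find_elements(By.CSS_SELECTOR, "div[id*='RELATED_QUESTION']")
    if not elements:
        print('No related questions found on the page.')
        break
    for element in elements[len(questions):]:  # skip already parsed questions
        questions.append(element.text)
        # A question can hold several answer blocks; concatenate their text.
        answers.append(''.join(
            answer.get_attribute('innerText')
            for answer in element.find_elements(By.CSS_SELECTOR, "div[id*='WEB_ANSWERS_RESULT']")
        ))
    inp = input(f'{len(questions)} questions parsed, continue? (y/n) ')
    if inp.strip().lower() != 'y':
        break
    elements[-1].click()  # expanding the last question loads more questions
    time.sleep(2)  # crude fixed wait for the new questions to render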
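
If you already know how many questions you need, the prompt can be replaced by a target count. The following is only a sketch of that idea, not part of the original answer: the names collect_questions, target, timeout and poll are made up for illustration, and the two CSS selectors are simply the ones used above, which may not match Google's current markup. To stay free of non-standard imports it passes the locator strategy as the plain string "css selector", which is the value of By.CSS_SELECTOR, and it polls for new questions instead of sleeping a fixed two seconds.

```python
import time

# Selectors copied from the answer above; they may need updating.
QUESTION_CSS = "div[id*='RELATED_QUESTION']"
ANSWER_CSS = "div[id*='WEB_ANSWERS_RESULT']"
CSS = "css selector"  # string value of selenium's By.CSS_SELECTOR


def collect_questions(driver, target, timeout=5.0, poll=0.25):
    """Return up to `target` (question, answer) pairs (hypothetical helper)."""
    pairs = []
    while len(pairs) < target:
        elements = driver.find_elements(CSS, QUESTION_CSS)
        # Parse only the elements not seen yet, and never more than `target`.
        for element in elements[len(pairs):target]:
            answer = ''.join(
                a.get_attribute('innerText') or ''
                for a in element.find_elements(CSS, ANSWER_CSS)
            )
            pairs.append((element.text, answer))
        if len(pairs) >= target or not elements:
            break
        # Clicking the last question makes the page load more of them.
        elements[-1].click()
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if len(driver.find_elements(CSS, QUESTION_CSS)) > len(elements):
                break  # new questions appeared, go parse them
            time.sleep(poll)
        else:
            break  # nothing new loaded before the timeout, stop here
    return pairs
```

With the driver from the code above already on the results page, a call such as collect_questions(driver, target=20) would return a list of (question, answer) tuples, stopping early if the page stops loading new questions.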