I get strange results in my CSV file after web scraping with Selenium

Posted 2025-02-02 11:26:10

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By

import time
import csv


driver = webdriver.Chrome('/Users/myname/Desktop/web_crawling/chromedriver')

driver.get('https://www.naver.com')
time.sleep(2)


driver.find_element(by=By.CSS_SELECTOR, value='a.nav.shop').click()

search = driver.find_element(by=By.CSS_SELECTOR,value='._searchInput_search_input_QXUFf')
search.click()

search.send_keys("아이폰 13")
search.send_keys(Keys.ENTER)

before_h = driver.execute_script("return window.scrollY")

while True:
    driver.find_element(by=By.CSS_SELECTOR, value='body').send_keys(Keys.END)
    time.sleep(1)

    after_h = driver.execute_script("return window.scrollY")

    if after_h == before_h:
        break
    before_h = after_h

#create csv file
f = open(r"/Users/yungijeong/Desktop/web_crawling/data.csv", 'w', encoding='UTF8')
csvWriter = csv.writer(f)

items = driver.find_elements(by=By.CSS_SELECTOR, value=".basicList_info_area__17Xyo")

for item in items:
    names = item.find_elements(by=By.CSS_SELECTOR,  value=".basicList_link__1MaTN")
    for name in names:
        print(name.text)

    try:
        prices = item.find_elements(by=By.CSS_SELECTOR, value=".price_num__2WUXn")
        for price in prices:
           print(price.text)
    except:
        print("판매중단")
    links = item.find_elements(by=By.CSS_SELECTOR, value=".basicList_title__3P9Q7 > a")
    for link in links:
        print(link.get_attribute('href'))
    print(name, price, link)

    #adding inside the csv files

    csvWriter.writerow([name, price, link])

f.close()

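Here, I am trying to scrape the details and prices of iPhones from a Korean shopping website. I wrote the code so that the webdriver automatically opens the site and collects all the details (such as the prices and the links to the products). At the end, it is supposed to create a CSV file and write all the scraped data into it.

The code runs without errors, but when I export the data to a CSV file, it looks like this (screenshot: the result in CSV).

The contents do not seem to have been exported properly. Every single entry looks like HTML code. In the terminal, it looks like the webdriver picked out the data correctly, but the exported file comes out in this weird form. Has anyone had the same issue? Please share if you have!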

Comments (1)

痴骨ら 2025-02-09 11:26:10

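I think the whole problem is that you print() the values but you don't assign them to variables, so the row you write still contains the element objects instead of their text.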

You have

print(name.text)
print(price.text)
print(link.get_attribute('href'))

but you forgot

name = name.text
price = price.text
link = link.get_attribute('href')

Or you should write the values directly:

csvWriter.writerow([name.text, price.text, link.get_attribute('href')])
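To make the difference concrete, here is a minimal sketch of the same idea that runs without a browser. `FakeElement`, `first_text`, `build_row` and `rows_to_csv` are hypothetical names introduced only for this illustration, and the sample product data is made up. `FakeElement` merely imitates the two things the question uses on a Selenium element (`.text` and `get_attribute('href')`); it is not part of Selenium.

```python
import csv
import io


class FakeElement:
    """Hypothetical stand-in for a Selenium WebElement (illustration only)."""

    def __init__(self, text, href=None):
        self.text = text
        self._href = href

    def get_attribute(self, attr_name):
        # Only 'href' is modelled here; a real element exposes other attributes too.
        return self._href if attr_name == "href" else None


def first_text(elements, default=""):
    """Return the .text of the first element, or a default if the list is empty."""
    return elements[0].text if elements else default


def build_row(names, prices, links):
    """Convert lists of elements (as find_elements returns them) into plain strings."""
    name = first_text(names)
    price = first_text(prices, default="판매중단")
    link = links[0].get_attribute("href") if links else ""
    return [name, price, link]


def rows_to_csv(rows):
    """Write the rows to an in-memory CSV and return the resulting text."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample data standing in for one scraped item.
names = [FakeElement("iPhone 13 128GB")]
prices = [FakeElement("1,090,000원")]
links = [FakeElement("iPhone 13 128GB", href="https://example.com/item/1")]

row = build_row(names, prices, links)
csv_text = rows_to_csv([row])
print(csv_text)
```

The point is that every value handed to `writerow` is already a string. Note also that Selenium's `find_elements` returns an empty list when nothing matches rather than raising, so the sketch checks for an empty list instead of relying on `try`/`except` to detect a missing price.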
