How do I capture data from a table using Selenium in Python?

Posted 2025-01-25 03:40:04

I need to capture the table from the link:

https://fr.tradingeconomics.com/country-list/rating

I tried the following code, but I don't get any output:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service as ChromeService
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.action_chains import ActionChains
import time
from webdriver_manager.chrome import ChromeDriverManager
driver = webdriver.Chrome(service=ChromeService(ChromeDriverManager().install()))
my_url= "https://fr.tradingeconomics.com/country-list/rating"
driver.get(my_url)
#actions = ActionChains(driver)

WebDriverWait(driver, 50).until(EC.presence_of_all_elements_located((By.CLASS_NAME, "table table-hover")))
trs = driver.find_elements(By.TAG_NAME, "tr")
print(len(trs))
countries = []
for tr in trs:
    country = {}
    items= tr.find_elements(By.TAG_NAME, "td")
    for item in items:
        country_name = item.find_element(By.XPATH, "//*[@id='ctl00_ContentPlaceHolder1_ctl01_GridView1']/tbody/tr[2]/td[1]")
        country['country_name'] = country_name.get_attribute('text')
        s_and_p = item.find_element(By.XPATH, "//*[@id='ctl00_ContentPlaceHolder1_ctl01_GridView1']/tbody/tr[2]/td[2]")
        country['S&P']= s_and_p.get_attribute("text")
        moodys = item.find_element(By.XPATH, "//*[@id='ctl00_ContentPlaceHolder1_ctl01_GridView1']/tbody/tr[2]/td[3]")
        country['Moody\'s'] = moodys.get_attribute("text")

    countries.append(country)
    print(country)

Any help would be appreciated. Thank you.

朕就是辣么酷 2025-02-01 03:40:04

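Since the table is part of the page's static HTML rather than loaded dynamically, you can also grab the table data using pandas alone.

import pandas as pd

url = 'https://fr.tradingeconomics.com/country-list/rating'
# read_html returns a list of DataFrames, one per table found; take the first.
df = pd.read_html(url)[0]
print(df)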

Output:

      Unnamed: 0   S&P Moody's Fitch       DBRS    TE
0        Albanie    B+      B1   NaN        NaN  35.0
1        Andorre   BBB    Baa2  BBB+        NaN  62.0
2         Angola    B-      B3    B-        NaN  23.0
3      Argentine  CCC+      Ca   CCC        CCC  15.0
4        Arménie    B+     Ba3    B+        NaN  14.0
..           ...   ...     ...   ...        ...   ...
151      Uruguay   BBB    Baa2  BBB-  BBB (low)  55.0
152  Ouzbékistan   BB-      B1   BB-        NaN  38.0
153    Venezuela   NaN       C    RD        NaN  11.0
154      Vietnam    BB     Ba3    BB        NaN  43.0
155       Zambie    SD      Ca    RD        NaN  30.0

[156 rows x 6 columns]
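
To get from this DataFrame to the per-country records the question was trying to build, one option is to rename the unlabeled first column and keep only the agencies of interest. This is a minimal sketch, assuming the column labels are exactly those in the output above; `tidy_ratings`, `fetch_ratings` and the `Country` label are illustrative names, not part of pandas or the site.

```python
import pandas as pd


def tidy_ratings(df: pd.DataFrame) -> list:
    """Return one dict per country with the S&P and Moody's ratings.

    Assumes the first column holds the country name (shown as
    "Unnamed: 0" in the output above) and that columns named
    "S&P" and "Moody's" exist.
    """
    first_col = df.columns[0]
    renamed = df.rename(columns={first_col: "Country"})
    return renamed[["Country", "S&P", "Moody's"]].to_dict(orient="records")


def fetch_ratings(url: str) -> list:
    """Download the page and tidy its first table (needs network access)."""
    return tidy_ratings(pd.read_html(url)[0])
```

Calling fetch_ratings(url) with the URL from the question would then return a list of dictionaries, one per row of the table.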

鸵鸟症 2025-02-01 03:40:04

You have to use innerText, not text. Also, the first tr does not contain any td elements, which is why you are not getting anything in response.

Selenium solution:

Code:

# Assumes driver is the Chrome WebDriver created in the question's code.
driver.maximize_window()
wait = WebDriverWait(driver, 50)

my_url = "https://fr.tradingeconomics.com/country-list/rating"
driver.get(my_url)

# Wait until the ratings table is visible, then look for rows inside it only.
table = wait.until(EC.visibility_of_element_located((By.XPATH, "//table[@class='table table-hover']")))
trs = table.find_elements(By.XPATH, ".//tr")
print(len(trs))
for tr in trs:
    # The first row has no td cells, so the inner loop simply does not run for it.
    tds = tr.find_elements(By.XPATH, ".//td")
    for td in tds:
        print(td.get_attribute('innerText'))

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
