Python cannot get the table from a stock exchange report

Posted 2025-01-12 01:08:40

My code:

import requests
import pandas as pd
from bs4 import BeautifulSoup

URL = "https://www.hkex.com.hk/eng/stat/dmstat/dayrpt/hsitmc220303.htm"

req = requests.get(URL)
page = BeautifulSoup(req.content, 'html.parser')
table = page.find_all('pre')
df = pd.read_html(str(table), displayed_only=False)[0]
print(df)

Error message:

ValueError: No tables found

I want to load the table into a DataFrame. Any suggestions?


Comments (1)

柠栀 2025-01-19 01:08:40

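This should work:

import requests
import pandas as pd

url = 'https://www.hkex.com.hk/eng/stat/dmstat/dayrpt/hsitmc220303.htm'

# Query parameters: language and the report date (1 November 2019)
payload = {
    'LangCode': 'en',
    'TDD': '1',
    'TMM': '11',
    'TYYYY': '2019',
}

# .json() only succeeds if the server answers with JSON containing a 'data' list;
# it raises an error for an HTML or plain-text response
jsonData = requests.get(url, params=payload).json()

rows = []
for row in jsonData['data']:
    # Repeat each cell according to its colspan so that columns line up across rows
    data_row = []
    for idx, colspan in enumerate(row['colspan']):
        colspan_int = int(colspan[0])
        data_row.append(row['td'][idx] * colspan_int)
    flat_list = [item for sublist in data_row for item in sublist]
    rows.append(flat_list)

# DataFrame.append was removed in pandas 2.0, so build the frame once from all rows
final_df = pd.DataFrame(rows)

# Keep rows where 'Total market capitalisation' is followed by more text
mask = final_df[0].str.contains(r'Total market capitalisation(?!$)', na=False)
df = final_df[mask].iloc[:, :2].copy()
# `date` was never defined in the original snippet; deriving it from the payload is an assumption
df['date'] = f"{payload['TYYYY']}-{int(payload['TMM']):02d}-{int(payload['TDD']):02d}"
df.to_csv('file.csv', index=False)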
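A note on the error in the question: pd.read_html only looks for <table> elements, so passing it the <pre> elements gives "No tables found" whenever the report is preformatted text rather than an HTML table. Below is a minimal sketch of the alternative, assuming the <pre> text has columns separated by two or more spaces. The sample text, PreTextExtractor and pre_rows are hypothetical names made up for illustration, and the sample is not the real HKEX report layout.

```python
import re
from html.parser import HTMLParser


class PreTextExtractor(HTMLParser):
    """Collects the text found inside <pre> elements."""

    def __init__(self):
        super().__init__()
        self._depth = 0      # greater than 0 while inside a <pre> element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "pre":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._depth:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth:
            self.chunks.append(data)


def pre_rows(html):
    """Return the non-empty <pre> lines, each split on runs of 2+ whitespace characters."""
    parser = PreTextExtractor()
    parser.feed(html)
    parser.close()
    text = "".join(parser.chunks)
    return [re.split(r"\s{2,}", line.strip())
            for line in text.splitlines() if line.strip()]


# Hypothetical sample, NOT the real HKEX report layout
SAMPLE_HTML = """<html><body><pre>
CONTRACT  STRIKE  VOLUME
MAR-22    22000   150
APR-22    22200   75
</pre></body></html>"""

rows = pre_rows(SAMPLE_HTML)
header, body = rows[0], rows[1:]
print(header)
print(body)
```

With pandas installed, pd.DataFrame(body, columns=header) turns these rows into a DataFrame. If the real report uses strictly fixed-width columns, pd.read_fwf on an io.StringIO of the <pre> text may be a better fit than splitting on whitespace, since single spaces inside a cell would otherwise be kept together but narrow gaps between columns would be missed.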