Saving a text file from a website using Python

Posted on 2025-02-09 15:06:38

Using Python, my task is to take the HTML source of this site, https://www.cboe.com/us/equities/market_statistics/corporate_action/, and save the first text file listed in the table, named "corporate_action_rpt_20220621.txt".
Right now, I can read this HTML line from the site's source code using BeautifulSoup:

<a href="2022/06/bzx_equities_corporate_action_rpt_20220621.txt-dl">corporate_action_rpt_20220621.txt</a>

Here is the code I used:

import requests
from bs4 import BeautifulSoup
import os
URL = "https://www.cboe.com/us/equities/market_statistics/corporate_action/"
r = requests.get(URL)
soup = BeautifulSoup(r.content, 'html5lib')
table = soup.find('table')
textFileRow = table.tbody.find('tr').find('td').find('a')
print(textFileRow)

How would I open and save the text file from here using Python?


Comments (1)

無心 2025-02-16 15:06:40

You have to fetch the file using the URL in the href attribute of the <a> tag you retrieved, like so:

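import requests
from bs4 import BeautifulSoup

URL = "https://www.cboe.com/us/equities/market_statistics/corporate_action/"

# Fetch the listing page and parse it (the 'html5lib' parser must be installed).
r = requests.get(URL)
r.raise_for_status()
soup = BeautifulSoup(r.content, 'html5lib')

# First <a> in the first cell of the first table row: the first file listed.
table = soup.find('table')
textFileRow = table.tbody.find('tr').find('td').find('a')

# The href is relative to the page URL, so appending it gives the file's URL.
r = requests.get(URL + textFileRow['href'])
r.raise_for_status()
r.encoding = 'utf-8'
with open("textFile.txt", "w", encoding="utf-8") as text_file:
    text_file.write(r.text)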
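
Appending the href to URL works here because the href is relative and URL ends with a slash. If the page ever served a root-relative or absolute href, plain concatenation would produce a wrong address. A minimal sketch of a more defensive variant, using only the standard library, is shown below. The helper names `resolve_href` and `local_filename` are hypothetical, and treating the trailing "-dl" as a download marker to strip is an assumption based on the single href quoted in the question.

```python
from posixpath import basename
from urllib.parse import urljoin, urlparse

PAGE_URL = "https://www.cboe.com/us/equities/market_statistics/corporate_action/"


def resolve_href(page_url, href):
    """Resolve an href against the page it was found on.

    urljoin handles relative, root-relative, and absolute hrefs.
    """
    return urljoin(page_url, href)


def local_filename(file_url, suffix="-dl"):
    """Derive a local file name from the last path segment of the URL.

    Assumption: a trailing "-dl" is a download marker, not part of the name.
    """
    name = basename(urlparse(file_url).path)
    if suffix and name.endswith(suffix):
        name = name[:-len(suffix)]
    return name


if __name__ == "__main__":
    # href copied from the <a> tag quoted in the question
    href = "2022/06/bzx_equities_corporate_action_rpt_20220621.txt-dl"
    file_url = resolve_href(PAGE_URL, href)
    print(file_url)
    print(local_filename(file_url))
```

The resolved URL can then be passed to requests.get in place of the concatenated string, and the derived name can replace the hard-coded "textFile.txt" in open().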