Saving a text file from a website using Python

Posted 2025-02-09 15:06:38

Using Python, my task is to take the HTML source of this site, https://www.cboe.com/us/equities/market_statistics/corporate_action/, and save the first text file listed in the table, named "corporate_action_rpt_20220621.txt".
Right now I can read the following HTML line with BeautifulSoup, exactly as it appears in the site's source code:

<a href="2022/06/bzx_equities_corporate_action_rpt_20220621.txt-dl">corporate_action_rpt_20220621.txt</a>

Here is the code I used:

import requests
from bs4 import BeautifulSoup
import os
URL = "https://www.cboe.com/us/equities/market_statistics/corporate_action/"
r = requests.get(URL)
soup = BeautifulSoup(r.content, 'html5lib')
table = soup.find('table')
textFileRow = table.tbody.find('tr').find('td').find('a')
print(textFileRow)

How would I open and save the text file from here using Python?

無心 2025-02-16 15:06:40

You have to fetch the file yourself, using the URL in the href of the <a> tag you have already retrieved, like so:

import requests
from bs4 import BeautifulSoup

URL = "https://www.cboe.com/us/equities/market_statistics/corporate_action/"
r = requests.get(URL)
soup = BeautifulSoup(r.content, 'html5lib')
table = soup.find('table')
# First link in the first row of the table
textFileRow = table.tbody.find('tr').find('td').find('a')

# The href is relative and URL ends with "/", so plain concatenation works here
r = requests.get(URL + textFileRow['href'])
r.raise_for_status()  # stop here if the download failed
r.encoding = 'utf-8'
with open("textFile.txt", "w", encoding="utf-8") as text_file:
    text_file.write(r.text)
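
If the page ever switches to absolute or root-relative links, string concatenation would produce a wrong URL. A slightly more defensive variant might look like the sketch below. It resolves the href with urljoin from the standard library and saves the file under the link's visible text instead of a fixed name. The helper names resolve_download and save_linked_file are made up for illustration, and the 30-second timeout is an arbitrary choice.

```python
from pathlib import Path
from urllib.parse import urljoin

PAGE_URL = "https://www.cboe.com/us/equities/market_statistics/corporate_action/"


def resolve_download(page_url, href, link_text):
    """Return (absolute_url, local_filename) for a link found on page_url."""
    # urljoin handles relative, root-relative and absolute hrefs alike.
    absolute_url = urljoin(page_url, href)
    # Path(...).name drops any directory part of the link text, so the
    # file is always written into the current directory.
    filename = Path(link_text.strip()).name
    return absolute_url, filename


def save_linked_file(page_url, href, link_text):
    """Download the linked file and save it under the link's visible name."""
    import requests  # imported here so resolve_download works without it

    url, filename = resolve_download(page_url, href, link_text)
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
    Path(filename).write_bytes(response.content)  # raw bytes, no re-encoding
    return filename


# Usage with the tag found above (hypothetical call, needs network access):
# save_linked_file(PAGE_URL, textFileRow['href'], textFileRow.get_text())
```

Writing response.content as bytes sidesteps the question of which text encoding the server used; the file on disk is then byte-for-byte what the site served.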
