Unable to webscrape titles and authors on a website with multiple links
I am trying to webscrape this link. As an example, I just want to scrape the first page. I would like to collect the title and author for each of the 10 links on the first page.
To gather the titles and authors, I wrote the following lines of code:
from bs4 import BeautifulSoup
import requests
import numpy as np
url = 'https://www.bis.org/cbspeeches/index.htm?m=1123'
r = BeautifulSoup(requests.get(url).content, features = "lxml")
r.select('#cbspeeches_list a') # '#cbspeeches_list a' got via SelectorGadget
However, I get an empty list. What am I doing wrong?
Thanks!
Comments (2)
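The list is not part of the page's initial HTML: the data is loaded from an external source through an API that is called with a POST request, which is why the selector returns an empty list. You just have to request the API URL directly.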
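As a rough sketch of that idea, the snippet below posts form data to the endpoint and parses the returned HTML fragment. The endpoint URL, the form fields, and the CSS selectors (`tr.item`, `.title a`, `.authorlnk`) are all assumptions for illustration and are not confirmed by the page itself; check the real request in the browser's Network tab and adjust them. The names `parse_speeches` and `fetch_speeches` are hypothetical helpers.

```python
from bs4 import BeautifulSoup

# ASSUMPTION: endpoint and form fields as they might appear in the browser's
# Network tab (XHR). Verify both before relying on them.
API_URL = "https://www.bis.org/doclist/cbspeeches.htm"
PAYLOAD = {
    "page": "1",            # first page only
    "paging_length": "10",  # 10 entries per page
    "sort_list": "date_desc",
    "objid": "cbspeeches",
    "theme": "cbspeeches",
}


def parse_speeches(html):
    """Return a list of (title, author) tuples from an HTML fragment.

    ASSUMPTION: each entry is a <tr class="item"> holding the title link
    inside an element with class "title" and the author inside an element
    with class "authorlnk". Adjust the selectors to the real markup.
    """
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for row in soup.select("tr.item"):
        title_tag = row.select_one(".title a")
        if title_tag is None:
            continue  # skip rows without a title link
        author_tag = row.select_one(".authorlnk")
        title = title_tag.get_text(strip=True)
        author = author_tag.get_text(" ", strip=True) if author_tag else ""
        results.append((title, author))
    return results


def fetch_speeches(api_url=API_URL, payload=PAYLOAD):
    """POST the form data to the API and parse the response."""
    # Imported here so parse_speeches can be used without requests installed.
    import requests

    response = requests.post(api_url, data=payload, timeout=30)
    response.raise_for_status()
    return parse_speeches(response.text)
```

Calling `fetch_speeches()` would then return one `(title, author)` pair per entry, provided the assumed endpoint and selectors match the live site. Keeping the parsing in its own function makes it easy to test against a saved copy of the response.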
Output
Try this:
Prints out: