How to scrape the text of a span's sibling span?
Hello, I'm trying to learn how to web scrape, so I started by trying to scrape my school's menu.
I've run into a problem where I can't get the menu items under a span class; instead I get the word "show", which sits on the same line as the span class.
Here is a short excerpt of the HTML I am trying to work with
from bs4 import BeautifulSoup
from selenium import webdriver

driver = webdriver.Chrome(executable_path='chromedriver.exe')  # changed this
driver.get('https://housing.ucdavis.edu/dining/menus/dining-commons/tercero/')
results = []
content = driver.page_source
soups = BeautifulSoup(content, 'html.parser')
# Collect every span carrying the collapsible-heading-status class
element = soups.findAll('span', class_='collapsible-heading-status')
for span in element:
    print(span.text)
I have tried span.span.text, but that doesn't return anything, so can someone give me some pointers on how to extract the info under the collapsible-heading-status class?
Comments (1)
Yummy waffles. As mentioned, they are gone, but one way to reach your goal is to select the names via CSS selectors using the adjacent sibling combinator, or with find_next_sibling().
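The two approaches can be sketched as follows. The markup is a hypothetical stand-in, since the real page's HTML is not reproduced in this thread; it assumes each item name sits in a span directly after the `collapsible-heading-status` span. Under that assumption, `span.span` would find nothing because it searches inside the status span, while the name lives next to it.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (an assumption, not copied from the real page):
# each item name is the span right after the status span.
html = """
<ul>
  <li><span class="collapsible-heading-status">Show</span><span>Item A</span></li>
  <li><span class="collapsible-heading-status">Show</span><span>Item B</span></li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")

# Approach 1: CSS adjacent sibling combinator.
# "A + B" selects a B element that immediately follows an A element.
names_css = [
    span.get_text(strip=True)
    for span in soup.select("span.collapsible-heading-status + span")
]

# Approach 2: start at each status span and step to its next <span> sibling.
names_sibling = []
for status in soup.find_all("span", class_="collapsible-heading-status"):
    sibling = status.find_next_sibling("span")
    if sibling is not None:  # skip status spans with no following span
        names_sibling.append(sibling.get_text(strip=True))

print(names_css)      # ['Item A', 'Item B']
print(names_sibling)  # ['Item A', 'Item B']
```

Both give the same result on this sample. The selector version is shorter, while the loop makes it easy to handle a status span that has no following span.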
Example
The whole information for each item can also be collected in a structured way.
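A minimal sketch of a structured result, again on hypothetical markup: the `description` class and the sample texts are invented for illustration, and the real page may organize its details differently. Each status span becomes one dictionary, with `None` for any field that is missing.

```python
from bs4 import BeautifulSoup

# Hypothetical markup (an assumption): each <li> holds a status span, a name
# span, and optionally a description paragraph. The "description" class is
# invented for this sketch.
html = """
<ul>
  <li>
    <span class="collapsible-heading-status">Show</span><span>Item A</span>
    <p class="description">First description</p>
  </li>
  <li>
    <span class="collapsible-heading-status">Show</span><span>Item B</span>
  </li>
</ul>
"""
soup = BeautifulSoup(html, "html.parser")


def text_or_none(tag):
    """Return the stripped text of a tag, or None if the tag is missing."""
    return tag.get_text(strip=True) if tag is not None else None


menu = []
for status in soup.select("span.collapsible-heading-status"):
    # Siblings are looked up relative to the status span, so each record
    # only picks up elements from its own <li>.
    menu.append({
        "name": text_or_none(status.find_next_sibling("span")),
        "description": text_or_none(
            status.find_next_sibling("p", class_="description")
        ),
    })

for entry in menu:
    print(entry)
```

A list of dictionaries like `menu` can be passed on to `json.dumps` or a CSV writer without further reshaping.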