Scraped data does not come from the exact URL
I'm trying to scrape some monster infobox tables from the RS wiki.
Some monsters have multiple levels, for example:
https://oldschool.runescape.wiki/w/Dwarf
You can switch between the levels by clicking the boxes at the top of the infobox: "Level 7", "Level 10", ...
Once you click a level box, the URL changes to match that level.
But when I request the URL https://oldschool.runescape.wiki/w/Dwarf#Level_10, I only get the data for the first level, in this case https://oldschool.runescape.wiki/w/Dwarf#Level_7, and I can't scrape the other levels.
import requests
from bs4 import BeautifulSoup
url = 'https://oldschool.runescape.wiki/w/Dwarf#Level_20'
response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
soup = BeautifulSoup(response.content, 'html.parser')
# Select the monster infobox table by its full class string
soup_minfobox = soup.find_all('table', class_="infobox infobox-switch no-parenthesis-style infobox-monster")
print(soup_minfobox[0].text)
Output: Level 7Level 10Level 11Level 20DwarfReleased6 April 2001 (Update)MembersNoCombat level7Size1x1 ...
Excuse the makeshift code, but in the output you can see that the data ends up being for level 7, although the URL is for level 20.
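A likely reason for this behaviour, shown as a minimal sketch: the part after `#` is a URL fragment, which is handled on the client side and is not sent to the server as part of the HTTP request, so every `#Level_N` variant fetches the same page. The snippet uses only the standard library's `urllib.parse.urldefrag`; the names `level_urls`, `requested` and `fragments` are illustrative and not from the original post.

```python
from urllib.parse import urldefrag

level_urls = [
    "https://oldschool.runescape.wiki/w/Dwarf#Level_7",
    "https://oldschool.runescape.wiki/w/Dwarf#Level_20",
]

# urldefrag splits each URL into the part that is actually requested
# and the fragment that stays on the client.
requested = {urldefrag(u).url for u in level_urls}
fragments = [urldefrag(u).fragment for u in level_urls]

print(requested)  # a single URL: both level links point at the same page
print(fragments)  # the level names only exist in the fragments
```

Since both links resolve to one requested URL, the level switch is presumably done by JavaScript in the browser after the page has loaded.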
Comments (1)
If you manually trigger the events (from the browser's console), you'll see that the infobox changes:
So you can use the above selectors and consult the answers provided in the following topic on how to invoke an event using BeautifulSoup:
invoking onclick event with beautifulsoup python
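BeautifulSoup only parses HTML and does not run JavaScript, so firing the click needs a tool that drives a real browser. The following is a rough sketch that swaps in Selenium for that step; it is not taken from the answer or the linked topic. It assumes Selenium 4 and a local Chrome installation, and it assumes the level buttons can be located by their visible text ("Level 20") inside the table with class `infobox-monster`, which is inferred from the question's output and not verified against the live page. The names `level_button_xpath` and `scrape_level` are hypothetical.

```python
def level_button_xpath(level_label):
    """Build an XPath for an element inside the monster infobox whose
    visible text equals level_label (assumed button layout)."""
    return (
        "//table[contains(@class, 'infobox-monster')]"
        f"//*[normalize-space(text())='{level_label}']"
    )


def scrape_level(url, level_label):
    """Open the page in Chrome, click the level button, return the infobox text."""
    # Imported lazily so the XPath helper works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumption: Chrome is available locally
    try:
        driver.get(url)
        # Click the button so the page's own JavaScript updates the infobox.
        driver.find_element(By.XPATH, level_button_xpath(level_label)).click()
        infobox = driver.find_element(By.CSS_SELECTOR, "table.infobox-monster")
        return infobox.text
    finally:
        driver.quit()
```

A call such as `scrape_level("https://oldschool.runescape.wiki/w/Dwarf", "Level 20")` would then return the infobox text after the click, provided the assumed selector matches the page.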