Scraped data does not come from the exact URL

Posted on 2025-01-31 10:33:05

I'm trying to scrape some monster infobox tables from the RS Wiki.

Some monsters have multiple levels, for example:

https://oldschool.runescape.wiki/w/Dwarf

You can switch between the levels by clicking the boxes at the top of the infobox: "Level 7", "Level 10", and so on.

Clicking a level box changes the URL to match that level.

However, when I request https://oldschool.runescape.wiki/w/Dwarf#Level_10, I only get the data for the first level, in this case https://oldschool.runescape.wiki/w/Dwarf#Level_7, and I can't scrape the other levels.

import requests
from bs4 import BeautifulSoup

url = 'https://oldschool.runescape.wiki/w/Dwarf#Level_20'
response = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
soup = BeautifulSoup(response.content, 'html.parser')
soup_minfobox = soup.find_all('table', class_='infobox infobox-switch no-parenthesis-style infobox-monster')

print(soup_minfobox[0].text)

Output: Level 7Level 10Level 11Level 20DwarfReleased6 April 2001 (Update)MembersNoCombat level7Size1x1 ...

Excuse the makeshift code, but the output shows that the data returned is for level 7, even though the URL is for level 20.
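A likely reason for this behaviour: the part of a URL after `#` is a fragment, which browsers handle on the client side and HTTP clients do not send to the server. The small sketch below only illustrates that split using the standard library; the helper name `split_fragment` is made up for this example.

```python
from urllib.parse import urldefrag


def split_fragment(url):
    """Split a URL into the part sent to the server and the client-side fragment."""
    base, fragment = urldefrag(url)
    return base, fragment


base, fragment = split_fragment("https://oldschool.runescape.wiki/w/Dwarf#Level_20")
print(base)      # https://oldschool.runescape.wiki/w/Dwarf
print(fragment)  # Level_20
```

Every `#Level_…` variant therefore resolves to the same request, which would explain why the response always contains the default (first) level.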


亢潮 2025-02-07 10:33:05

If you manually trigger the events (from the browser's console), you'll see that the infobox changes:

$("span[data-switch-anchor='#Level_7']").click();
$("span[data-switch-anchor='#Level_10']").click();
$("span[data-switch-anchor='#Level_11']").click();
$("span[data-switch-anchor='#Level_20']").click();

So you can use the above selectors and consult the answers in the following topic on how to invoke a click event when scraping with BeautifulSoup:

invoking onclick event with beautifulsoup python
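As a possible alternative to simulating clicks, the per-level values may already be present in the downloaded HTML, since the switching happens in the browser. The sketch below is only an illustration under assumptions: it assumes the page contains a hidden `infobox-switch-resources` block whose `span` elements carry `data-attr-param` and `data-attr-index` attributes, and the sample markup and the function name `extract_switch_values` are hypothetical. Inspect the real page source before relying on it.

```python
from bs4 import BeautifulSoup

# Hypothetical sample of the markup this sketch assumes; the real page may differ.
SAMPLE_HTML = """
<div class="infobox-switch-resources">
  <span data-attr-param="combat">
    <span data-attr-index="0">7</span>
    <span data-attr-index="1">10</span>
    <span data-attr-index="2">11</span>
    <span data-attr-index="3">20</span>
  </span>
</div>
"""


def extract_switch_values(html):
    """Map each switchable parameter to its {index: text} values."""
    soup = BeautifulSoup(html, "html.parser")
    values = {}
    # Each outer span names a parameter; its direct child spans hold one value per version.
    for param in soup.select(".infobox-switch-resources span[data-attr-param]"):
        children = param.find_all(
            "span", attrs={"data-attr-index": True}, recursive=False
        )
        values[param["data-attr-param"]] = {
            int(child["data-attr-index"]): child.get_text(strip=True)
            for child in children
        }
    return values


print(extract_switch_values(SAMPLE_HTML))
```

To try it on the live page, pass `response.text` from the question's `requests.get` call instead of `SAMPLE_HTML`. If the real markup does not match these assumptions, a browser automation tool that can actually run the click handlers is the fallback.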
