Parsing Wikipedia raises KeyError: 'query' (Python)
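I'm trying to get the Wikipedia backlinks of more than 200 pages. To do this, I:
- look up each page's URL on the Italian Wikipedia and, if that doesn't work, on the English Wikipedia
- put the URLs in a list
- iterate over this list to collect the languages each page is available in on Wikipedia (with bs4)
- append these languages to a list
- iterate over both the languages and the URLs to get page titles and backlinks, and store them in a dictionary whose keys are the languages and whose values are the number of backlinks available in that language
But I get the error KeyError: 'query', and I don't know why.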
import urllib.request

import wikipediaapi
from bs4 import BeautifulSoup

opere = df.label  # works (df is the example input shown further down)
listaurl = []
for x in opere:
    # Try the Italian Wikipedia first, then fall back to the English one
    try:
        wiki_wiki = wikipediaapi.Wikipedia('it')
        p = wiki_wiki.page(x).fullurl
        listaurl.append(p)
        print(p)
    except:
        wiki_wiki = wikipediaapi.Wikipedia('en')
        p = wiki_wiki.page(x).fullurl
        listaurl.append(p)
        print(p)

# Collect the language codes of the interlanguage links on each page
lista = []
for url in listaurl:
    soup = BeautifulSoup(urllib.request.urlopen(url))
    links = [(el.get('lang'), el.get('href')) for el in soup.select('li.interlanguage-link > a')]
    for language, link in links:
        lista.append(language)
    testo = soup.title.text.replace(" ", "")

lista2 = []
regex = r"(?<=/wiki/).*$"
dik = {}
# Query every collected language for every URL and count the backlinks
for lang in lista:
    wikis = wikipediaapi.Wikipedia(lang)
    for apage in listaurl:
        wikipage = apage.split('/wiki/')[1]  # title part of the URL
        page_py = wikis.page(wikipage)
        print(page_py)
        titles = page_py.title
        print(titles)
        back = page_py.backlinks  # this line raises KeyError: 'query'
        dik[lang] = len(back)
Example input to reproduce (the df):
item,label,authorlabel,authorlabel2,numWikipediaLanguages
http://www.wikidata.org/entity/Q172850,Il nome della rosa,,Umberto Eco,53
http://www.wikidata.org/entity/Q437791,Il pendolo di Foucault,,Umberto Eco,30
http://www.wikidata.org/entity/Q791487,Baudolino,,Umberto Eco,26
Error traceback:
Traceback (most recent call last):
  File "C:....myfile.py", line 43, in <module>
    back = page_py.backlinks
  File "C:\....\wikipediaapi\__init__.py", line 1112, in backlinks
    self._fetch('backlinks')
  File "C:....\wikipediaapi\__init__.py", line 1148, in _fetch
    getattr(self.wiki, call)(self)
  File "C:....wikipediaapi\__init__.py", line 468, in backlinks
    self._common_attributes(raw['query'], page)
KeyError: 'query'
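
The traceback shows the library indexing raw['query'] while fetching backlinks, so the exception appears whenever the response it received has no 'query' key. Why that happens for a given language and title is not established here. The following is only a minimal defensive sketch, not a confirmed fix: it skips pages whose backlinks lookup raises KeyError, decodes the title taken from the URL, and keys the result by (language, title) so one page's count does not overwrite another's. The helper names title_from_url, count_backlinks and backlink_counts, and the make_wiki parameter, are hypothetical and do not come from the question or from wikipediaapi.

```python
from urllib.parse import unquote, urlsplit


def title_from_url(url):
    """Return the percent-decoded page title from a /wiki/ article URL."""
    path = urlsplit(url).path  # e.g. '/wiki/Il_nome_della_rosa'
    return unquote(path.split('/wiki/', 1)[1])


def count_backlinks(page):
    """Return len(page.backlinks), or None if the lookup raises KeyError.

    `page` is any object with a `backlinks` attribute, such as the page
    object used in the question's code.
    """
    try:
        return len(page.backlinks)
    except KeyError:
        # Same exception type as in the traceback: record "no data"
        # for this page instead of stopping the whole run.
        return None


def backlink_counts(languages, urls, make_wiki):
    """Map (language, title) to a backlink count, or None on KeyError.

    `make_wiki` is a hypothetical factory: it is called with a language
    code and must return an object that has a `page(title)` method.
    """
    counts = {}
    for lang in sorted(set(languages)):  # query each language only once
        wiki = make_wiki(lang)
        for url in urls:
            title = title_from_url(url)
            counts[(lang, title)] = count_backlinks(wiki.page(title))
    return counts
```

With the question's variables this could be called as backlink_counts(lista, listaurl, wikipediaapi.Wikipedia), assuming the installed wikipediaapi version accepts a bare language code as in the question's code. The entries left as None then show which language and title combinations trigger the error, which may help narrow down the cause.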