Wikimedia API: how to determine which portal a page belongs to?

Posted 2025-01-29 20:42:27

I wish to determine whether a given Wikipedia page belongs to a certain Wikipedia Portal using the MediaWiki API. So far, I have been experimenting with the page properties of the API but I cannot seem to find a way to derive what Portal a given page belongs to.

As an example, at the very bottom of the Wikipedia page for Cake, I can press Show on the section Cakes, and a bunch of links to different cake pages show up. There I can also see that all of these belong to the Food portal. That is the information I wish to extract from a given page using the MediaWiki API.

Comments (2)

海夕 2025-02-05 20:42:27

As far as I know, there is actually no formal definition of "belonging to a portal" in Wikipedia. Unlike categories, which are part of the MediaWiki software, portals are custom Wikipedia pages that aim to make it easier to explore a topic.

Instead of a formal definition, though, you can use a heuristic and determine the connection between a page and a portal based on one of them linking to the other. There are API queries for both directions:

(Note: 100 is the id of the "Portal" namespace)

Which portal pages are linked from the page "Cake" or "Pizza"

https://en.wikipedia.org/w/api.php?action=query&format=json&prop=links&titles=Cake%7CPizza&plnamespace=100

Which portal pages link to the page "Cake" or "Pizza"

https://en.wikipedia.org/w/api.php?action=query&format=json&prop=linkshere&titles=Cake%7CPizza&lhnamespace=100
(though as you can see, many unrelated portals link to "Cake" and none link to "Pizza")

A combined query for both directions

https://en.wikipedia.org/w/api.php?action=query&format=json&prop=links%7Clinkshere&titles=Cake%7CPizza&plnamespace=100&lhnamespace=100
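
For anyone who wants to run the combined query from a script, here is a minimal Python sketch of the heuristic above. The function names (build_query_url, portals_by_page, fetch) and the User-Agent string are my own placeholders, not part of the API. I have also added pllimit=max and lhlimit=max, which are not in the URL above, and the sketch assumes the default JSON layout (formatversion 1) and does not follow continuation, so very long link lists may be truncated.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"
PORTAL_NS = 100  # id of the "Portal" namespace on English Wikipedia


def build_query_url(titles):
    """Build the combined links/linkshere query, restricted to the Portal namespace."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "links|linkshere",
        "titles": "|".join(titles),
        "plnamespace": PORTAL_NS,
        "lhnamespace": PORTAL_NS,
        # Assumption: ask for as many links per request as the API allows.
        "pllimit": "max",
        "lhlimit": "max",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def portals_by_page(response):
    """Map each page title to the portals it links to and the portals linking to it."""
    result = {}
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        result[page.get("title")] = {
            "links_to": [link["title"] for link in page.get("links", [])],
            "linked_from": [link["title"] for link in page.get("linkshere", [])],
        }
    return result


def fetch(titles):
    """Send the query and return the decoded JSON (needs network access)."""
    request = urllib.request.Request(
        build_query_url(titles),
        headers={"User-Agent": "portal-lookup-example/0.1"},  # placeholder value
    )
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

Calling portals_by_page(fetch(["Cake", "Pizza"])) should give one entry per page with the portal titles found in each direction. Keep in mind that this is still only the link heuristic, so unrelated portals can show up in the result.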

雨落星ぅ辰 2025-02-05 20:42:27

So through some more investigation I found the answer:

I ended up using the Revisions property in the API. This allows me to give a series of page titles that I want to investigate and have the content of each page (as wikitext, not rendered HTML) returned to me in JSON format. Then I can just search for lines containing Portal and figure out what portal (if any) the page belongs to.

If anyone is in a similar situation, here is an example query to the API:

https://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Bread|Bubble_tea|Pizza&format=json&redirects&rvprop=content&rvslots=main
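
The "search for lines containing Portal" step could look roughly like the Python sketch below. The function name portal_lines is a placeholder of mine. It takes the already decoded JSON response of the query above and assumes the default JSON layout (formatversion 1), where the page content sits under the "*" key of the main slot. The match is a plain case-insensitive substring search, so it can also pick up lines that merely mention the word.

```python
def portal_lines(response):
    """Return, for each page title, the wikitext lines that mention 'portal'."""
    found = {}
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        title = page.get("title")
        revisions = page.get("revisions", [])
        if not revisions:
            # Missing or invalid pages come back without any revisions.
            found[title] = []
            continue
        # With rvslots=main, the wikitext is under slots -> main -> "*".
        text = revisions[0].get("slots", {}).get("main", {}).get("*", "")
        found[title] = [
            line.strip()
            for line in text.splitlines()
            if "portal" in line.lower()
        ]
    return found
```

The result is a dict from page title to the matching lines, with an empty list when nothing matches, so "no portal found" is easy to tell apart from an error.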
