Is there a Wikipedia API, and if so, how do I use it?

Posted on 2024-07-23 05:30:15

Comments (8)

烟酉 2024-07-30 05:30:15

You really need to spend some time reading the documentation; it only took me a moment to look it up and follow the link. :/ Still, out of sympathy, here is a link that you can learn from:

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=New_York_Yankees&rvprop=timestamp|user|comment|content

That URL contains the parameters you will want to set. Your best bet is to know which page you are after and take its title from the Wikipedia link, i.e.:

http://en.wikipedia.org/wiki/New_York_Yankees [take the part after wiki/]

-->

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=New_York_Yankees&rvprop=timestamp|user|comment|content

[Place it in the titles parameter of the GET request.]

The URL above can be tweaked to return only the parts you want, for example by changing the rvprop list. So read the documentation :)
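The substitution described above (take the part after wiki/ and put it in titles) can be sketched in a few lines of JavaScript. This is only an illustration: `apiUrlFromPageUrl` is a made-up helper name, and it assumes the article URL uses the usual /wiki/ path prefix.

```javascript
// Hypothetical helper (not part of any library): turn an article URL into
// the matching action=query&prop=revisions API URL.
function apiUrlFromPageUrl(pageUrl, rvprop = ["timestamp", "user", "comment", "content"]) {
  const page = new URL(pageUrl);
  const prefix = "/wiki/";
  if (!page.pathname.startsWith(prefix)) {
    throw new Error("Not an article URL: " + pageUrl);
  }
  // Everything after /wiki/ is the page title (percent-decoded here).
  const title = decodeURIComponent(page.pathname.slice(prefix.length));

  // The API lives at /w/api.php on the same host as the article.
  const api = new URL("/w/api.php", page.origin);
  api.searchParams.set("action", "query");
  api.searchParams.set("prop", "revisions");
  api.searchParams.set("titles", title);
  // Multiple values are joined with pipes; the URL API percent-encodes them.
  api.searchParams.set("rvprop", rvprop.join("|"));
  return api.toString();
}
```

Calling `apiUrlFromPageUrl("http://en.wikipedia.org/wiki/New_York_Yankees")` should give a URL equivalent to the one above, with the pipes percent-encoded.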

表情可笑 2024-07-30 05:30:15

The answers here helped me arrive at a solution, but I discovered more along the way that may help others who find this question. I figure most people simply want to use the API to quickly get content off a page. Here is how I'm doing that:

Using Revisions:

//working url:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Threadless&rvprop=content&format=json&rvsection=0&rvparse=1

//Explanation
//Base Url:
http://en.wikipedia.org/w/api.php?action=query

//tell it to get revisions:
&prop=revisions

//define page titles separated by pipes. In the example I used the t-shirt company Threadless
&titles=whatever|the|title|is

//specify that we want the page content
&rvprop=content

//I want my data in JSON, default is XML
&format=json

//lets you choose which section you want. 0 is the first one.
&rvsection=0

//tell wikipedia to parse it into html for you
&rvparse=1
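As a rough sketch, the revisions parameters above can be assembled in code and the page text pulled out of the JSON reply. The function names are made up for illustration, the https endpoint is my choice, and the reply layout assumed here (pages keyed by page id, revision text under the "*" key) is what format=json returns in its default format version.

```javascript
const WIKI_API = "https://en.wikipedia.org/w/api.php";

// Hypothetical helper: build the revisions query described above.
function revisionsUrl(titles, section = 0) {
  const params = new URLSearchParams({
    action: "query",
    prop: "revisions",
    titles: titles.join("|"), // several titles are separated by pipes
    rvprop: "content",
    format: "json",
    rvsection: String(section),
    rvparse: "1",
  });
  return WIKI_API + "?" + params.toString();
}

// Hypothetical helper: map each page title to the text of its first
// returned revision, assuming the default JSON layout described above.
function revisionContents(response) {
  const pages = (response && response.query && response.query.pages) || {};
  const contents = {};
  for (const page of Object.values(pages)) {
    if (page.revisions && page.revisions.length > 0) {
      contents[page.title] = page.revisions[0]["*"];
    }
  }
  return contents;
}
```

In a browser or a recent Node.js, the reply can be fetched with `fetch(revisionsUrl(["Threadless"]))` and passed through `revisionContents` once parsed as JSON.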

Using Extracts (better/easier for what I'm doing):

//working url:
http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Threadless&format=json&exintro=1

//only explaining new parameters
//instead of revisions, we'll set prop=extracts
&prop=extracts

//if we just want the intro, we can use exintro. Otherwise it shows all sections
&exintro=1

For the full details you still need to read the API documentation, as was mentioned, but I hope these examples help most of the people who come here for a quick fix.
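For the extracts variant, a minimal sketch of building the URL and reading the result might look like this. The helper names are invented for the example, and it assumes the default JSON layout in which each page object carries an `extract` string.

```javascript
// Hypothetical helper: build the prop=extracts query shown above.
function extractsUrl(title, introOnly = true) {
  const params = new URLSearchParams({
    action: "query",
    prop: "extracts",
    titles: title,
    format: "json",
  });
  if (introOnly) {
    params.set("exintro", "1"); // only the section before the first heading
  }
  return "https://en.wikipedia.org/w/api.php?" + params.toString();
}

// Hypothetical helper: return the first extract found in a reply, or null
// if no page in the reply has one (for example, a missing page).
function firstExtract(response) {
  const pages = (response && response.query && response.query.pages) || {};
  for (const page of Object.values(pages)) {
    if (typeof page.extract === "string") {
      return page.extract;
    }
  }
  return null;
}
```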

够钟 2024-07-30 05:30:15

See http://www.mediawiki.org/wiki/API

Specifically, for the English Wikipedia, the API is located at http://en.wikipedia.org/w/api.php
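Other language editions follow the same pattern with a different subdomain. A small sketch, using made-up helper names, that builds the endpoint for a language code and a `meta=siteinfo` query that can serve as a quick check that the API answers:

```javascript
// Hypothetical helper: API endpoint for a given Wikipedia language edition.
function wikipediaApiEndpoint(lang = "en") {
  // Reject anything that does not look like a plain language code.
  if (!/^[a-z][a-z-]*$/.test(lang)) {
    throw new Error("Unexpected language code: " + lang);
  }
  return `https://${lang}.wikipedia.org/w/api.php`;
}

// Hypothetical helper: URL of a general site-information query.
function siteInfoUrl(lang = "en") {
  const params = new URLSearchParams({
    action: "query",
    meta: "siteinfo",
    siprop: "general",
    format: "json",
  });
  return wikipediaApiEndpoint(lang) + "?" + params.toString();
}
```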

阳光的暖冬 2024-07-30 05:30:15

Have a look at the ApiSandbox at https://en.wikipedia.org/wiki/Special:ApiSandbox. It is a web frontend for querying the API easily: a few clicks will build the URL for you and show you the API result.

That is an extension for MediaWiki, enabled on all Wikipedia languages. https://www.mediawiki.org/wiki/Extension:ApiSandbox

太阳公公是暖光 2024-07-30 05:30:15

If you want to extract structured data from Wikipedia, consider using DBpedia: http://dbpedia.org/

It lets you query data by given criteria using SPARQL and returns data parsed from Wikipedia infobox templates.

SPARQL libraries are available for multiple platforms to make querying easier.
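To give a feel for what such a query looks like without any SPARQL library, here is a sketch that builds a request URL for DBpedia's public SPARQL endpoint. The helper names are made up, the query simply lists a few property/value pairs for one resource, and the resource name is inserted without escaping, so it should only be used with trusted input.

```javascript
const DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql";

// Hypothetical helper: SPARQL that lists properties of one DBpedia resource.
// resourceName is the part after http://dbpedia.org/resource/ (not escaped here).
function dbpediaPropertiesQuery(resourceName, limit = 25) {
  return [
    "SELECT ?property ?value WHERE {",
    `  <http://dbpedia.org/resource/${resourceName}> ?property ?value .`,
    "}",
    `LIMIT ${Number(limit)}`,
  ].join("\n");
}

// Hypothetical helper: GET URL that asks the endpoint for JSON results.
function dbpediaQueryUrl(sparql) {
  const params = new URLSearchParams({
    query: sparql,
    format: "application/sparql-results+json",
  });
  return DBPEDIA_ENDPOINT + "?" + params.toString();
}
```

For example, `dbpediaQueryUrl(dbpediaPropertiesQuery("New_York_Yankees"))` gives a URL that can be opened in a browser or passed to `fetch`.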

网白 2024-07-30 05:30:15

If you want to extract structured data from Wikipedia, you may also try
http://www.wikidata.org/wiki/Wikidata:Main_Page
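Wikidata exposes the same kind of api.php endpoint, with a `wbgetentities` action that can look up an item by the title of its Wikipedia article. A minimal sketch, with invented helper names and an assumed reply layout of `entities` keyed by item id:

```javascript
// Hypothetical helper: look up the Wikidata item for a Wikipedia article title.
function wikidataEntityUrl(title, site = "enwiki", language = "en") {
  const params = new URLSearchParams({
    action: "wbgetentities",
    sites: site,       // which wiki the title belongs to
    titles: title,
    props: "labels|descriptions",
    languages: language,
    format: "json",
  });
  return "https://www.wikidata.org/w/api.php?" + params.toString();
}

// Hypothetical helper: pull id, label and description out of a reply.
function labelsFrom(response, language = "en") {
  const entities = (response && response.entities) || {};
  return Object.entries(entities)
    .filter(([, entity]) => entity.labels && entity.labels[language])
    .map(([id, entity]) => ({
      id,
      label: entity.labels[language].value,
      description:
        entity.descriptions && entity.descriptions[language]
          ? entity.descriptions[language].value
          : null,
    }));
}
```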

千里故人稀 2024-07-30 05:30:15

Below is a working example that prints the first sentence from Wikipedia's New York Yankees page to your web browser's console:

<!DOCTYPE html>
<html>
    <head>
        <script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.3/jquery.min.js"></script>
    </head>
    <body>
        <script>
            // opensearch query; jQuery appends its own callback parameter for JSONP
            var wikiUrl = "http://en.wikipedia.org/w/api.php?action=opensearch&search=New_York_Yankees&format=json";

            $.ajax(wikiUrl, {
                dataType: "jsonp", // JSONP allows the cross-site request
                success: function( wikiResponse ) {
                    // wikiResponse is [query, titles, descriptions, urls]
                    console.log( wikiResponse[2][0] );
                }
            });
        </script>
    </body>
</html>

http://en.wikipedia.org/w/api.php is the endpoint for your url. You can see how to structure your url by visiting:
http://www.mediawiki.org/wiki/API:Main_page

I used jsonp as the dataType to allow cross-site requests. More can be found here:
http://www.mediawiki.org/wiki/API:Cross-site_requests

Last but not least, make sure to reference the jQuery.ajax() API:
http://api.jquery.com/jquery.ajax/
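As an alternative sketch without jQuery or JSONP, the cross-site requests page linked above also describes adding `origin=*` for anonymous requests, which works with the browser's `fetch`. The function names below are made up, and the description slot of the opensearch reply may come back empty on current Wikipedia, so the code treats it as optional.

```javascript
// Hypothetical helper: opensearch URL with origin=* for anonymous CORS requests.
function openSearchUrl(search, limit = 5) {
  const params = new URLSearchParams({
    action: "opensearch",
    search: search,
    limit: String(limit),
    format: "json",
    origin: "*",
  });
  return "https://en.wikipedia.org/w/api.php?" + params.toString();
}

// The reply is an array: [query, titles, descriptions, urls].
// Zip the three parallel arrays into one object per result.
function parseOpenSearch(response) {
  const [, titles = [], descriptions = [], urls = []] = response;
  return titles.map((title, i) => ({
    title: title,
    description: descriptions[i] || "",
    url: urls[i] || "",
  }));
}

// Not called here; needs a browser or a Node.js version with global fetch.
async function searchWikipedia(search) {
  const res = await fetch(openSearchUrl(search));
  if (!res.ok) {
    throw new Error("HTTP " + res.status);
  }
  return parseOpenSearch(await res.json());
}
```

Usage would be along the lines of `searchWikipedia("New_York_Yankees").then(results => console.log(results[0]))`.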

故事和酒 2024-07-30 05:30:15

Wiki Parser converts Wikipedia dumps into XML. It is also quite fast. You can then use any XML processing tool to handle the data from the parsed Wikipedia articles.
