Is there a Wikipedia API dedicated to retrieving content summaries?

Posted 2024-12-21 22:48:13

I need just to retrieve the first paragraph of a Wikipedia page.

Content must be HTML formatted, ready to be displayed on my website (so no BBCode, or Wikipedia special code!)

怀里藏娇 2024-12-28 22:48:13

There's a way to get the entire "introduction section" without any HTML parsing! Similar to AnthonyS's answer with an additional explaintext parameter, you can get the introduction section text in plain text.

Query

Getting Stack Overflow's introduction in plain text:

Using the page title:

https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1&titles=Stack%20Overflow

Or use pageids:

https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro&explaintext&redirects=1&pageids=21721040

JSON Response

(warnings stripped)

{
    "query": {
        "pages": {
            "21721040": {
                "pageid": 21721040,
                "ns": 0,
                "title": "Stack Overflow",
                "extract": "Stack Overflow is a privately held website, the flagship site of the Stack Exchange Network, created in 2008 by Jeff Atwood and Joel Spolsky, as a more open alternative to earlier Q&A sites such as Experts Exchange. The name for the website was chosen by voting in April 2008 by readers of Coding Horror, Atwood's popular programming blog.\nIt features questions and answers on a wide range of topics in computer programming. The website serves as a platform for users to ask and answer questions, and, through membership and active participation, to vote questions and answers up or down and edit questions and answers in a fashion similar to a wiki or Digg. Users of Stack Overflow can earn reputation points and \"badges\"; for example, a person is awarded 10 reputation points for receiving an \"up\" vote on an answer given to a question, and can receive badges for their valued contributions, which represents a kind of gamification of the traditional Q&A site or forum. All user-generated content is licensed under a Creative Commons Attribute-ShareAlike license. Questions are closed in order to allow low quality questions to improve. Jeff Atwood stated in 2010 that duplicate questions are not seen as a problem but rather they constitute an advantage if such additional questions drive extra traffic to the site by multiplying relevant keyword hits in search engines.\nAs of April 2014, Stack Overflow has over 2,700,000 registered users and more than 7,100,000 questions. Based on the type of tags assigned to questions, the top eight most discussed topics on the site are: Java, JavaScript, C#, PHP, Android, jQuery, Python and HTML."
            }
        }
    }
}

Documentation: API: query/prop=extracts
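The query above can be wrapped in a short helper. A minimal sketch (the function names are illustrative, not part of any library; it assumes a JSON-decoded response of the shape shown above):

```javascript
// Build the extracts query URL and pull the intro text out of the
// decoded JSON response. The pages object is keyed by page ID, so we
// take its first value.
function buildExtractUrl(title) {
  const params = new URLSearchParams({
    format: "json",
    action: "query",
    prop: "extracts",
    exintro: "",     // intro section only
    explaintext: "", // plain text instead of HTML
    redirects: "1",
    titles: title,
  });
  return "https://en.wikipedia.org/w/api.php?" + params.toString();
}

function firstExtract(response) {
  const pages = response.query.pages;
  return Object.values(pages)[0].extract;
}
```

Fetch `buildExtractUrl("Stack Overflow")`, decode the JSON, and pass it to `firstExtract` to get the plain-text intro.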

百思不得你姐 2024-12-28 22:48:13

There is actually a very nice prop called extracts that can be used with queries designed specifically for this purpose.

Extracts allow you to get article extracts (truncated article text). There is a parameter called exintro that can be used to retrieve the text in the zeroth section (no additional assets like images or infoboxes). You can also retrieve extracts with finer granularity such as by a certain number of characters (exchars) or by a certain number of sentences (exsentences).

Here is a sample query http://en.wikipedia.org/w/api.php?action=query&prop=extracts&format=json&exintro=&titles=Stack%20Overflow
and the API sandbox http://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&prop=extracts&format=json&exintro=&titles=Stack%20Overflow to experiment more with this query.

Please note that, if you want the first paragraph specifically, you still need to do some additional parsing as suggested in the chosen answer. The difference here is that the response returned by this query is shorter than some of the other API queries suggested, because you don't have additional assets such as images in the API response to parse.

Caveat from the docs:

We do not recommend the usage of exsentences. It does not work for HTML extracts and there are many edge cases for which it doesn't exist. For example "Arm. gen. Ing. John Smith was a soldier." will be treated as 4 sentences. We do not plan to fix this.
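For example, exchars can cap the extract length without running into the exsentences caveat above. A sketch (the helper name is illustrative):

```javascript
// Request a plain-text extract truncated to roughly `chars` characters
// via the exchars parameter.
function buildCharLimitedUrl(title, chars) {
  const params = new URLSearchParams({
    format: "json",
    action: "query",
    prop: "extracts",
    exchars: String(chars), // truncate the extract
    explaintext: "",
    titles: title,
  });
  return "https://en.wikipedia.org/w/api.php?" + params.toString();
}
```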

在巴黎塔顶看东京樱花 2024-12-28 22:48:13

Since 2017 Wikipedia provides a REST API with better caching. In the documentation you can find the following API which perfectly fits your use case (as it is used by the new Page Previews feature).

https://en.wikipedia.org/api/rest_v1/page/summary/Stack_Overflow
returns the following data which can be used to display a summary with a small thumbnail:

{
  "type": "standard",
  "title": "Stack Overflow",
  "displaytitle": "<span class=\"mw-page-title-main\">Stack Overflow</span>",
  "namespace": {
    "id": 0,
    "text": ""
  },
  "wikibase_item": "Q549037",
  "titles": {
    "canonical": "Stack_Overflow",
    "normalized": "Stack Overflow",
    "display": "<span class=\"mw-page-title-main\">Stack Overflow</span>"
  },
  "pageid": 21721040,
  "thumbnail": {
    "source": "https://upload.wikimedia.org/wikipedia/commons/thumb/a/a5/StackOverflow.com_Top_Questions_Page_Screenshot.png/320px-StackOverflow.com_Top_Questions_Page_Screenshot.png",
    "width": 320,
    "height": 144
  },
  "originalimage": {
    "source": "https://upload.wikimedia.org/wikipedia/commons/a/a5/StackOverflow.com_Top_Questions_Page_Screenshot.png",
    "width": 1920,
    "height": 865
  },
  "lang": "en",
  "dir": "ltr",
  "revision": "1136271608",
  "tid": "a5580980-9fe9-11ed-8bcd-ff7b011c142c",
  "timestamp": "2023-01-29T15:28:54Z",
  "description": "Website hosting questions and answers on a wide range of topics in computer programming",
  "description_source": "local",
  "content_urls": {
    "desktop": {
      "page": "https://en.wikipedia.org/wiki/Stack_Overflow",
      "revisions": "https://en.wikipedia.org/wiki/Stack_Overflow?action=history",
      "edit": "https://en.wikipedia.org/wiki/Stack_Overflow?action=edit",
      "talk": "https://en.wikipedia.org/wiki/Talk:Stack_Overflow"
    },
    "mobile": {
      "page": "https://en.m.wikipedia.org/wiki/Stack_Overflow",
      "revisions": "https://en.m.wikipedia.org/wiki/Special:History/Stack_Overflow",
      "edit": "https://en.m.wikipedia.org/wiki/Stack_Overflow?action=edit",
      "talk": "https://en.m.wikipedia.org/wiki/Talk:Stack_Overflow"
    }
  },
  "extract": "Stack Overflow is a question and answer website for professional and enthusiast programmers. It is the flagship site of the Stack Exchange Network. It was created in 2008 by Jeff Atwood and Joel Spolsky. It features questions and answers on a wide range of topics in computer programming. It was created to be a more open alternative to earlier question and answer websites such as Experts-Exchange. Stack Overflow was sold to Prosus, a Netherlands-based consumer internet conglomerate, on 2 June 2021 for $1.8 billion.",
  "extract_html": "<p><b>Stack Overflow</b> is a question and answer website for professional and enthusiast programmers. It is the flagship site of the Stack Exchange Network. It was created in 2008 by Jeff Atwood and Joel Spolsky. It features questions and answers on a wide range of topics in computer programming. It was created to be a more open alternative to earlier question and answer websites such as Experts-Exchange. Stack Overflow was sold to Prosus, a Netherlands-based consumer internet conglomerate, on 2 June 2021 for $1.8 billion.</p>"
}

By default, it follows redirects (so that /api/rest_v1/page/summary/StackOverflow also works), but this can be disabled with ?redirect=false.

If you need to access the API from another domain you can set the CORS header with &origin= (e.g., &origin=*).

As of 2019: The API seems to return more useful information about the page.
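A minimal client-side sketch for this endpoint (the helper names are illustrative; fetch assumes a browser or Node 18+):

```javascript
// URL for the REST summary endpoint shown above.
function summaryUrl(title) {
  return "https://en.wikipedia.org/api/rest_v1/page/summary/" +
    encodeURIComponent(title);
}

// extract_html is the ready-to-display first paragraph from the
// response above; extract is the plain-text variant.
function pickSummaryHtml(data) {
  return data.extract_html;
}

async function fetchSummaryHtml(title) {
  const res = await fetch(summaryUrl(title));
  if (!res.ok) throw new Error("HTTP " + res.status);
  return pickSummaryHtml(await res.json());
}
```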

卖梦商人 2024-12-28 22:48:13

This code allows you to retrieve the content of the first paragraph of the page in plain text.

Parts of this answer come from here and thus here. See MediaWiki API documentation for more information.

// action=parse: get parsed text
// page=Baseball: from the page Baseball
// format=json: in JSON format
// prop=text: send the text content of the article
// section=0: top content of the page

$url = 'http://en.wikipedia.org/w/api.php?format=json&action=parse&page=Baseball&prop=text&section=0';
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, "TestScript"); // required by wikipedia.org server; use YOUR user agent with YOUR contact information. (otherwise your IP might get blocked)
$c = curl_exec($ch);

$json = json_decode($c);

$content = $json->{'parse'}->{'text'}->{'*'}; // Get the main text content of the query (it's parsed HTML)

// Pattern for first match of a paragraph
$pattern = '#<p>(.*)</p>#Us'; // http://www.phpbuilder.com/board/showthread.php?t=10352690
if(preg_match($pattern, $content, $matches))
{
    // print $matches[0]; // Content of the first paragraph (including wrapping <p> tag)
    print strip_tags($matches[1]); // Content of the first paragraph without the HTML tags.
}

橘寄 2024-12-28 22:48:13

Yes, there is. For example, if you wanted to get the content of the first section of the article Stack Overflow, use a query like this:

http://en.wikipedia.org/w/api.php?format=xml&action=query&prop=revisions&titles=Stack%20Overflow&rvprop=content&rvsection=0&rvparse

The parts mean this:

  • format=xml: Return the result formatted as XML. Other options (like JSON) are available. This does not affect the format of the page content itself, only the enclosing data format.

  • action=query&prop=revisions: Get information about the revisions of the page. Since we don't specify which revision, the latest one is used.

  • titles=Stack%20Overflow: Get information about the page Stack Overflow. It's possible to get the text of more pages in one go, if you separate their names by |.

  • rvprop=content: Return the content (or text) of the revision.

  • rvsection=0: Return only content from section 0.

  • rvparse: Return the content parsed as HTML.

Keep in mind that this returns the whole first section including things like hatnotes (“For other uses …”), infoboxes or images.

There are several libraries available for various languages that make working with the API easier; it may be better for you to use one of them.
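Assembled into a URL, the parameters above look like this (a sketch; the helper name is illustrative):

```javascript
// Build the revisions query from the parameters explained above:
// content of the latest revision, section 0 only, parsed as HTML.
function buildRevisionsUrl(title) {
  const params = new URLSearchParams({
    format: "xml",
    action: "query",
    prop: "revisions",
    titles: title,
    rvprop: "content",
    rvsection: "0",
    rvparse: "", // flag parameter; an empty value is enough
  });
  return "https://en.wikipedia.org/w/api.php?" + params.toString();
}
```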

空‖城人不在 2024-12-28 22:48:13

This is the code I'm using right now for a website I'm making that needs to get the leading paragraphs, summary, and section 0 of off Wikipedia articles, and it's all done within the browser (client-side JavaScript) thanks to the magic of JSONP! --> http://jsfiddle.net/gautamadude/HMJJg/1/

It uses the Wikipedia API to get the leading paragraphs (called section 0) in HTML like so: http://en.wikipedia.org/w/api.php?format=json&action=parse&page=Stack_Overflow&prop=text&section=0&callback=?

It then strips the HTML and other undesired data, giving you a clean string of the article summary. With a little tweaking you can keep a "p" HTML tag around the leading paragraphs, but right now there is just a newline character between them.

Code:

var url = "http://en.wikipedia.org/wiki/Stack_Overflow";
var title = url.split("/").slice(4).join("/");

// Get leading paragraphs (section 0)
$.getJSON("http://en.wikipedia.org/w/api.php?format=json&action=parse&page=" + title + "&prop=text&section=0&callback=?", function (data) {
    for (text in data.parse.text) {
        var text = data.parse.text[text].split("<p>");
        var pText = "";

        for (p in text) {
            // Remove HTML comment
            text[p] = text[p].split("<!--");
            if (text[p].length > 1) {
                text[p][0] = text[p][0].split(/\r\n|\r|\n/);
                text[p][0] = text[p][0][0];
                text[p][0] += "</p> ";
            }
            text[p] = text[p][0];

            // Construct a string from paragraphs
            if (text[p].indexOf("</p>") == text[p].length - 5) {
                var htmlStrip = text[p].replace(/<(?:.|\n)*?>/gm, '') // Remove HTML
                var splitNewline = htmlStrip.split(/\r\n|\r|\n/); //Split on newlines
                for (newline in splitNewline) {
                    if (splitNewline[newline].substring(0, 11) != "Cite error:") {
                        pText += splitNewline[newline];
                        pText += "\n";
                    }
                }
            }
        }
        pText = pText.substring(0, pText.length - 2); // Remove extra newline
        pText = pText.replace(/\[\d+\]/g, ""); // Remove reference tags (e.g. [1], [4], etc.)
        document.getElementById('textarea').value = pText
        document.getElementById('div_text').textContent = pText
    }
});

又爬满兰若 2024-12-28 22:48:13

This URL will return the summary in XML format.

http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=Agra&MaxHits=1

I have created a function to fetch description of a keyword from Wikipedia.

function getDescription($keyword) {
    $url = 'http://lookup.dbpedia.org/api/search.asmx/KeywordSearch?QueryString=' . urlencode($keyword) . '&MaxHits=1';
    $xml = simplexml_load_file($url);
    return $xml->Result->Description;
}

echo getDescription('agra');

浊酒尽余欢 2024-12-28 22:48:13

You can also get content such as the first paragraph via DBPedia which takes Wikipedia content and creates structured information from it (RDF) and makes this available via an API. The DBPedia API is a SPARQL one (RDF-based), but it outputs JSON and it is pretty easy to wrap.

As an example here's a super simple JavaScript library named WikipediaJS that can extract structured content including a summary first paragraph.

You can read more about it in this blog post: WikipediaJS - accessing Wikipedia article data through Javascript

The JavaScript library code can be found in wikipedia.js.

你列表最软的妹 2024-12-28 22:48:13

The abstract.xml.gz dump sounds like the one you want.

阪姬 2024-12-28 22:48:13

If you are just looking for the text, which you can then split up, but don't want to use the API, take a look at en.wikipedia.org/w/index.php?title=Elephant&action=raw.

寒尘 2024-12-28 22:48:13

My approach was as follows (in PHP):

$url = "whatever_you_need";

$html = file_get_contents('https://en.wikipedia.org/w/api.php?action=opensearch&search='.$url);
$utf8html = html_entity_decode(preg_replace("/U\+([0-9A-F]{4})/", "&#x\\1;", $html), ENT_NOQUOTES, 'UTF-8');

$utf8html might need further cleaning, but that's basically it.

愿与i 2024-12-28 22:48:13

I tried Michael Rapadas' and @Krinkle's solutions, but in my case I had trouble finding some articles depending on the capitalization. Like here:

https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&exsentences=1&explaintext=&titles=Led%20zeppelin

Note that I truncated the response with exsentences=1.

Apparently "title normalization" was not working correctly:

Title normalization converts page titles to their canonical form. This means capitalizing the first character, replacing underscores with spaces, and changing namespace to the localized form defined for that wiki. Title normalization is done automatically, regardless of which query modules are used. However, any trailing line breaks in page titles (\n) will cause odd behavior and they should be stripped out first.

I know I could have sorted out the capitalization issue easily, but there was also the inconvenience of having to cast the object to an array.

Because I just really wanted the very first paragraph of a well-known and defined search (no risk of fetching info from other articles), I did it like this:

https://en.wikipedia.org/w/api.php?action=opensearch&search=led%20zeppelin&limit=1&format=json

Note that in this case I did the truncation with limit=1.

This way:

  1. I can access the response data very easily.
  2. The response is quite small.

But we still have to be careful with the capitalization of our search.

More information: https://www.mediawiki.org/wiki/API:Opensearch
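The opensearch response is a four-element array: the search term, then parallel arrays of matching titles, descriptions, and URLs. A sketch of reading the first hit (the sample data in the test is illustrative):

```javascript
// Opensearch returns [searchTerm, titles, descriptions, urls].
// Pull the first entry of each parallel array into one object.
function firstHit(response) {
  const [, titles, descriptions, urls] = response;
  return {
    title: titles[0],
    description: descriptions[0],
    url: urls[0],
  };
}
```

This is why the response is so easy to access: no page-ID keys to iterate over, just fixed array positions.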

窗影残 2024-12-28 22:48:13

There's a simpler way now with Wikimedia Enterprise, which provides an abstract field (https://enterprise.wikimedia.com/docs/data-dictionary/#abstract) in the v2/articles endpoint (https://enterprise.wikimedia.com/docs/on-demand/).
