MediaWiki 1.16.0 - 在 phpmyadmin 中选择当前文章

发布于 2024-11-14 11:36:24 字数 407 浏览 5 评论 0原文

我正在尝试获取 Mediawiki 1.16.0 中当前/最新文章的所有文章。我需要在 phpMyadmin 中执行此操作并从这些结果中转储。

我的 SQL:

SELECT 
  page.page_title, page.page_latest
  , revision.rev_id, revision.rev_text_id
  , text.old_id, text.old_text 
FROM page, revision, text 
WHERE rev_id = page_latest AND rev_text_id = old_id

我也得到了图像名称,但不是问题。我感觉上面这个SQL并没有获取最新版本的文章。

如果有一种方法可以在结果中不显示图像名称和重定向,它也会有所帮助。

I'm trying to get all articles that are the current/latest articles in Mediawiki 1.16.0. I need to do this in phpMyadmin and make a dump from those results.

my SQL:

SELECT 
  page.page_title, page.page_latest
  , revision.rev_id, revision.rev_text_id
  , text.old_id, text.old_text 
FROM page, revision, text 
WHERE rev_id = page_latest AND rev_text_id = old_id

I get the image names also but not a problem. I feel that this SQL above is not getting the latest version of the articles.

If there is a way to not show image names and redirects in the results it would also help.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(1

向日葵 2024-11-21 11:36:25

首先,请不要使用那种丑陋的隐式连接语法。这是令人困惑且容易出错的。

将其更改为:

SELECT 
  page.page_title, page.page_latest
  , revision.rev_id, revision.rev_text_id
  , text.old_id, text.old_text 
FROM page
INNER JOIN revision ON (rev_id = page_latest)
INNER JOIN text ON (rev_text_id = old_id)

现在您可以看到原因了:它正在获取所有页面。没有 where 子句,只有 join 子句。

这是数据库布局: http://upload.wikimedia.org/wikipedia/commons/b/b7/MediaWiki_database_schema_1-17_%28r82044%29.png

以下是各个表中字段的说明:
http://www.mediawiki.org/wiki/Manual:Database_layout

修订query

SELECT 
  p.page_title, p.page_latest
  , MAX(revision.rev_id) as rev_id, revision.rev_text_id
  , text.old_id, text.old_text 
FROM page p
INNER JOIN revision r ON (r.rev_id = p.page_latest)
INNER JOIN `text` t ON (r.rev_text_id = t.old_id)
WHERE p.page_is_redirect = 0 AND p.page_namespace <> 6  /*NS_IMAGE = 6*/
GROUP BY p.page_latest 
ORDER BY p.page_title

这会过滤掉重定向并排除 namespace = ns_image 的页面。

但我不是 100% 确定,因为我没有 MediaWiki 来测试。

First of all please don't use that ugly implicit join syntax. It's confusing and error-prone.

Change it to this:

SELECT 
  page.page_title, page.page_latest
  , revision.rev_id, revision.rev_text_id
  , text.old_id, text.old_text 
FROM page
INNER JOIN revision ON (rev_id = page_latest)
INNER JOIN text ON (rev_text_id = old_id)

Now you can see why: it's getting all pages. There is no where clause, there are just join clauses.

This is the DB layout: http://upload.wikimedia.org/wikipedia/commons/b/b7/MediaWiki_database_schema_1-17_%28r82044%29.png

And here are the description of the fields in the various tables:
http://www.mediawiki.org/wiki/Manual:Database_layout

Revised query

SELECT 
  p.page_title, p.page_latest
  , MAX(revision.rev_id) as rev_id, revision.rev_text_id
  , text.old_id, text.old_text 
FROM page p
INNER JOIN revision r ON (r.rev_id = p.page_latest)
INNER JOIN `text` t ON (r.rev_text_id = t.old_id)
WHERE p.page_is_redirect = 0 AND p.page_namespace <> 6  /*NS_IMAGE = 6*/
GROUP BY p.page_latest 
ORDER BY p.page_title

This filters out the redirects and excludes the pages where namespace = ns_image.

I'm not 100% sure though 'cause I don't have MediaWiki to test.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文