Mediawiki: render 'off-database' wiki text to HTML from PHP?

Posted on 2024-11-19 23:23:38

The situation is, I have a private wiki, say at http://mysite.com/wiki, which is behind a password. What I'd like to do, is have a separate location on the same server, that could read arbitrary text files with wiki text (code), and use the particular engine of http://mysite.com/wiki to render HTML from it (because of installed templates/plugins).

As example, I would have a /tmppub directory on http://mysite.com; and in it, I'd have a text file with wiki text source code in it, say Example.wiki, and a process.php page; then I'd call:

http://mysite.com/tmppub/process.php?file=Example.wiki

... where process.php would read the file Example.wiki in the same directory, pass the contents somehow to the ../wiki installation, and retrieve the HTML output and display it.

I guess, what I want is similar to the example in Mediawiki2HTML - gwtwiki - How to convert Mediawiki text to HTML - Java Wikipedia API (Bliki engine) - except this Mediawiki2HTML is in Java (I'd want PHP) and possibly uses internal rendering engine (I'd want an already existing specific installation of Mediawiki).

The thing is, I can cook me up a PHP script which will read the file, handle the password of /wiki, and pass GET and POST variables - except I'm not sure how I would address the Mediawiki installation:

  • I could try to fake a call to &action=edit (e.g. Editing Wikipedia:Sandbox) and ask for a preview; but that would return the edit buttons and text fields, which I'd have to manually clean - no like
  • I could try to address the API, but as I can see in API:Parsing wikitext - MediaWiki, it will only work with pages already in the Mediawiki installation - not with pages off of it.

Finally, I'd like to obtain just the raw HTML of the content (without HTML for sidebars and such), as when using action parameter render (example).

 

Does anyone know if there is already such a PHP application available - and if not, what would be the proper way to address the Mediawiki installation, to obtain a 'raw' HTML rendering of the wiki text source?

Thanks in advance for any answers,
Cheers!

Comments (3)

作死小能手 2024-11-26 23:23:38

You can actually use the API even to parse custom wikitext, using the parse action. (The title parameter is maybe a bit misleading, but it's really just a pointer for the parser when using, for example, {{PAGENAME}}.) To parse an existing page, the render action is used.
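As a rough sketch of the API route, something like the following could post arbitrary wikitext to the parse action and pull the HTML out of the JSON reply. The function names, the api.php URL and the use of HTTP authentication via CURLOPT_USERPWD are assumptions for illustration, not something from the wiki in question; it also assumes PHP 7.1+ with the curl extension, and it has not been tested against any particular MediaWiki version.

```php
<?php
// Sketch only: endpoint URL, credentials and function names are assumptions.

/** Build the POST fields for an action=parse request on arbitrary wikitext. */
function buildParseRequest(string $wikitext, string $title = 'Example'): array
{
    return [
        'action' => 'parse',
        'format' => 'json',
        'prop'   => 'text',
        'title'  => $title,    // only a context for things like {{PAGENAME}}
        'text'   => $wikitext,
    ];
}

/** Pull the rendered HTML out of a decoded action=parse response. */
function extractParsedHtml(array $response): ?string
{
    $text = $response['parse']['text'] ?? null;
    if (is_array($text)) {
        // The older JSON layout nests the HTML under the "*" key.
        return $text['*'] ?? null;
    }
    return is_string($text) ? $text : null;
}

/** POST the wikitext to api.php and return the HTML, or null on failure. */
function renderViaApi(string $apiUrl, string $wikitext, ?string $httpAuth = null): ?string
{
    $ch = curl_init($apiUrl);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query(buildParseRequest($wikitext)));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    if ($httpAuth !== null) {
        curl_setopt($ch, CURLOPT_USERPWD, $httpAuth); // "user:password"
    }
    $body = curl_exec($ch);
    if ($body === false) {
        return null;
    }
    $decoded = json_decode($body, true);
    return is_array($decoded) ? extractParsedHtml($decoded) : null;
}
```

A call might then look like echo renderViaApi('http://mysite.com/wiki/api.php', file_get_contents('Example.wiki'), 'user:password'); where the credentials are placeholders. If the wiki uses MediaWiki's own login rather than HTTP authentication, a login step with cookies would be needed first, which this sketch does not cover.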

If the authentication is HTTP-based and you have access to the MediaWiki installation, you can abuse the code that is used for maintenance scripts to load the important stuff and parse on top of that. (This is maybe a little dirty, though.) The following code is taken from includes/api/ApiParse.php and edited a little (of course, adjust the file path to your needs):

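// Bootstrap MediaWiki the way maintenance scripts do (adjust the path to your install).
require_once dirname( __FILE__ ) . '/w/maintenance/commandLine.inc';

// Wikitext to render; it does not have to exist as a page in the wiki.
$text = "* [[foo]]\n* [[Example|bar]]\n* [http://example.com/ an outside link]";
// The title only gives the parser a context, e.g. for {{PAGENAME}}.
$titleObj = Title::newFromText( 'Example' );
$parserOptions = new ParserOptions();
$parserOptions->setTidy( true );

$parserOutput = $wgParser->parse( $text, $titleObj, $parserOptions );
$parsedText = $parserOutput->getText();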

The parsed HTML is now in the $parsedText variable. If you need to perform a pre-save transform on the text (expand {{subst:}}s, tildes to signatures, etc.), take a look at the ApiParse.php file for reference.

疯狂的代价 2024-11-26 23:23:38

There are many wiki parsers available - http://www.mediawiki.org/wiki/Alternative_parsers

You can choose any one of them. All you need to do is put a simple authentication wrapper around them and you could then use it as a service.

好听的两个字的网名 2024-11-26 23:23:38

Thanks @Matěj Grabovský for the answer; however, I tripped a couple of times while I got it to work, so here's a writeup.

First of all, I just saved the code from the answer as mwparse.php, and tried to call it from a web browser - the answer: "This script must be run from the command line". Ah well :) This turns out to be a requirement for using commandLine.inc.

So, I log in to the server shell, and I try to execute from CLI, and I get:

$ cd /path/to/mwparse/
$ php -f mwparse.php
...
Exception caught inside exception handler: exception 'DBQueryError' with message 'A database error has occurred
Query: SELECT /* MessageCache::loadFromDB 127.0.0.1 * /  page_title  FROM MWPREFIX_page  WHERE page_is_redirect = '0' AND page_namespace = '8' AND (page_title not like '%%/%%') AND (page_len > 10000)
Function: doQuery
Error: HY000 no such table: MWPREFIX_page
' in /path/to/MyWiki/includes/db/Database.php:606
Stack trace:
....

... which is bullcrap, since the MyWiki installation works when called from a browser - and I also opened the database in sqlitebrowser to confirm that, indeed, the table MWPREFIX_page exists. (What is referred to as /w in Matěj's answer, I call /MyWiki here.)

So after an attempt to install xdebug and debug the script using that (which failed to work with Mediawiki for me, seemingly because memory kept getting exhausted), I simply tried to run this command:

php -r "require_once dirname( __FILE__ ) . 'PREFIX/maintenance/commandLine.inc';"

... in different directories, with appropriate PREFIX. Turns out, it is only possible to execute this line in the root Mediawiki installation - that is, in this case, in the MyWiki folder:

$ cd /path/to/MyWiki
$ php -r "require_once dirname( __FILE__ ) . '/maintenance/commandLine.inc';"
$

Knowing this, I modified Matěj's script into:

<?php
//~ error_reporting(E_ALL);
//~ ini_set('display_errors', '1');

// commandLine.inc only loads correctly from the MediaWiki root, so switch there first.
chdir('../MyWiki');
//echo getcwd() . "\n"; // for debug check

require_once './maintenance/commandLine.inc';

$text = "* [[foo]]\n* [[Example|bar]]\n* [http://example.com/ an outside link]";

$titleObj = Title::newFromText( 'Example' );
$parserOptions = new ParserOptions();
$parserOptions->setTidy( true );

$parserOutput = $wgParser->parse( $text, $titleObj, $parserOptions );
$parsedText = $parserOutput->getText();

echo $parsedText;
?>

Now I can run the script from its own directory; however, the following:

PHP Notice:  Undefined index: SERVER_NAME in /path/to/MyWiki/includes/Linker.php on line 888
Notice: Undefined index: SERVER_NAME in /path/to/MyWiki/includes/Linker.php on line 888

... can be seen in the output. The plain "Notice" line shows up in the output if error_reporting is enabled; the "PHP Notice" line actually goes to stderr. Thus, to get just the output from the script, in the script's directory I would call:

php -f mwparse.php 2>/dev/null

To get this online, now I'd just have to write a PHP page which calls this script in CLI (possibly using exec), which shouldn't be a problem (except that the require_once ... commandLine.inc does take a couple of seconds to execute, so it will be somewhat of a performance hit).
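A minimal sketch of such a wrapper page might look like the following. It assumes, hypothetically, that mwparse.php has been changed to read the wikitext from the file path it receives as its first argument (the script above still uses a hard-coded $text), that a php CLI binary is on the web server's PATH, and that exec() is permitted; the helper names are made up for illustration.

```php
<?php
// Sketch only: paths, file layout and the argument convention are assumptions.

/** Accept only plain "Name.wiki" file names, so no path traversal is possible. */
function safeWikiFile(string $name): ?string
{
    return preg_match('/^[A-Za-z0-9_-]+\.wiki$/', $name) === 1 ? $name : null;
}

/** Build the shell command that runs the CLI parser script on one file. */
function buildParserCommand(string $script, string $wikiFile): string
{
    // stderr is discarded, as with "2>/dev/null" on the command line.
    return 'php -f ' . escapeshellarg($script) . ' -- ' . escapeshellarg($wikiFile) . ' 2>/dev/null';
}

// Web entry point, e.g. process.php?file=Example.wiki
if (PHP_SAPI !== 'cli' && isset($_GET['file'])) {
    $file = safeWikiFile((string) $_GET['file']);
    if ($file === null || !is_file(__DIR__ . '/' . $file)) {
        http_response_code(404);
        exit('No such file');
    }
    exec(buildParserCommand(__DIR__ . '/mwparse.php', __DIR__ . '/' . $file), $lines, $status);
    echo $status === 0 ? implode("\n", $lines) : 'Rendering failed';
}
```

Restricting the file parameter to a simple name pattern keeps a request like ?file=../wiki/LocalSettings.php from ever reaching the shell, and escapeshellarg() guards the command line itself.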

Well, glad to see this solved - thanks again,
Cheers!

 

PS: As I spent quite some time on that, I will be dumping somewhat of a command line log (mostly related to installation of xdebug) below.

from web: This script must be run from the command line

from remote terminal:

Exception caught inside exception handler: exception 'DBQueryError' with message 'A database error has occurred
Query: SELECT /* MessageCache::loadFromDB 127.0.0.1 * /  page_title  FROM MWPREFIX_page  WHERE page_is_redirect = '0' AND page_namespace = '8' AND (page_title not like '%%/%%') AND (page_len > 10000)
Function: doQuery
Error: HY000 no such table: MWPREFIX_page
' in /path/to/MyWiki/includes/db/Database.php:606
Stack trace:
....

PHP Deprecated:  Comments starting with '#' are deprecated in /etc/php5/cli/conf.d/mcrypt.ini on line 1 in Unknown on line 0
sdf

MediaWiki internal error.

Original exception: exception 'DBQueryError' with message 'A database error has occurred
Query: SELECT /* MediaWikiBagOStuff::_doquery 127.0.0.1 * / value,exptime FROM PREFIX_objectcache WHERE keyname='wikidb-MWPREFIX_:messages:en'
Function: doQuery
Error: HY000 no such table: MWPREFIX_objectcache
' in /path/to/MyWiki/includes/db/Database.php:606

http://www.apaddedcell.com/easy-php-debugging-ubuntu-using-xdebug-and-vim
https://stackoverflow.com/questions/1947395/how-can-i-debug-a-php-cli-script-with-xdebug

sudo apt-get install php-pear # pecl
sudo pecl install xdebug-beta # sh: phpize: not found
sudo apt-get install php5-dev # phpize; The following extra packages will be installed:   autoconf automake autotools-dev binutils gcc gcc-4.4 libc-dev-bin libc6-dev   libltdl-dev libssl-dev libtool linux-libc-dev m4 manpages-dev shtool   zlib1g-dev
sudo pecl install xdebug-beta # Installing '/usr/lib/php5/20090626+lfs/xdebug.so'

sudo nano /etc/php5/apache2/php.ini # zend_extension=/usr/lib/php5/20090626+lfs/xdebug.so and paste

sudo service apache2 restart # sudo /etc/init.d/apache2 restart

wget http://xdebug.org/files/xdebug-2.1.1.tgz # for debugclient
tar xzvf xdebug-2.1.1.tgz
rm package*.xml

cd xdebug-2.1.1/
$ cd debugclient
$ ./configure --with-libedit # configure: error: "libedit was not found on your system."
sudo apt-get install libedit2 # libedit2 is already the newest version.
sudo apt-get install libedit-dev # The following extra packages will be installed:   libbsd-dev libncurses5-dev
$ ./configure --with-libedit
$ make
# make install
./debugclient # Waiting for debug server to connect.

# open another remote terminal
export XDEBUG_CONFIG="idekey=session_name"
php mwparse.php
# flies by

# mediawiki started crashing upon adding ?XDEBUG_SESSION_START=1 to url, restart server

# now different errors:
# Deprecated: Call-time pass-by-reference has been deprecated in /path/to/MyWiki/includes/Article.php on line 1658 (http://www.emmajane.net/php-what-call-time-pass-reference-story)
# Notice: Undefined variable: wgBibPath in /path/to/MyWiki/extensions/Bibwiki/Bibwiki.i18n.php on line 116
# Fatal error: Allowed memory size of 20971520 bytes exhausted (tried to allocate 16 bytes) in /path/to/MyWiki/includes/GlobalFunctions.php on line 337

http://www.mediawiki.org/wiki/Manual:Errors_and_symptoms#Fatal_error:_Allowed_memory_size_of_nnnnnnn_bytes_exhausted_.28tried_to_allocate_nnnnnnnn_bytes.29

sudo nano /etc/php5/apache2/php.ini # comment out xdebug stuff
sudo service apache2 restart # now mediawiki works fine...

 

EDIT notes:

  • Note that even if you set $wgDefaultUserOptions['editsection'] = false; in your LocalSettings.php, it has no effect on the above script (although it does take effect in Mediawiki proper). If you want to disable the section edit links in the script's rendering, the script must contain $parserOptions->setEditSection( false ); (see MediaWiki: ParserOptions Class).
  • On the production server, it seems I have no permission to run PHP: exec() (or rather, PHP: passthru()), or maybe no permission to run php-cli, so I cannot use the above solution verbatim, because commandLine.inc demands a terminal. However, it's possible to make a copy of commandLine.inc and 'hack' it with $argv = array();unset($_SERVER);, after which the above parser may work fully from a webserver context (however, I'm not sure whether this copy of commandLine.inc may represent a security risk).