"Safe" Markdown processor for PHP?

Posted 2024-07-22 11:05:09

Is there a PHP implementation of markdown suitable for using in public comments?

Basically it should only allow a subset of the markdown syntax (bold, italic, links, block-quotes, code-blocks and lists), and strip out all inline HTML (or possibly escape it?)

I guess one option is to use the normal markdown parser, and run the output through an HTML sanitiser, but is there a better way of doing this..?

We're using PHP Markdown Extra for the rest of the site, so we'd already have to use a secondary parser (the non-"Extra" version, since things like footnote support are unnecessary). It also seems nicer to parse only the *bold* text and have everything else escaped to &lt;a href="etc"&gt;, than to generate <b>bold</b> text and then try to strip the bits we don't want.

Also, on a related note, we're using the WMD control for the "main" site, but for comments, what other options are there? WMD's javascript preview is nice, but it would need the same "neutering" as the PHP markdown processor (it can't display images and so on, otherwise someone will submit and their working markdown will "break")

Currently my plan is to use the PHP-markdown -> HTML sanitiser method, and edit WMD to remove the image/heading syntax from showdown.js, but it seems like this has been done countless times before.

Basically:

  • Is there a "safe" markdown implementation in PHP?
  • Is there a HTML/javascript markdown editor which could have the same options easily disabled?

Update: I ended up simply running the markdown() output through HTML Purifier.

This way the Markdown rendering is separate from output sanitisation, which is much simpler (two mostly-unmodified code bases), more secure (you're not trying to do both rendering and sanitisation at once), and more flexible (you can have multiple sanitisation levels, say a more lax configuration for trusted content, and a much more stringent version for public comments).
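
For illustration, the render-then-sanitise pipeline described above might look roughly like the sketch below. It assumes PHP Markdown (which provides Markdown()) and HTML Purifier are already loaded; the function names purifier_settings() and render_markdown_safely() and the specific tag lists are hypothetical choices, not something taken from either library.

```php
<?php
// Sketch only: assumes PHP Markdown (providing Markdown()) and HTML Purifier
// are already loaded, e.g. via their own include files or an autoloader.

/**
 * Purifier directives per trust level. The tag lists are illustrative
 * assumptions, not a recommendation from either library.
 */
function purifier_settings(bool $trusted): array
{
    // Subset wanted for public comments: emphasis, links, quotes, code, lists.
    $allowed = 'p,br,strong,em,a[href],blockquote,pre,code,ul,ol,li';
    if ($trusted) {
        // Trusted authors additionally get images and headings.
        $allowed .= ',img[src|alt],h1,h2,h3';
    }
    return [
        'HTML.Allowed'       => $allowed,
        'URI.AllowedSchemes' => ['http' => true, 'https' => true, 'mailto' => true],
    ];
}

/** Render Markdown first, then sanitise the resulting HTML as a separate step. */
function render_markdown_safely(string $text, bool $trusted = false): string
{
    $config = HTMLPurifier_Config::createDefault();
    foreach (purifier_settings($trusted) as $directive => $value) {
        $config->set($directive, $value);
    }
    $purifier = new HTMLPurifier($config);

    return $purifier->purify(Markdown($text));
}
```

Keeping the directives in one small function makes the "lax for trusted content, strict for comments" split a matter of editing a whitelist, with both libraries left unmodified.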


Comments (3)

枯叶蝶 2024-07-29 11:05:09

PHP Markdown has a sanitizer option, but it doesn't appear to be advertised anywhere. Take a look at the top of the Markdown_Parser class in markdown.php (starts on line 191 in version 1.0.1m). We're interested in lines 209-211:

# Change to `true` to disallow markup or entities.
var $no_markup = false;
var $no_entities = false;

If you change those to true, markup and entities, respectively, should be escaped rather than inserted verbatim. There doesn't appear to be any built-in way to change those (e.g., via the constructor), but you can always add one:

function do_markdown($text, $safe=false) {
    $parser = new Markdown_Parser;
    if ($safe) {
        $parser->no_markup = true;
        $parser->no_entities = true;
    }
    return $parser->transform($text);
}

Note that the above function creates a new parser on every run rather than caching it like the provided Markdown function (lines 43-56) does, so it might be a bit on the slow side.
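
If the per-call parser construction turns out to matter, one possible variation is to keep one parser per mode in a static array, similar in spirit to what the bundled Markdown function does. This is only a sketch: do_markdown_cached() is a hypothetical name, and it assumes markdown.php is loaded and defines Markdown_Parser with the two properties quoted above.

```php
<?php
// Assumes markdown.php from PHP Markdown is loaded, which defines
// Markdown_Parser with the $no_markup / $no_entities properties quoted above.

function do_markdown_cached($text, $safe = false)
{
    // One parser per mode, kept for the lifetime of the request.
    static $parsers = array();

    $key = $safe ? 'safe' : 'raw';
    if (!isset($parsers[$key])) {
        $parser = new Markdown_Parser;
        if ($safe) {
            // Escape inline markup and entities instead of passing them through.
            $parser->no_markup = true;
            $parser->no_entities = true;
        }
        $parsers[$key] = $parser;
    }
    return $parsers[$key]->transform($text);
}
```

Calling do_markdown_cached($comment, true) for public comments and do_markdown_cached($article) for trusted content would then create at most two parser instances per request.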

仅此而已 2024-07-29 11:05:09

JavaScript Markdown Editor Hypothesis:

  • Use a JavaScript-driven Markdown Editor, e.g., based on showdown
  • Remove all icons and visual clues from the Toolbar for unwanted items
  • Set up a JavaScript filter to clean-up unwanted markup on submission
  • Test and harden all JavaScript changes and filters locally on your computer
  • Mirror those filters in the PHP submission script, to catch the same markup on the server side.
  • Remove all references to unwanted items from Help/Tutorials

I've created a Markdown editor in JavaScript, but it has enhanced features. That took a big chunk of time and SVN revisions. But I don't think it would be that tough to alter a Markdown editor to limit the HTML allowed.
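
As a rough idea of what the server-side mirror of such a filter could look like, the sketch below neutralises image and ATX-style heading syntax in the raw Markdown before it is rendered. The function name strip_disallowed_markdown() is hypothetical, and the regexes are deliberately simple: they do not handle Setext headings (underlined with === or ---), and they will also rewrite "![" inside code blocks.

```php
<?php
/**
 * Neutralise Markdown syntax that public comments should not use.
 * Sketch under stated assumptions; not a complete Markdown-aware filter.
 */
function strip_disallowed_markdown(string $text): string
{
    // "# Heading" .. "###### Heading": drop the leading hashes so the line
    // becomes ordinary paragraph text. Lines indented 4+ spaces (code) are skipped.
    $text = preg_replace('/^ {0,3}#{1,6}[ \t]+/m', '', $text);

    // "![alt](url)" or "![alt][ref]": drop the "!" so the image degrades
    // to a normal link, which is still in the allowed subset.
    $text = preg_replace('/!\[/', '[', $text);

    return $text;
}

echo strip_disallowed_markdown("# Title\n![cat](http://example.com/cat.png)");
```

The same two rules would need to be applied in the JavaScript preview so that what the user sees matches what the server stores.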

溺渁∝ 2024-07-29 11:05:09

How about running htmlspecialchars on the user entered input, before processing it through markdown? It should escape anything dangerous, but leave everything that markdown understands.

I'm trying to think of a case where this wouldn't work but can't think of anything off hand.
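
Two inputs that may be worth testing with this approach are shown below. The sketch only demonstrates what htmlspecialchars() itself returns; how a given Markdown parser then treats the result (in particular, whether it emits the javascript: link) is an assumption that should be checked against the parser actually in use.

```php
<?php
// What htmlspecialchars() does to two typical comment inputs
// before any Markdown parser sees them.

// 1. Blockquotes: the leading ">" is escaped, so the Markdown parser
//    no longer sees a blockquote marker.
$quote = htmlspecialchars("> quoted text");

// 2. Links: the string contains none of & < > " ' and passes through
//    unchanged, so a javascript: URL would still reach the parser.
$link = htmlspecialchars("[click me](javascript:alert(1))");

echo $quote, "\n", $link, "\n";
```

So escaping first can break the blockquote syntax the question wants to keep, and on its own it does nothing about link targets.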
