Removing JavaScript from URLs

Posted on 2024-10-14 23:27:02

I'm writing a server-side script that replaces all URLs in a body of text with <a> tag versions (so they can be clicked).

How can I make sure that the URLs I convert do not contain any XSS-style JavaScript?

I'm currently filtering for "javascript:" in the string, but I suspect that is not sufficient.

Comments (3)

爱冒险 2024-10-21 23:27:02

Nearly every modern server-side language has an implementation of Markdown or another lightweight markup language, and these implementations can turn URLs into clickable links.

Unless you have a lot of time to spend researching this topic and implementing the script yourself, I'd suggest finding the best Markdown implementation for your language and either studying its code or simply using it in yours.

Markdown is usually shipped as a library; some implementations let you configure what they process and what they ignore. In your case you want to process URLs and ignore every other element.

Here's an (incomplete) list of solid Markdown implementations for different languages:

与他有关 2024-10-21 23:27:02

You need to attribute-encode the URLs.
You should also make sure that they start with http:// or https://.
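
As a rough illustration of those two steps, here is a minimal PHP sketch. The function names safe_href and linkify are hypothetical, and the URL pattern is a deliberately simple assumption rather than a full URL grammar. It only links strings that start with http:// or https://, and it runs every piece of output through htmlspecialchars so that quotes and angle brackets cannot break out of the href attribute.

```php
<?php
// Sketch only: safe_href() and linkify() are hypothetical helper names,
// and the URL pattern below is a simplifying assumption.

/**
 * Return the URL if it starts with http:// or https://, otherwise null.
 */
function safe_href(string $url): ?string
{
    $url = trim($url);
    // Anchored allowlist check on the scheme (case-insensitive).
    if (!preg_match('#^https?://#i', $url)) {
        return null;
    }
    return $url;
}

/**
 * HTML-encode plain text and wrap each http(s) URL in an <a> tag.
 */
function linkify(string $text): string
{
    // Split into URL and non-URL pieces so each piece is encoded exactly once.
    // The URL pattern stops at whitespace, angle brackets, and quotes.
    $parts = preg_split(
        '#(https?://[^\s<>"\']+)#i',
        $text,
        -1,
        PREG_SPLIT_DELIM_CAPTURE
    );

    $out = '';
    foreach ($parts as $i => $part) {
        // Attribute-safe encoding: &, <, >, " and ' all become entities.
        $encoded = htmlspecialchars($part, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

        // With one capture group, odd indexes hold the captured URLs.
        if ($i % 2 === 1 && safe_href($part) !== null) {
            $out .= '<a href="' . $encoded . '">' . $encoded . '</a>';
        } else {
            $out .= $encoded;
        }
    }
    return $out;
}
```

Because the check is an allowlist on the scheme rather than a search for "javascript:", variants such as mixed case or embedded control characters never reach the href at all; they simply stay as encoded plain text.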

倒带 2024-10-21 23:27:02

This is taken from the Kohana framework's XSS filtering. It's not a complete answer, but it might get you on the way.

// Remove javascript: and vbscript: protocols
$str = preg_replace('#([a-z]*)[\x00-\x20]*=[\x00-\x20]*([`\'"]*)[\x00-\x20]*j[\x00-\x20]*a[\x00-\x20]*v[\x00-\x20]*a[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2nojavascript...', $str);
$str = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*v[\x00-\x20]*b[\x00-\x20]*s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:#iu', '$1=$2novbscript...', $str);
$str = preg_replace('#([a-z]*)[\x00-\x20]*=([\'"]*)[\x00-\x20]*-moz-binding[\x00-\x20]*:#u', '$1=$2nomozbinding...', $str);

// Only works in IE: <span style="width: expression(alert('Ping!'));"></span>
$str = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?expression[\x00-\x20]*\([^>]*+>#is', '$1>', $str);
$str = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?behaviour[\x00-\x20]*\([^>]*+>#is', '$1>', $str);
$str = preg_replace('#(<[^>]+?)style[\x00-\x20]*=[\x00-\x20]*[`\'"]*.*?s[\x00-\x20]*c[\x00-\x20]*r[\x00-\x20]*i[\x00-\x20]*p[\x00-\x20]*t[\x00-\x20]*:*[^>]*+>#ius', '$1>', $str);