php - Sanitizing user input with preg_replace_callback and ord()?

Posted 2024-12-11 12:46:43 · 1128 characters · 6 views · 0 comments

I have a forum-style text box, and I would like to sanitize the user input to stop potential XSS and code injection. I have seen htmlentities used, but others have said that the &, #, % and : characters need to be encoded as well, and the more I look, the more potentially dangerous characters pop up. Whitelisting is problematic because there are many valid text characters beyond a-zA-Z0-9. I have come up with this code. Will it stop attacks and be secure? Is there any reason not to use it, or is there a better way?

function replaceHTML ($match) {
    return "&#" . ord ($match[0]) . ";";
}

$clean = preg_replace_callback ( "/[^ a-zA-Z0-9]/", "replaceHTML", $userInput );

EDIT:_____________________________
I could of course be wrong, but it is my understanding that htmlentities only replaces & < > " (and ' if ENT_QUOTES is turned on). This is probably enough to stop most attacks (and frankly probably more than enough for my low-traffic site). In my obsessive attention to detail, however, I dug further. A book I have warns to also encode # and % for "shutting down hex attacks". Two websites I found warned against allowing : and --. It's all rather confusing to me, and it led me to explore converting all non-alphanumeric characters. If htmlentities does this already then great, but it does not seem to. Here are the results from the code I ran, copied after clicking "view source" in Firefox.

original (random characters to test):
5:gjla#''*&$!j-l:4

preg_replace_callback:
<b>5:</b>gjla<hi>#''*&$!j-l:4

htmlentities (w/ ENT_QUOTES):
<b>5:</b>gjla<hi>#''*&$!j-l:4

htmlentities appears to not be encoding those other characters like :
Sorry for the wall of text. Is this just me being paranoid?
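
The comparison described in this edit can be reproduced with a short script. This is only a sketch: the sample input is an assumption (it includes the markup that shows up in the pasted results), and the 'UTF-8' charset argument is not part of the original code.

```php
<?php
// Assumed sample input, including the tags visible in the pasted results.
$userInput = "<b>5:</b>gjla<hi>#''*&\$!j-l:4";

// The numeric-entity callback from the question.
function replaceHTML($match) {
    return "&#" . ord($match[0]) . ";";
}

// Encodes every character that is not a space, letter, or digit.
$viaCallback = preg_replace_callback("/[^ a-zA-Z0-9]/", "replaceHTML", $userInput);

// Encodes only characters that have a named entity, plus both quote styles.
$viaEntities = htmlentities($userInput, ENT_QUOTES, 'UTF-8');

echo $viaCallback, "\n";
// &#60;b&#62;5&#58;&#60;&#47;b&#62;gjla&#60;hi&#62;&#35;&#39;&#39;&#42;&#38;&#36;&#33;j&#45;l&#58;4
echo $viaEntities, "\n";
// &lt;b&gt;5:&lt;/b&gt;gjla&lt;hi&gt;#&#039;&#039;*&amp;$!j-l:4
```

In the raw output, htmlentities leaves :, #, *, $, ! and - untouched, which matches the observation above. Those characters have no special meaning inside HTML text or a quoted attribute value.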

EDIT #2: ___________


浅黛梨妆こ 2024-12-18 12:46:43

All you need to do to stop XSS attacks is use htmlspecialchars().
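
A minimal sketch of what that might look like in practice, assuming the text is stored as entered and escaped at the point where it is written into the page. The helper name escapeHtml and the sample input are made up for illustration; the flags and charset are a common choice, not something this answer specifies.

```php
<?php
// Hypothetical helper: escapes a value for HTML text or a quoted attribute value.
function escapeHtml(string $value): string {
    // ENT_QUOTES also converts single quotes; ENT_SUBSTITUTE replaces invalid
    // byte sequences instead of returning an empty string.
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// Illustrative input only.
$userInput = '<script>alert("xss")</script> & \'quotes\'';

echo '<p>', escapeHtml($userInput), '</p>', "\n";
// <p>&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt; &amp; &#039;quotes&#039;</p>
```

Note that this covers HTML text and quoted attribute values only. User input placed in a URL (for example an href) or inside a script block needs context-specific handling that htmlspecialchars() alone does not provide.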


深海不蓝 2024-12-18 12:46:43

This is exactly what htmlentities already does:

http://codepad.viper-7.com/NDZMa3

It will convert (spaced out to prevent Stack Overflow from double-encoding):
"&#amp;"
to
"&#amp;#amp;"


￠蛋碎的人ぎ生 2024-12-18 12:46:43

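The space ' ' in your regex can be changed to \s, and adding the i modifier at the end of the pattern makes it case-insensitive, so a-z covers both cases. You also don't need to translate the characters to sequences by hand; the callback can hand each match to htmlentities:

// preg_replace_callback passes the match array to the callback, so use $m[0] (arrow functions need PHP 7.4+)
$clean = preg_replace_callback('/[^a-z0-9\s]/i', fn($m) => htmlentities($m[0], ENT_QUOTES, 'UTF-8'), $userInput);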
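
One caveat with any ord()-based callback like the one in the question: ord() returns the value of a single byte, so UTF-8 text outside ASCII is encoded byte by byte. The sketch below illustrates this with an assumed sample string; the second variant assumes the mbstring extension is available (mb_ord() requires PHP 7.2+) and is one possible workaround, not something proposed in the thread.

```php
<?php
// Assumed sample: "café", with é written as an explicit code point escape.
$userInput = "caf\u{00E9}";

// Question's approach: without the u modifier the pattern matches single bytes,
// so the two UTF-8 bytes of "é" (0xC3 0xA9) are encoded separately.
$byteWise = preg_replace_callback(
    '/[^ a-zA-Z0-9]/',
    function ($m) { return '&#' . ord($m[0]) . ';'; },
    $userInput
);

// Variant: the u modifier matches whole code points, and mb_ord() returns
// the code point number rather than the first byte.
$codePointWise = preg_replace_callback(
    '/[^ a-zA-Z0-9]/u',
    function ($m) { return '&#' . mb_ord($m[0], 'UTF-8') . ';'; },
    $userInput
);

echo $byteWise, "\n";      // caf&#195;&#169;  (renders as "cafÃ©")
echo $codePointWise, "\n"; // caf&#233;        (renders as "café")
```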


About the author: 梦在夏天
