How can I prevent double encoding of HTML entities when they are allowed in the input?
How can I prevent double encoding of HTML entities, or fix them programmatically?
I am using the encode() function from the HTML::Entities Perl module to encode HTML entities in user input. The problem is that we also allow users to enter HTML entities directly, and these entities end up double encoded.
For example, a user may enter:
Stackoverflow & Perl = Awesome&hellip;
This ends up being encoded to
Stackoverflow &amp; Perl = Awesome&amp;hellip;
This renders in the browser as
Stackoverflow & Perl = Awesome&hellip;
We want this to render as
Stackoverflow & Perl = Awesome...
Is there a way to prevent this double encoding? Or is there a module or snippet of code that can easily correct these double encoding issues?
Any help is greatly appreciated!
Comments (3)
You can decode the string first:
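A minimal sketch of this decode-first idea, assuming the HTML::Entities module from the question is installed. It uses the module's exported decode_entities() and encode_entities() functions; the helper name normalize_entities is hypothetical and chosen only for illustration.

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities encode_entities);

# Hypothetical helper (not part of HTML::Entities): decode any entities the
# user already typed, then encode the result exactly once.
sub normalize_entities {
    my ($input) = @_;
    return encode_entities( decode_entities($input) );
}

my $input = 'Stackoverflow & Perl = Awesome&hellip;';
print normalize_entities($input), "\n";
# prints: Stackoverflow &amp; Perl = Awesome&hellip;
```

Because decoding happens first, running the helper on text that is already encoded should give the same text back. The trade-off is that a user who wants the literal characters &hellip; to appear on the page would have to type &amp;hellip; themselves.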
There is an extremely simple way to avoid this:
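The answer's own snippet is not shown here. One simple possibility, offered as an assumption and not necessarily what the answerer had in mind, is to encode as usual and then undo the extra &amp; wherever it sits in front of something shaped like an entity. The helper name encode_preserving_entities is hypothetical, and the pattern only checks the shape of the entity, not whether the name is a real one.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical helper (not part of HTML::Entities): encode the text, then
# restore ampersands that began a well-formed entity in the original input.
sub encode_preserving_entities {
    my ($text) = @_;
    my $encoded = encode_entities($text);

    # "&amp;" followed by a named (hellip), decimal (#8230), or hex (#x2026)
    # reference and a semicolon is turned back into a plain "&".
    $encoded =~ s/&amp;(#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);/&$1;/g;

    return $encoded;
}

print encode_preserving_entities('Stackoverflow & Perl = Awesome&hellip;'), "\n";
# prints: Stackoverflow &amp; Perl = Awesome&hellip;
```

A bare ampersand followed by a space does not match the pattern, so it stays encoded as &amp; while the user's &hellip; is left intact.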
Consider deferring the call to encode() until you retrieve the value for display, rather than calling it before you store the value. As long as your retrieval mechanism is consistent, the extra data in your database probably isn't worth fretting over.

Edit: Re-reading your question, I realize that my answer doesn't fully address the issue, since calling encode() later will still produce the same result. I don't know of an alternative myself, so this may not be much help, but you may want to look for an encoding method that respects entities already present in the input.