Zend: decoding HTML entities in a form input element results in an empty value
I have a form element called metaDescription:
//inside the form
$description = $this->createElement('text', 'metaDescription')
    ->setLabel('Description:')
    ->setRequired(false)
    ->addFilter('StringTrim')
    ->addValidator('StringLength', array(0, 300))
    ->addErrorMessage('Invalid description.');
$this->addElement($description);
Whenever this form loads, I initialize it with a default value pulled from the database:
$form->setDefault('metaDescription', $oldPage->getMetaDescription());
This works perfectly fine.
However, I now want to htmlencode any input description when someone sends the form, and html_entity_decode the default value that is pulled from the database, so that the characters are shown in their original shape again.
I did this like so when handling form input:
//handle post
if ($request->isPost()) {
    if ($form->isValid($request->getPost())) {
        $page = new Application_Model_PagePainter(array(
            'metaDescription' => htmlentities($form->getValue('metaDescription'))
        ));
        $pageMapper->save($page);
        ....
And I now set the default value like so:
$form->setDefault('metaDescription', html_entity_decode($oldPage->getMetaDescription()));
At first, this seems to work fine as well. When I send, for example, woord1, woord2, me&you as the description, this is correctly saved as woord1, woord2, me&amp;you in the database and correctly displayed again as woord1, woord2, me&you. However, when I use a special character like ó, e.g. wóórd1, this is correctly saved in the database as w&oacute;&oacute;rd1, but then something strange happens: when the form is displayed again, the default value is empty. When I look at the source, it is indeed empty:
<input type="text" name="metaDescription" id="metaDescription" value="" />
This would make me believe that for some reason html_entity_decode($oldPage->getMetaKeywords()) returns an empty string. However, when I echo it, it returns the correct result, wóórd1, yet the setDefault has no effect. When I remove the html_entity_decode, the setDefault works correctly again and the value is shown in the form, but with the HTML entities left undecoded.
Why is this HTML entity decode causing the form value to be empty for such special characters?
Reply to vstm
For debugging purposes, I disabled the view's escaping like so:
$this->view->setEscape(array($this, 'myEscape'));
public function myEscape($inputString)
{
    return $inputString;
}
Unfortunately, the problem remains the same as explained earlier. Just to clarify, I encode the value before putting it in the database like so:
'metaDescription' => htmlentities($form->getValue('metaDescription'), ENT_COMPAT, 'UTF-8')
And I decode the value after getting it out of the database like so:
$form->setDefault('metaDescription', html_entity_decode($oldPage->getMetaDescription(), ENT_COMPAT, 'UTF-8'));
Interestingly, however, it does seem related to the UTF-8 encoding, because when I change the encoding to
'metaDescription' => htmlentities($form->getValue('metaDescription'), ENT_COMPAT, 'ISO-8859-1')
while keeping the decoding at UTF-8, an input tést will result in the input box showing tést rather than an empty value, which is the case when setting both methods to UTF-8.
Does this help you?
Comments (2)
I knew it had something to do with the Zend framework doing its own escaping using htmlspecialchars and utf-8 (unless you change that with the view's setEscape/setEncoding methods). And indeed, when the output of html_entity_decode (called without a charset) is passed through htmlspecialchars with utf-8, the result, $test, is empty at the end.
So you have to call html_entity_decode with "utf-8" or change the view's encoding to "iso-8859-1" (or whatever your encoding is). I think supplying "utf-8" is the better option.
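The empty result can be reproduced with plain PHP, outside Zend. This is only a minimal sketch: the variable names are illustrative, and ISO-8859-1 is passed explicitly to stand in for the default charset that older PHP versions used for html_entity_decode.

```php
<?php
// What ends up in the database after htmlentities() on "wóórd1".
$stored = 'w&oacute;&oacute;rd1';

// Decoding to ISO-8859-1 yields single 0xF3 bytes, which are not valid UTF-8.
$latin1 = html_entity_decode($stored, ENT_COMPAT, 'ISO-8859-1');

// Imitates the default view escaping described above: htmlspecialchars()
// with UTF-8. Invalid UTF-8 input makes it return an empty string.
$test = htmlspecialchars($latin1, ENT_COMPAT, 'UTF-8');
var_dump($test); // string(0) ""

// Decoding to UTF-8 instead gives htmlspecialchars() input it can handle.
$utf8 = html_entity_decode($stored, ENT_COMPAT, 'UTF-8');
$ok = htmlspecialchars($utf8, ENT_COMPAT, 'UTF-8');
var_dump($ok !== ''); // bool(true)
```

Passing ENT_SUBSTITUTE or ENT_IGNORE to htmlspecialchars would hide the symptom, but the value would still be in the wrong charset, so matching the charsets is the safer fix.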
War against the encodings
To make this work, you also have to take care of which encoding the browser is using. Otherwise you either write garbage into your database, render garbage in your output, or both (or nothing, if you hand the wrong charset to certain PHP functions). (Bear with me.)
So first you have to establish which encoding the browser is using. Check the content-type meta tag in your HTML output and which encoding it suggests. If there is no content-type meta information, or it doesn't include the charset, then you should add one, preferably with utf-8, in your layout (if you're not using a layout, now is a good time to start). This is important, because otherwise you don't know for sure which encoding your input is in or which encoding you have to deliver to the browser. That means something like <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> comes right after the opening <head> tag of every page returned by your application.
In the following examples we assume you choose utf-8, but you can use whatever is appropriate, as long as you change the values accordingly (that means s/UTF-8/your encoding/g).
Now, when retrieving data from the browser, you know which charset you have to supply for the htmlentities call: utf-8.
That means that $form->getValue('metaDescription') returns a utf-8 encoded string, which has to be converted to an HTML-entities string, which is exactly what we want. So the database now holds the non-threatening string with no umlauts, accents or whatever.
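A minimal sketch of the save step, assuming the page is served as utf-8. The $submitted variable is a hypothetical stand-in for $form->getValue('metaDescription'), written as raw bytes so the example does not depend on the encoding of the source file.

```php
<?php
// Stand-in for the form value: "wóórd1, me&you" as UTF-8 bytes.
$submitted = "w\xC3\xB3\xC3\xB3rd1, me&you";

// Name the charset explicitly so htmlentities() reads the input as UTF-8.
$forDatabase = htmlentities($submitted, ENT_COMPAT, 'UTF-8');

echo $forDatabase, "\n"; // w&oacute;&oacute;rd1, me&amp;you
```

The stored string contains only ASCII characters, so it survives regardless of the database column's charset.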
Now we take a look at the editing part. There you must decode the HTML entities so the user does not have to deal with them. The output string has to be encoded with our desired charset (yes, right: utf-8), so pass 'UTF-8' to html_entity_decode.
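The decode step could look like the sketch below. Here $fromDatabase is a hypothetical stand-in for $oldPage->getMetaDescription(), and the result is what would be handed to $form->setDefault('metaDescription', ...).

```php
<?php
// Stand-in for the entity-encoded value read back from the database.
$fromDatabase = 'w&oacute;&oacute;rd1, me&amp;you';

// Decode into UTF-8, the same charset the page and the view use.
$default = html_entity_decode($fromDatabase, ENT_COMPAT, 'UTF-8');

// The decoded value is "wóórd1, me&you" as UTF-8 bytes.
var_dump($default === "w\xC3\xB3\xC3\xB3rd1, me&you"); // bool(true)
```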
So now you have assigned the utf-8 encoded string returned by html_entity_decode to metaDescription. Now we only have to get past that htmlspecialchars call, which is called by default if someone uses $view->escape().
The last step is to ensure that Zend_View's escape method is aware of our encoding (this is optional if you are using utf-8, since this is already the default). Either set it for a specific view in the controller with $this->view->setEncoding('UTF-8'), or set it for all views in bootstrap.php.
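For the bootstrap variant, a sketch along these lines should work in a Zend Framework 1 application. It assumes the standard Bootstrap class and that the view resource is enabled in application.ini (for example with resources.view[] = ""). The method name _initViewEncoding is made up for this example, and the snippet only runs inside such an application.

```php
<?php
// application/Bootstrap.php (sketch; requires Zend Framework 1)
class Bootstrap extends Zend_Application_Bootstrap_Bootstrap
{
    // Hypothetical init method; any _init* method runs during bootstrapping.
    protected function _initViewEncoding()
    {
        // Make sure the view resource has been set up, then fetch it.
        $this->bootstrap('view');
        $view = $this->getResource('view');

        // Charset that $view->escape() hands to htmlspecialchars().
        $view->setEncoding('UTF-8');
    }
}
```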
If someone now calls $view->escape(), it also expects a utf-8 string as input. You should be able to remove the setEscape call with the "null" escape.
If you followed all these steps, you should now have all special characters with umlauts, accents and graves restored as desired (or I have now disgraced myself).
So every function receives the encoding it expects, otherwise it returns the infamous empty string (pseudo flow-chart):
htmlentities($browserData, , 'UTF-8') -> expects UTF-8, returns ASCII without umlauts or other fancy stuff
html_entity_decode($dbData, , 'UTF-8') -> expects ASCII, returns UTF-8 encoded
$view->escape(): htmlspecialchars -> expects UTF-8, returns UTF-8
tl;dr / recap
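A minimal sketch of the whole round trip in plain PHP, with a check on the encoding each stage produces. All function names are illustrative and are not part of Zend or PHP. The third function imitates what $view->escape() does by default.

```php
<?php
/** True when $s contains only 7-bit ASCII bytes. */
function isAscii($s) { return preg_match('/^[\x00-\x7F]*$/', $s) === 1; }

/** True when $s is valid UTF-8 (the /u modifier rejects invalid input). */
function isUtf8($s) { return preg_match('//u', $s) === 1; }

// 1. Browser -> database: expects UTF-8, returns ASCII.
function encodeForDatabase($browserData) {
    return htmlentities($browserData, ENT_COMPAT, 'UTF-8');
}

// 2. Database -> form default: expects ASCII, returns UTF-8.
function decodeForForm($dbData) {
    return html_entity_decode($dbData, ENT_COMPAT, 'UTF-8');
}

// 3. Form default -> HTML output: expects UTF-8, returns UTF-8.
function escapeForOutput($formValue) {
    return htmlspecialchars($formValue, ENT_COMPAT, 'UTF-8');
}

$browserData = "t\xC3\xA9st & co";          // "tést & co" as UTF-8 bytes
$dbData      = encodeForDatabase($browserData);
$formValue   = decodeForForm($dbData);
$html        = escapeForOutput($formValue);

var_dump(isAscii($dbData));   // bool(true)
var_dump(isUtf8($formValue)); // bool(true)
var_dump($html !== '');       // bool(true)
```

If any stage is given a different charset than the one before it produced, the last check is the one that fails, which is the empty input box from the question.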
You can also use Zend_Filter_HtmlEntities() instead of the PHP functions. It does nothing more than the PHP functions, but it guarantees a consistent encoding throughout your form.