What characters are allowed in an input element's name attribute?
I have a PHP script that will generate <input>s dynamically, so I was wondering if I needed to filter any characters in the name attribute.
I know that the name has to start with a letter, but I don't know any other rules. I figure square brackets must be allowed, since PHP uses these to create arrays from form data. How about parentheses? Spaces?
Note that not all characters are submitted for name attributes of form fields (even when using POST)! White-space characters are trimmed, and inner white-space characters as well as the character . are replaced by _. (Tested in Chrome 23, Firefox 13 and Internet Explorer 9, all on Win7.)
Any character you can include in an [X]HTML file is fine to put in an <input name>. As Allain's comment says, <input name> is defined as containing CDATA, so the only things you can't put in there are the control codes and invalid codepoints that the underlying standard (SGML or XML) disallows.
Allain quoted W3 from the HTML4 spec:
However, this isn't really true in practice.
The theory is that application/x-www-form-urlencoded data doesn't have a mechanism to specify an encoding for the form's names or values, so using non-ASCII characters in either is “not specified” as working and you should use POSTed multipart/form-data instead.
Unfortunately, in the real world, no browser specifies an encoding for fields, even though it theoretically could, in the subpart headers of a multipart/form-data POST request body. (I believe Mozilla tried to implement it once, but backed out as it broke servers.)
And no browser implements the astonishingly complex and ugly RFC2231 standard that would be necessary to insert encoded non-ASCII field names into the multipart's subpart headers. In any case, the HTML spec that defines multipart/form-data doesn't directly say that RFC2231 should be used, and, again, it would break servers if you tried.
So the reality of the situation is that there is no way to know what encoding is being used for the names and values in a form submission, no matter what type of form it is. What browsers will do with field names and values that contain non-ASCII characters is the same for GET and both types of POST form: they encode them using the encoding of the page containing the form. Non-ASCII GET form names are no more broken than everything else.
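To make the encoding ambiguity concrete, here is a minimal Python sketch (not browser code) that percent-encodes one field name under two page encodings. The field name "café" and the choice of UTF-8 versus Latin-1 are assumptions picked purely for illustration.

```python
from urllib.parse import quote, unquote

# Hypothetical non-ASCII field name, chosen only for illustration.
field_name = "café"

# The same name percent-encoded as a page in each encoding would submit it.
as_utf8 = quote(field_name, encoding="utf-8")
as_latin1 = quote(field_name, encoding="latin-1")

print(as_utf8)    # caf%C3%A9
print(as_latin1)  # caf%E9

# Nothing in the submission says which encoding was used, so a receiver that
# guesses wrong silently ends up with a different name.
wrong_guess = unquote(as_utf8, encoding="latin-1")
print(wrong_guess == field_name)  # False
```

The two encoded forms differ, and the bytes alone do not reveal which one was intended, which is why the receiving side has to know the page's encoding in advance.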
DLH:
Actually, the only element whose name attribute is not CDATA is <meta>. See the HTML4 spec's attribute list for all the different uses of name; it's an overloaded attribute name, having many different meanings on the different elements. This is generally considered a bad thing.
However, typically these days you would avoid name except on form fields (where it's a control name) and param (where it's a plugin-specific parameter identifier). That's only two meanings to grapple with. The old-school use of name for identifying elements like <form> or <a> on the page should be avoided (use id instead).
The only real restriction on what characters can appear in form control names applies when a form is submitted with GET:
"The "get" method restricts form data set values to ASCII characters." (quoted from the HTML4 spec)
There's a good thread on it here.
While Allain's comment did answer the OP's direct question and bobince provided some brilliant in-depth information, I believe many people come here seeking an answer to a more specific question: "Can I use a dot character in a form's input name attribute?"
As this thread came up as the first result when I searched for this, I figured I may as well share what I found.
Firstly, Matthias claimed that browsers replace the . character with _.
This is untrue. I don't know whether browsers actually did this back in 2013, though I doubt it. Browsers send dot characters as they are (talking about POST data)! You can check this in the developer tools of any decent browser.
Please notice that tiny little comment by abluejelly, which is probably missed by many:
I checked it with Apache HTTP server (v2.4.25), and indeed an input name like "foo.bar" is changed to "foo_bar". But in a name like "foo[foo.bar]" the dot is not replaced by _!
My conclusion: you can use dots, but I wouldn't, as this may lead to unexpected behaviour depending on the HTTP server used.
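As a rough check of the wire format only, the Python sketch below URL-encodes a few field names and decodes them again. It does not model any browser or PHP, and the field names are made up for illustration. For what it is worth, the PHP manual documents that dots and spaces in incoming variable names are converted to underscores, so the renaming observed above most likely happens in PHP on the server rather than in the browser or in Apache itself.

```python
from urllib.parse import urlencode, parse_qsl

# Hypothetical field names, including one with a dot inside square brackets.
fields = [("foo.bar", "1"), ("foo[foo.bar]", "2"), ("first name", "3")]

# application/x-www-form-urlencoded body: the dot is left untouched.
body = urlencode(fields)
print(body)  # foo.bar=1&foo%5Bfoo.bar%5D=2&first+name=3

# A decoder that does no renaming of its own gets the original names back.
decoded = parse_qsl(body)
print(decoded == fields)  # True
```

Any underscore that shows up on the server side is therefore added by whatever parses the request, not by the encoding.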
Do you mean the id and name attributes of the HTML input tag?
If so, I'd be very tempted to restrict (or convert) allowed "input" name characters to only a-z (A-Z), 0-9 and a limited range of punctuation (".", ",", etc.), if only to limit the potential for XSS exploits, etc.
Additionally, why let the user control any aspect of the input tag? (Might it not ultimately be easier from a validation perspective to keep the input tag names as 'custom_1', 'custom_2', etc. and then map these as required?)
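Both suggestions can be sketched in a few lines of Python. The allow-list pattern, the render_input helper and the alias_names helper are hypothetical names and choices made for this illustration, not part of any library, and the exact set of permitted punctuation is an assumption you would adjust to your own needs.

```python
import html
import re

# Assumed allow-list: starts with a letter, then letters, digits, underscore,
# dot, hyphen or square brackets. Adjust to taste.
SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_.\-\[\]]*")


def render_input(name: str) -> str:
    """Return an <input> tag for name, rejecting names outside the allow-list."""
    if not SAFE_NAME.fullmatch(name):
        raise ValueError(f"unsupported input name: {name!r}")
    # Escape anyway, so the output stays safe if the allow-list is loosened.
    return f'<input name="{html.escape(name, quote=True)}">'


def alias_names(user_labels):
    """Map opaque names custom_1, custom_2, ... to the user-supplied labels."""
    return {f"custom_{i}": label for i, label in enumerate(user_labels, start=1)}


print(render_input("user[email]"))  # <input name="user[email]">
print(alias_names(["Shoe size", "Nickname"]))
```

With the alias approach, only the generated custom_N names ever reach the markup, and the user's own labels are looked up on the server after submission.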