Suppose my web application renders the following tag:
<object type="application/x-pdf" data="http://example.com/test%2Ctest.pdf">
<param name="showTableOfContents" value="true" />
<param name="hideThumbnails" value="false" />
</object>
Should the data attribute be escaped (that is, should the path be percent-encoded) or not? In my example it is. I haven't found any specification that covers this.
Addendum
Actually, I'm interested in a specification of what browser plugins that consume the data attribute should expect to see there. For example, the Adobe Acrobat plugin accepts both escaped and unescaped URIs. However, QWebPluginFactory treats the data attribute as a human-readable (unescaped) URI, and that leads to double percent-encoding. I'm wondering whether that is a bug in QWebPluginFactory or not.
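To make the double percent-encoding problem concrete, here is a minimal sketch using Python's urllib.parse. It is only an illustration of the effect, not QWebPluginFactory's actual code; the variable names are made up for the example.

```python
from urllib.parse import quote, unquote

# Path exactly as it appears in the data attribute (already percent-encoded).
already_encoded = "/test%2Ctest.pdf"

# A consumer that assumes the value is unescaped will encode it again,
# so the "%" itself is turned into "%25".
double_encoded = quote(already_encoded)

# Decoding once, as a server would, no longer yields the comma.
decoded_once = unquote(double_encoded)

print(double_encoded)            # the twice-encoded path
print(decoded_once)              # still contains %2C
print(unquote(already_encoded))  # the intended file name with a comma
```

After one round of decoding, the doubly encoded path refers to a file literally named test%2Ctest.pdf rather than test,test.pdf, which is why the request can fail.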
Comments (2)
The data attribute expects its value to be a URI, so you should provide a syntactically valid URI.

The current URI specification is RFC 3986. To see whether the comma (,) in the URI's path needs to be encoded, look at how the path production rule is defined there. Since we have a URI with authority information, the relevant alternative is path-abempty (see the URI production rule), which is a sequence of segments, each preceded by a slash. A segment is zero or more pchar characters, where pchar covers the unreserved characters, percent-encoded octets, the sub-delims characters, ":" and "@".
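The rule listings are not reproduced here, so the following is a minimal sketch rather than the answerer's original listing. It quotes the relevant RFC 3986 productions in comments and turns them into a regular expression; the helper name is_valid_path_abempty is made up for this example.

```python
import re

# Relevant productions from RFC 3986 (section 3.3 and appendix A):
#   path-abempty = *( "/" segment )
#   segment      = *pchar
#   pchar        = unreserved / pct-encoded / sub-delims / ":" / "@"
#   unreserved   = ALPHA / DIGIT / "-" / "." / "_" / "~"
#   pct-encoded  = "%" HEXDIG HEXDIG
#   sub-delims   = "!" / "$" / "&" / "'" / "(" / ")"
#                / "*" / "+" / "," / ";" / "="
UNRESERVED = r"A-Za-z0-9\-._~"
SUB_DELIMS = r"!$&'()*+,;="  # note the literal comma
PCHAR = rf"(?:[{UNRESERVED}{SUB_DELIMS}:@]|%[0-9A-Fa-f]{{2}})"
PATH_ABEMPTY = re.compile(rf"(?:/{PCHAR}*)*")


def is_valid_path_abempty(path: str) -> bool:
    """Return True if the whole string matches the path-abempty rule."""
    return PATH_ABEMPTY.fullmatch(path) is not None


print(is_valid_path_abempty("/test,test.pdf"))    # literal comma
print(is_valid_path_abempty("/test%2Ctest.pdf"))  # percent-encoded comma
print(is_valid_path_abempty("/test test.pdf"))    # raw space is not a pchar
```

Both spellings of the path match the grammar, while a path with a raw space does not.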
Expanding pchar shows that it includes a literal comma (,), so you don't need to encode the comma in the path component. But since you are allowed to percent-encode any non-delimiting character without changing its meaning, it is also fine to use %2C instead of the comma.
URLs can generally contain only specific characters. Unfortunately, different specifications contain different lists of characters that are considered reserved and thus can't be used as-is.

In your example the encoded character is a comma (,), which is a reserved character in some specifications, so it's not wrong to encode it. Most web servers should handle unencoded and encoded commas equally; however, some may not, depending on their configuration. Because of that, it is generally a good idea to avoid special characters in filenames (such as the one in your example) in the first place.
URL encoding is always needed when you have special characters in GET parameters. For example, a GET parameter that is supposed to take C&A as a value has to be written with the ampersand percent-encoded, that is, as C%26A.

EDIT:
Plugins (or even the browser) don't care either way. They don't try to (or need to) decode the URL or anything like that. They just request the URL from the server as entered.
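To illustrate the GET-parameter point from this answer, here is a small sketch using Python's urllib.parse. The parameter name brand is hypothetical and chosen only for the example.

```python
from urllib.parse import parse_qs, quote, urlencode

value = "C&A"

# Percent-encode the value on its own; safe="" leaves no reserved
# character unencoded.
encoded_value = quote(value, safe="")

# Build a complete query string; urlencode encodes the value for us.
query = urlencode({"brand": value})

# The encoded query round-trips back to the original value.
decoded_ok = parse_qs(query)

# Without encoding, "&" is read as a parameter separator and the
# value is cut off after "C".
decoded_broken = parse_qs("brand=" + value)

print(encoded_value)
print(query)
print(decoded_ok)
print(decoded_broken)
```

The unencoded variant silently loses part of the value, which is why encoding is needed in query strings even though it is optional for a comma in the path.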