What content encoding does a Perl CGI script use by default?

Posted on 2024-07-07 01:32:39

I'm modifying a mature CGI application written in Perl, and the question of content encoding has come up. The browser reports that the content is iso-8859-1 encoded, and the application declares iso-8859-1 as the charset in its HTTP headers, but it never seems to actually do any encoding. None of the encoding techniques described in the perldoc tutorials (Encode, encoding, open) are used in the code, so I'm a little confused about how the document is actually being encoded.

As mentioned, the application is quite mature and likely predates many of the current encoding methods. Does anyone know of any legacy or deprecated techniques I should be looking for? What encoding does Perl assume or default to when the developer gives no direction?

Thanks

Comments (4)

深爱成瘾 2024-07-14 01:32:39

Perl does not assume any encoding; it is the browser that picks one, usually by guesswork. If none of the encoding techniques is used, the documents are output byte for byte, exactly as they were written.

You can specify the charset in the HTTP Content-Type header.
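As a rough illustration of declaring the charset, the sketch below prints a Content-Type header by hand before the body. The helper name `content_type_header` and the sample markup are made up for this example and are not taken from the application in question. If the application uses CGI.pm, its `header` function accepts a `-charset` argument that serves the same purpose.

```perl
use strict;
use warnings;

# Hypothetical helper: build the CGI header block with an explicit charset.
sub content_type_header {
    my ($charset) = @_;
    # A blank line (the second CRLF) ends the header section.
    return "Content-Type: text/html; charset=$charset\r\n\r\n";
}

print content_type_header('ISO-8859-1');

# \xE9 is e-acute as a single Latin-1 byte; Perl passes it through unchanged.
print "<p>caf\xE9</p>\n";
```

The header only labels the bytes. It does not convert them, so the declared charset has to match what the script really emits.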

A君 2024-07-14 01:32:39

The first place I'd look is the server configuration. If the program isn't setting the charset in its Content-Type header, you're likely picking up the server's default.

Run the script separately from the server to see what its actual output is. When the server gets the output from a CGI program (one that isn't nph), it fills in any headers it thinks are missing before sending the response to the client.
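One way to inspect that raw output is sketched below: it splits what the script printed into headers and body, then dumps the body as hex so the actual bytes are visible. The helper `split_cgi_output` is hypothetical, and the `$raw` string is a stand-in for output you would capture yourself by running the script from the command line.

```perl
use strict;
use warnings;

# Hypothetical helper: split raw CGI output into a header hash and the body bytes.
sub split_cgi_output {
    my ($raw) = @_;
    # Headers end at the first blank line (CRLF or bare LF).
    my ($head, $body) = split /\r?\n\r?\n/, $raw, 2;
    my %headers;
    for my $line (split /\r?\n/, defined $head ? $head : '') {
        my ($name, $value) = split /:\s*/, $line, 2;
        $headers{ lc($name) } = $value if defined $value;
    }
    return (\%headers, defined $body ? $body : '');
}

# Sample data standing in for the script's real output (an assumption).
my $raw = "Content-Type: text/html; charset=ISO-8859-1\n\n<p>caf\xE9</p>\n";
my ($headers, $body) = split_cgi_output($raw);

print "declared: $headers->{'content-type'}\n";
print "body bytes: ", join(' ', map { sprintf('%02X', ord($_)) } split(//, $body)), "\n";
```

If no Content-Type line shows up in the script's own output, whatever the browser reports was added by the server.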

梦巷 2024-07-14 01:32:39

By default, Perl handles strings as byte sequences, so if you read from a file and print the result to STDOUT, it produces the same byte sequence. If your templates are Latin-1, your output will also be Latin-1.

If you use a string in a text context (with uc, lc, and so on), perl assumes Latin-1 semantics unless the string has been decoded first.
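The byte pass-through described above, and the explicit alternative, can be sketched with the core Encode module. The sample string is an assumption standing in for text read from a Latin-1 template.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "caf\xE9" as it would arrive from a Latin-1 template: four bytes.
my $bytes = "caf\xE9";

# With no I/O layer and no decoding, printing emits the same four bytes.
binmode STDOUT;
print $bytes, "\n";

# Explicit route: decode the bytes to characters, then encode to the target charset.
my $chars = decode('ISO-8859-1', $bytes);
my $utf8  = encode('UTF-8', $chars);

printf "latin-1 bytes: %d, utf-8 bytes: %d\n", length($bytes), length($utf8);
```

In a legacy application that never decodes or encodes anything, only the first half applies: whatever encoding the templates and data were saved in is what reaches the browser.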

More on Perl, charsets and encodings

不再见 2024-07-14 01:32:39

If the browser reports the content as iso-8859-1, maybe your Perl script isn't outputting the correct headers to specify the charset?
