Do I need to declare the content type/charset in both HTML and PHP?
If the content type and character set are declared in the PHP header, is there a reason to have them again in the usual HTML DTD?
<?php
ob_start( 'ob_gzhandler' );
header('Content-type: text/html; charset=utf-8'); // here
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <!-- and here -->
...
If you are sending the charset in the headers, there is no need to repeat it in the HTML markup.
It is better to declare this information in one place (DRY principle), because if the charsets conflict (e.g. a header with UTF-8 and a meta with iso-8859-1), browsers follow the HTTP header while anything that reads only the markup follows the meta tag, so the same page can end up decoded two different ways.
Having said that, some automated tools (web scrapers) may not look at the header and deduce the page encoding only from the meta tag. If you do declare it in both places, keep the header and the meta tag the same on every page; mixing different charsets may confuse browsers and cause display issues.
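One way to keep the two declarations from drifting apart is to define the charset once and derive both the header and the meta tag from it. This is only a minimal sketch: `PAGE_CHARSET`, `contentTypeValue()` and `contentTypeMeta()` are hypothetical names made up for illustration, not part of PHP or of the question's code.

```php
<?php
// Hypothetical constant: the single place where the charset is defined.
const PAGE_CHARSET = 'utf-8';

// Build the Content-Type value once so header and meta tag share it.
function contentTypeValue(string $charset = PAGE_CHARSET): string
{
    return 'text/html; charset=' . $charset;
}

// Render the matching meta tag for the document head.
function contentTypeMeta(string $charset = PAGE_CHARSET): string
{
    return '<meta http-equiv="Content-Type" content="'
        . htmlspecialchars(contentTypeValue($charset), ENT_QUOTES)
        . '" />';
}

// Headers have to go out before any output is sent.
if (!headers_sent()) {
    header('Content-Type: ' . contentTypeValue());
}

echo contentTypeMeta(), "\n";
```

In a template you would call `contentTypeMeta()` inside `<head>`, so changing `PAGE_CHARSET` updates both places at once.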
Having the charset in the HTML source may be helpful if someone decides to save a page, or for web scrapers :). libxml looks up the meta tag to determine the charset to use when parsing the markup. Show your fellow developers some web scraping love.
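To see the libxml point from the scraper's side, here is a rough sketch using PHP's `DOMDocument`, which is backed by libxml. It assumes the DOM extension is installed; `firstParagraphText()` and the sample markup are hypothetical, chosen only for the demonstration. The scraper gets a string with no HTTP headers attached, so the meta tag is the only encoding hint available.

```php
<?php
// Parse an HTML string with DOMDocument and return the text of the first <p>.
function firstParagraphText(string $html): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $p = $doc->getElementsByTagName('p')->item(0);
    return $p === null ? '' : $p->textContent;
}

$body = "<p>caf\u{00E9}</p>"; // "café" as UTF-8 bytes

$withMeta = '<html><head>'
    . '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />'
    . '</head><body>' . $body . '</body></html>';
$withoutMeta = '<html><head></head><body>' . $body . '</body></html>';

// With the meta tag the parser is told the bytes are UTF-8.
var_dump(firstParagraphText($withMeta));
// Without it the parser has to guess, and the accented character
// may come back mis-decoded depending on the libxml version.
var_dump(firstParagraphText($withoutMeta));
```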
If you declare it in the HTTP headers, then it will survive transcoding by proxies and won't ever trigger a "Whoops, I guessed the wrong encoding, restart parsing from top" situation in browsers.
If you declare it in the body of the document then it will survive being accessed outside of HTTP (or another system with content-type headers, such as email).
If you declare it in both then you get the best of both worlds so long as no transcoding happens.
Note that if you don't use UTF-8 or UTF-16 then the XML spec requires that you specify the encoding in the XML declaration at the top of the document (and note that including an XML declaration will trigger Quirks mode in IE6).