Optimizing and compressing HTML
I have a few hand-crafted web pages. When deploying them I would like to run them through a tool so that new smaller HTML files are created, with extraneous whitespace taken out, etc.
We already use YUICompressor for our JavaScript and our CSS, and we tend to follow all of the techniques described by the Yahoo performance team.
Is there a good, free tool that does this? I prefer tools that would fit into our deployment process similarly to YUICompressor.
Comments (2)
HTML Tidy does the job.
I use the following options on one document that I generate (a rather large one). This saved me about 10% on the post-gzip size.
-c: replace surplus presentational tags and attributes
-omit: drop optional end tags
-ashtml: use HTML rather than XHTML (HTML is leaner and XHTML provides no benefits for most use cases)
-utf8: so we don't have to use entities for characters outside the character set (entities are more bytes)
--doctype strict: use Strict (again, leaner)
--drop-proprietary-attributes yes: get rid of proprietary junk
--output-bom no: BOMs cause issues in some clients
--wrap 0: have very long lines
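To fit the Tidy options listed above into a deployment step, one possibility is a small Python wrapper. This is only a sketch: it assumes the `tidy` executable is on the PATH, and the names `TIDY_OPTIONS`, `build_tidy_command` and `tidy_file` are made up for illustration. The `-o` flag (write output to a file) is not in the list above; it is added here so the original file stays untouched.

```python
import subprocess

# The options from the answer, in the same order.
TIDY_OPTIONS = [
    "-c",                                   # replace presentational tags/attributes
    "-omit",                                # drop optional end tags
    "-ashtml",                              # emit HTML rather than XHTML
    "-utf8",                                # UTF-8 in and out, so fewer entities
    "--doctype", "strict",
    "--drop-proprietary-attributes", "yes",
    "--output-bom", "no",
    "--wrap", "0",                          # do not wrap long lines
]


def build_tidy_command(src, dst):
    """Return the argument list for running tidy on src, writing to dst."""
    return ["tidy", *TIDY_OPTIONS, "-o", str(dst), str(src)]


def tidy_file(src, dst):
    """Run tidy; hypothetical helper, requires tidy to be installed."""
    result = subprocess.run(build_tidy_command(src, dst))
    # Tidy exits with 0 on success, 1 on warnings and 2 on errors.
    if result.returncode > 1:
        raise RuntimeError(f"tidy reported errors for {src}")
```

Calling `tidy_file("index.html", "build/index.html")` for each page would then slot in next to the YUICompressor step.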
Plain old minify will also attack your HTML for you, if you want.
But HTML minification isn't, generally, hugely effective:
Collapsing runs of whitespace down to a single space won't save that much. If you're already using gzip/deflate, that'll be compressing the whitespace quite efficiently. You can't remove all whitespace, because a single space often affects rendering in ways you want to keep.
Taking comments out may have an effect, depending on how much comment content you actually have. But you'd have to be careful not to hit conditional comments.
Apart from that, there is not much in an HTML document that can be ‘minified’. Obviously the JS idea of packing variable names down to the shortest possible string is inapplicable.
Doing all this with regex, as most minifiers do, is a bit dodgy. You have to stick to a limited ‘normal’ range of markup that won't trip it up.
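As an illustration of the conditional-comment point, here is a rough Python sketch that strips ordinary comments but leaves IE conditional comments alone. It is regex-based, so the caveat above applies in full: it assumes well-behaved markup and will misfire on comment-like text inside scripts or styles. The function name `strip_comments` is hypothetical.

```python
import re

# Any HTML comment, matched non-greedily and across line breaks.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)


def strip_comments(html: str) -> str:
    """Remove ordinary comments, keep IE conditional comments."""

    def replace(match):
        body = match.group(1)
        # Keep <!--[if ...]> ... <![endif]--> as well as the
        # <!--[if ...]><!--> and <!--<![endif]--> wrapper pair.
        if body.startswith("[if ") or body.startswith("<![endif]"):
            return match.group(0)
        return ""

    return COMMENT_RE.sub(replace, html)
```

For example, `strip_comments("<p>a</p><!-- note --><p>b</p>")` gives `<p>a</p><p>b</p>`, while a `<!--[if IE]>...<![endif]-->` block passes through unchanged.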
With HTML minification you're typically getting less gain (and less post-gzip gain) than JS/CSS minification, and for dynamically-generated pages you have more overhead (as you can't pre-minify them like with static scripts/styles). Some templating languages may already have built-in features for trimming whitespace at generation time; if available in your environment, use that.
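If you want to check how little is left to gain once gzip is in play, a quick measurement on your own pages is easy enough. The sketch below uses only the Python standard library; `collapse_whitespace` and `size_report` are illustrative names, and the whitespace collapsing is deliberately naive (it would also flatten `<pre>` and `<textarea>` content), so treat it as a measuring aid rather than a minifier.

```python
import gzip
import re


def collapse_whitespace(html: str) -> str:
    """Collapse every run of whitespace to one space (naive)."""
    return re.sub(r"\s+", " ", html)


def size_report(html: str) -> dict:
    """Compare raw and gzipped byte sizes before and after collapsing."""
    before = html.encode("utf-8")
    after = collapse_whitespace(html).encode("utf-8")
    return {
        "raw_before": len(before),
        "raw_after": len(after),
        "gzip_before": len(gzip.compress(before)),
        "gzip_after": len(gzip.compress(after)),
    }


if __name__ == "__main__":
    # Heavily indented, repetitive sample markup.
    sample = "<ul>\n" + "    <li>   item   </li>\n\n\n" * 200 + "</ul>\n"
    print(size_report(sample))
```

Compare the drop between `raw_before` and `raw_after` with the drop between `gzip_before` and `gzip_after`; on real pages the second is usually the number that matters.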