What should I consider before minifying HTML?

Posted on 2024-07-12 23:44:01

I've googled around but can't find any HTML minification scripts.

It occurred to me that maybe there is nothing more to HTML minification than removing all unneeded whitespace.

Am I missing something or has my Google Fu been lost?

Comments (12)

若沐 2024-07-19 23:44:01

You have to be careful when removing stuff from HTML as it's a fragile language. Depending on how your pages are coded some of that whitespace might be more significant; also if you have CSS styles such as white-space: pre then you may need to keep the whitespace. Plus there are numerous browser bugs, etc, and basically every character in an HTML file might be there to satisfy some requirement or appease some browser.

In my opinion your best bet is to design the pages well with CSS techniques (I was recently able to take an important page on the site I work for and reduce its size by 50% just by recoding it using CSS instead of tables and nested style="..." attributes). Then, use GZip to reduce the size of your pages for browsers that understand gzip. This will save bandwidth while preserving the structure of the HTML.
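
To get a rough feel for the gzip point, here is a minimal Python sketch that compares naive whitespace stripping with gzip on a made-up, repetitive page. The sample markup and the `sizes` dictionary are assumptions for illustration only; real pages will give different numbers, and the regex here is not a safe minifier.

```python
import gzip
import re

# Hypothetical sample page; any repetitive, indented markup behaves similarly.
page = "<ul>\n" + "".join(
    f'    <li class="item">Entry number {i}</li>\n' for i in range(200)
) + "</ul>\n"

# Naive whitespace stripping between tags (for comparison only).
stripped = re.sub(r">\s+<", "><", page)

sizes = {
    "raw": len(page.encode("utf-8")),
    "stripped": len(stripped.encode("utf-8")),
    "raw + gzip": len(gzip.compress(page.encode("utf-8"))),
    "stripped + gzip": len(gzip.compress(stripped.encode("utf-8"))),
}

for label, size in sizes.items():
    print(f"{label:>16}: {size} bytes")
```

On markup like this, gzip alone should shrink the page far more than stripping whitespace does, which is why it is suggested here as the lower-risk first step.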

辞别 2024-07-19 23:44:01

Sometimes, depending on the enclosing tags and/or on the CSS, whitespace may be significant.

故事灯 2024-07-19 23:44:01

Outside of HTML Tidy/removing white space as the other answers mentioned, there isn't much.

This is more of a manual task: pulling style attributes out into CSS (hopefully you're not using FONT tags, etc.) and using fewer tags and attributes where possible (like not embedding <strong> tags in an element but using CSS to make the whole element font-weight: bold, unless of course it makes semantic sense to use <strong>), etc.

梦醒灬来后我 2024-07-19 23:44:01

Yes, I guess it's pretty much removing whitespace and comments. You cannot replace identifiers with shorter ones like in JavaScript, since chances are that CSS classes or JavaScript will depend on those identifiers.

Also, you should be careful when removing whitespace and make sure that there is always at least one whitespace character left, otherwise allyourtextwilllooklikethis.
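
A small Python sketch of that difference, collapsing whitespace runs to one space instead of deleting them. The function names are made up for illustration, and the sketch ignores elements such as <pre> or <textarea> where whitespace must be kept exactly.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Replace every run of whitespace with a single space."""
    return re.sub(r"\s+", " ", text)


def strip_whitespace(text: str) -> str:
    """Remove all whitespace (shown only to illustrate the failure)."""
    return re.sub(r"\s+", "", text)


sample = "<p>all   your\n    text   will look like this</p>"
print(collapse_whitespace(sample))  # <p>all your text will look like this</p>
print(strip_whitespace(sample))     # <p>allyourtextwilllooklikethis</p>
```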

沙沙粒小 2024-07-19 23:44:01

There's a pretty lengthy discussion of this topic on this WordPress blog. You can find a very lengthy proposed solution using PHP and HTML Tidy there.

椒妓 2024-07-19 23:44:01

You can find some good references here to things like HTML tidy and others.

If you don't want to use one of those options, Prototype has a means to clean the whitespace in the DOM. You could do that on your own and copy the result via 'View Generated Source' in the Firefox extension Web Developer Toolbar. Then you can replace the original HTML with Prototype's cleaned version. Sorry for not making that apparent, nickf.

(I recommend the first link)

百变从容 2024-07-19 23:44:01

I haven’t tried it yet, but htmlcompressor is an HTML minifier, if you fancy giving one a try.

流殇 2024-07-19 23:44:01

I've used this regexp for years, without any problems: s/>\s*</></g

In Python re.sub(r'>\s*<', '><', html)

Or in PHP preg_replace('/>\s*</', '><', $html);

This removes all whitespace between tags, but not anywhere else. It is fairly safe (but not perfect: there are situations where it will break, but they're rare).

My main reason for doing this isn't speed/file size, but because the whitespace often introduces a, well, space. This would be okay, but when you start mucking about in your DOM with JavaScript, spaces are often lost, creating (minor) layout differences.

Consider:

<div>
    <a>link1</a>
    <a>link2</a>
</div>

There's a space between the links, but now I do something like:

$('div').append('<a>link3</a>')

And there's no space ... I need to manually add the space in my JS, which is fairly ugly & error-prone IMHO.
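
For reference, the tag-gap regex from this answer can be tried out in a short Python sketch, along with one of the cases where it changes rendering. The function name `strip_tag_gaps` and the sample strings are assumptions for illustration.

```python
import re


def strip_tag_gaps(html: str) -> str:
    """Delete whitespace that sits directly between a '>' and the next '<'."""
    return re.sub(r">\s*<", "><", html)


block = "<div>\n    <a>link1</a>\n    <a>link2</a>\n</div>"
print(strip_tag_gaps(block))  # <div><a>link1</a><a>link2</a></div>

# Whitespace inside text is untouched, because it is not between '>' and '<'.
print(strip_tag_gaps("<p>keep  these   spaces</p>"))

# A case that changes rendering: the line break inside <pre> is meaningful.
pre = "<pre><b>line 1</b>\n<b>line 2</b></pre>"
print(strip_tag_gaps(pre))    # <pre><b>line 1</b><b>line 2</b></pre>
```

The last call is the kind of rare breakage mentioned above: the regex has no idea which element it is inside, so whitespace-sensitive content needs to be skipped by other means.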

时光是把杀猪刀 2024-07-19 23:44:01

If you have Node.js installed and you are on Windows, you can create this .bat file.
It will minify every HTML file in your folder.

The output will be in the min subfolder.

  1. Open the console and run: npm install html-minifier -g
  2. Create the .bat file. Don't forget to change the path in the cd command; it's easier to change the folder in the .bat file than to copy and paste the file around.
  3. In the console, go to the folder containing the .bat file and run it.

cd the_destination_folder

dir  /b *.HTML > list1.txt

for /f "tokens=*" %%A in (list1.txt) do html-minifier --collapse-whitespace --remove-comments --remove-optional-tags %%~nxA  -o min\%%~nxA 

pause

时光暖心i 2024-07-19 23:44:01

Couldn't JavaScript be used as a decompressor for a compressed HTML string? For instance, keep a DEV build in the uncompressed format, run a 'publish' script to compress the DEV build for production, and attach a JavaScript decompressor to the HTML source (with the whitespace and such removed as before).

The bandwidth would be reduced on the server, but the downside is that there is a lot more strain on the client for decompressing the string to HTML. Also, JavaScript would need to be enabled and be able to parse the decompressed string to HTML.

I am not saying it's a definite solution, but something that might work. It all depends on whether you're looking purely at bandwidth, without regard to the user's JavaScript permissions, system specs, and so on.

Otherwise look for obfuscation scripts; a simple Google search produced http://tinyurl.com/phpob. Depending on what you're looking for, there should be a software package available.

If I am on the wrong lines, please shout and I will see what else I can do.

Good Luck!

陪你到最终 2024-07-19 23:44:01

I recently found a PHP-based script that minifies your site's HTML, inline CSS, and inline JavaScript on the fly. It is called Dynamic Website Compressor.

蔚蓝源自深海 2024-07-19 23:44:01

Here is a minifier for HTML5 written in PHP.

<?PHP
$in=file_get_contents('path/to/source.html');

//Strips spaces if there are more than one.
$in=preg_replace('/\s{2,}/m',' ',$in);
//trim
$in=preg_replace('/^\s+|\s+$/m','',$in);
/*Strips spaces between tags.
Use &nbsp; or &shy; (or, better, padding or margin) if necessary, otherwise the HTML
parser inserts a one-space text node.*/
$in=preg_replace('/ ?> < ?/','><',$in);
//Removes tag end slash.
$in=preg_replace('@ ?/>@','>',$in);
//Removes HTML comments except conditional IE comments.
$in=preg_replace('/<!--[^\[]*?-->/','',$in);
//Removes quotes where possible.
$in=preg_replace('/="([^ \'"\=><]+)"/','=$1',$in);
$in=preg_replace("/='([^ '\"\=><]+)'/",'=$1',$in);

file_put_contents('path/to/min.html',$in);
?>

After that you have shorter HTML, largely collapsed onto one line.

It would be better to put the regular expressions in an array, but take care to escape the backslashes.
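
The "array of regular expressions" idea could look roughly like this in Python. This is a loose port of the PHP rules above, not the author's code: the names `RULES` and `minify` and the sample input are assumptions, and `re.ASCII` is used so that `\s` leaves non-breaking spaces alone, which is closer to PHP's byte-oriented default. Like the original, it knows nothing about <pre>, <textarea>, or <script> content.

```python
import re

# Ordered (pattern, replacement) pairs, applied one after another.
RULES = [
    (re.compile(r"\s{2,}", re.ASCII), " "),                    # collapse whitespace runs
    (re.compile(r"^\s+|\s+$", re.MULTILINE | re.ASCII), ""),   # trim each line
    (re.compile(r" ?> < ?"), "><"),                            # single space between tags
    (re.compile(r" ?/>"), ">"),                                # trailing slash of void tags
    (re.compile(r"<!--[^\[]*?-->"), ""),                       # comments, except IE conditionals
    (re.compile(r'="([^ \'"=><]+)"'), r"=\1"),                 # droppable double quotes
    (re.compile(r"='([^ '\"=><]+)'"), r"=\1"),                 # droppable single quotes
]


def minify(html: str) -> str:
    """Apply every rule in order and return the shortened markup."""
    for pattern, replacement in RULES:
        html = pattern.sub(replacement, html)
    return html


source = (
    "<div class=\"box\" id='main'>\n"
    "    <!-- note -->\n"
    "    <img src=\"a.png\" />\n"
    "    <p title=\"two words\">Hi</p>\n"
    "</div>\n"
)
print(minify(source))
```

Raw strings take care of most of the backslash escaping that the array version in PHP would need, and adding or reordering a rule only touches the list.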
