How do I remove ï»¿ from the beginning of a file?
I have a CSS file that looks fine when I open it in gedit, but when it's read by PHP (to merge all the CSS files into one), this CSS has the following characters prepended to it: ï»¿
PHP removes all whitespace, so a random ï»¿ in the middle of the code messes up the entire thing. As I mentioned, I can't actually see these characters when I open the file in gedit, so I can't remove them very easily.
I googled the problem, and there is clearly something wrong with the file encoding, which makes sense since I've been moving the files between different Linux/Windows servers via FTP and rsync, using a range of text editors. I don't really know much about character encoding though, so help would be appreciated.
If it helps, the file is being saved in UTF-8 format, and gedit won't let me save it in ISO-8859-15 format (the document contains one or more characters that cannot be encoded using the specified character encoding). I tried saving it with Windows and Linux line endings, but neither helped.
Three words for you:
Byte Order Mark (BOM)
That's the representation for the UTF-8 BOM in ISO-8859-1. You have to tell your editor to not use BOMs or use a different editor to strip them out.
To automate the BOM's removal you can use awk, as shown in this question.
As another answer says, the best would be for PHP to actually interpret the BOM correctly; for that you can use mb_internal_encoding().
Open your file in Notepad++. From the Encoding menu, select Convert to UTF-8 without BOM, save the file, replace the old file with this new file. And it will work, damn sure.
In PHP, you can do the following to remove all non characters including the character in question.
For those with shell access, here is a little command to find all files with the BOM set in the public_html directory. Be sure to change the path to the correct one on your server.
Code:
grep -rl $'\xEF\xBB\xBF' public_html
If you are comfortable with the vi editor, open the file in vi, enter the command to remove the BOM (:set nobomb), and save the file (:wq).
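The search step can also be scripted. Below is a minimal sketch in Python (chosen only for illustration; the thread itself is about PHP and shell tools). The function name `find_bom_files` and the demo file names are made up, and unlike a recursive grep it only checks the first three bytes of each file, which is where an editor-inserted BOM sits.

```python
import os
import tempfile

UTF8_BOM = b"\xef\xbb\xbf"  # the bytes that show up as "ï»¿" in ISO-8859-1


def find_bom_files(root):
    """Return the sorted paths under `root` that start with a UTF-8 BOM."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    if handle.read(3) == UTF8_BOM:
                        hits.append(path)
            except OSError:
                continue  # unreadable entry: skip it
    return sorted(hits)


# Demo on a throwaway directory (file names are made up for illustration).
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "with_bom.css"), "wb") as handle:
        handle.write(UTF8_BOM + b"body { margin: 0; }")
    with open(os.path.join(demo_dir, "clean.css"), "wb") as handle:
        handle.write(b"body { margin: 0; }")
    demo_hits = [os.path.basename(p) for p in find_bom_files(demo_dir)]
    print(demo_hits)  # ['with_bom.css']
```

To scan a real tree, call `find_bom_files` with the path to your own document root instead of the temporary directory.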
The BOM is just a sequence of bytes ($EF $BB $BF for UTF-8), so just remove them using a script, or configure the editor so it doesn't add them.
From Removing BOM from UTF-8:
I am sure it translates to PHP easily.
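As a rough illustration of the "remove the three bytes with a script" idea, here is a minimal sketch in Python rather than PHP. The helper name `strip_utf8_bom` and the sample CSS fragments are invented for the example; `codecs.BOM_UTF8` is the standard-library constant for the bytes EF BB BF.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop one leading UTF-8 BOM if present; leave everything else untouched."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Merging several files, as in the question: strip each part before joining,
# so no BOM ends up in the middle of the combined output.
css_parts = [
    b"\xef\xbb\xbfbody{margin:0}",
    b"\xef\xbb\xbfp{color:red}",
    b"a{color:blue}",
]
merged = b"".join(strip_utf8_bom(part) for part in css_parts)
print(merged)  # b'body{margin:0}p{color:red}a{color:blue}'
```

Working on raw bytes keeps the check independent of whatever encoding the rest of the script assumes.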
I don't know PHP, so I don't know if this is possible, but the best solution would be to read the file as UTF-8 rather than some other encoding. The BOM is actually a ZERO WIDTH NO-BREAK SPACE (U+FEFF). This is whitespace, so if the file were being read in the correct encoding (UTF-8), then the BOM would be interpreted as whitespace and it would be ignored in the resulting CSS file.
Also, another advantage of reading the file in the correct encoding is that you don't have to worry about characters being misinterpreted. Your editor is telling you that the code page you want to save it in can't represent all the characters that you need. If PHP is then reading the file in the incorrect encoding, then it is very likely that other characters besides the BOM are being silently misinterpreted. Use UTF-8 everywhere, and these problems disappear.
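To make the encoding point concrete, here is a small sketch in Python (used only for illustration; how PHP behaves is not shown here). It decodes the same bytes three ways. One caveat that is specific to Python and not a claim about PHP: `str.isspace()` does not classify U+FEFF as whitespace, so decoding as plain UTF-8 is not enough on its own, and the BOM-aware `utf-8-sig` codec is the convenient way to drop it.

```python
raw = b"\xef\xbb\xbfbody{margin:0}"

# Wrong encoding: each BOM byte becomes a visible character.
as_latin1 = raw.decode("iso-8859-1")
print(as_latin1[:3])  # ï»¿

# Plain UTF-8: the three bytes become a single invisible U+FEFF.
as_utf8 = raw.decode("utf-8")
print(as_utf8[0] == "\ufeff")  # True

# BOM-aware UTF-8: the leading U+FEFF is dropped while decoding.
as_utf8_sig = raw.decode("utf-8-sig")
print(as_utf8_sig)  # body{margin:0}

# In Python, U+FEFF is not treated as whitespace, so strip() keeps it.
print("\ufeff".isspace())  # False
```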
For me, this worked:
If I remove this meta, the ï»¿ appears again. Hope this helps someone...
You can use
Replacing with awk seems to work, but it does not edit the file in place.
grep -rl $'\xEF\xBB\xBF' * | xargs vim -e -c 'argdo set fileencoding=utf-8|set encoding=utf-8| set nobomb| wq'
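For readers without vim at hand, the in-place rewrite that the one-liner above performs can be approximated in a script. This is a minimal sketch in Python under stated assumptions: the function name `remove_bom_in_place` and the demo file are hypothetical, only a BOM at the very start of the file is handled, and the file is read fully into memory, which is fine for source files but not for huge ones.

```python
import os
import shutil
import tempfile

UTF8_BOM = b"\xef\xbb\xbf"


def remove_bom_in_place(path):
    """Rewrite `path` without a leading UTF-8 BOM; return True if it changed."""
    with open(path, "rb") as handle:
        data = handle.read()
    if not data.startswith(UTF8_BOM):
        return False
    # Write to a temporary file in the same directory, then swap it in,
    # so a crash mid-write cannot leave a truncated original.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data[len(UTF8_BOM):])
        shutil.copymode(path, tmp_path)  # keep the original permission bits
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return True


# Demo on a throwaway file (the name is made up for illustration).
with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, "style.css")
    with open(demo_path, "wb") as handle:
        handle.write(UTF8_BOM + b"p { color: red; }")
    first_call = remove_bom_in_place(demo_path)
    second_call = remove_bom_in_place(demo_path)
    with open(demo_path, "rb") as handle:
        demo_result = handle.read()
    print(first_call, second_call, demo_result)  # True False b'p { color: red; }'
```

The second call returns False because the file no longer starts with a BOM, so running the function repeatedly is harmless.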
I had the same problem with the BOM appearing in some of my PHP files (ï»¿).
If you use PhpStorm you can set a hotkey to remove it in Settings -> IDE Settings -> Keymap -> Main Menu -> File -> Remove BOM.
In Notepad++, choose the "Encoding" menu, then "Encode in UTF-8 without BOM". Then save.
See Stack Overflow question How to make Notepad to save text in UTF-8 without BOM?.
Open the PHP file in question in Notepad++.
Click on Encoding at the top and change from "Encoding in UTF-8 without BOM" to just "Encoding in UTF-8". Save and overwrite the file on your server.
Same problem, different solution.
One line in the PHP file was printing out XML headers (which use the same begin/end tags as PHP). Looks like the code within these tags set the encoding, and was executed within PHP which resulted in the strange characters. Either way here's the solution:
If you need to be able to remove the BOM from UTF-8 encoded files, you first need to get hold of an editor that is aware of them.
I personally use E Text Editor.
In the bottom right, there are options for character encoding, including the BOM tag. Load your file, deselect Byte Order Marker if it is selected, resave, and it should be done.
E is not free, but there is a free trial, and it is an excellent editor (limited TextMate compatibility).
You can open it in PhpStorm, right-click on your file, and click on Remove BOM...
Here is another good solution for the problem with BOM. These are two VBScript (.vbs) scripts.
One for finding the BOM in a file and one for KILLING the damned BOM in the file. It works pretty fine and is easy to use.
Just create a .vbs file, and paste the following code in it.
You can use the VBScript script simply by dragging and dropping the suspicious file onto the .vbs file. It will tell you if there is a BOM or not.
If it tells you there is a BOM, create the second .vbs file with the following code and drag the suspicious file onto that .vbs file.
The code is from Heiko Jendreck.
In PhpStorm, for multiple files and a BOM not necessarily at the beginning of the file, you can search for \x{FEFF} (Regular Expression) and replace it with nothing.
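The same search-and-replace can be sketched outside the IDE. The example below is Python, used only for illustration, and assumes the text has already been decoded as UTF-8, so each BOM is the single character U+FEFF. The sample string is invented. Note that Python's `re` module spells the escape `\uFEFF` rather than `\x{FEFF}`.

```python
import re

BOM_CHAR = "\ufeff"

# Two files concatenated after decoding: the second BOM lands mid-string.
merged_css = "\ufeffbody{margin:0}" + "\ufeffp{color:red}"

# Plain string replacement removes every occurrence, wherever it sits.
cleaned = merged_css.replace(BOM_CHAR, "")

# Regex form, closer to the IDE search.
cleaned_re = re.sub(r"\uFEFF", "", merged_css)

print(cleaned)                # body{margin:0}p{color:red}
print(cleaned == cleaned_re)  # True
```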
Same problem, but it only affected one file so I just created a blank file, copy/pasted the code from the original file to the new file, and then replaced the original file. Not fancy but it worked.
Use Total Commander to search for all BOMed files:
Elegant way to search for UTF-8 files with BOM?
Open these files in some proper editor (that recognizes BOM) like Eclipse.
Change the file's encoding to ISO (right click, properties).
Cut ï»¿ from the beginning of the file, save
Change the file's encoding back to UTF-8
...and do not even think about using n...d again!
I had the same problem. The problem was that one of my PHP files was in UTF-8 (the most important one, the configuration file which is included in all PHP files).
In my case, I had 2 different solutions which worked for me:
First, I changed the Apache configuration by using the AddDefaultCharset directive in the configuration files (or in .htaccess). This solution forces Apache to use the correct encoding.
The second solution was to change the bad encoding of the PHP file.
This works for me!
Check your index.php, find "...charset=iso-8859-1" and replace it with "...charset=utf-8". Maybe it'll work.