How to remove %EF%BB%BF from a PHP string
I am trying to use the Microsoft Bing API.
$data = file_get_contents("http://api.microsofttranslator.com/V2/Ajax.svc/Speak?appId=APPID&text={$text}&language=ja&format=audio/wav");
$data = stripslashes(trim($data));
The returned data has a ' ' character as the first character of the string. It is not a space, because I trimmed the data before returning it.
The ' ' character turned out to be %EF%BB%BF.
I wonder why this happened, maybe a bug from Microsoft?
How can I remove this %EF%BB%BF in PHP?
Comments (7)
You should not simply discard the BOM unless you're 100% sure that the stream will: (a) always be UTF-8, and (b) always have a UTF-8 BOM.
The reasons:
I think a more appropriate way to handle this would be something like:
$data = file_get_contents("http://api.microsofttranslator.com/V2/Ajax.svc/Speak?appId=APPID&text={$text}&language=ja&format=audio/wav");
$data = stripslashes(trim($data));
// Strip the three BOM bytes only when they are actually present.
if (substr($data, 0, 3) === "\xef\xbb\xbf") {
    $data = substr($data, 3);
}
It's a byte order mark (BOM), indicating the response is encoded as UTF-8. You can safely remove it, but you should parse the remainder as UTF-8.
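To make the connection with the question concrete, here is a minimal sketch of why trim() leaves the mark in place and where the %EF%BB%BF form comes from. The payload string is made up for illustration; only the three leading bytes matter.

```php
<?php
// Simulated response: a UTF-8 BOM followed by a made-up payload (assumption, not real API output).
$data = "\xEF\xBB\xBF" . '"http://example.invalid/audio.wav"';

// trim() strips only ASCII whitespace and NUL by default, so the BOM bytes survive.
$trimmed = trim($data);

// Percent-encoding the first three bytes gives the form seen in the question.
$prefix = rawurlencode(substr($trimmed, 0, 3)); // "%EF%BB%BF"

// The same three bytes shown as hex.
$hex = bin2hex(substr($trimmed, 0, 3)); // "efbbbf"
```

So the "character" is three bytes, not whitespace, which is why trimming has no effect on it.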
I had the same problem today and fixed it by ensuring the string was encoded as UTF-8:
http://php.net/manual/en/function.utf8-encode.php
$content = utf8_encode($content);
To remove it from the beginning of the string (only):
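One way to do that, as a sketch rather than a definitive version, is an anchored regular expression. The helper name remove_leading_bom is hypothetical, and this assumes $data holds the raw bytes, not a percent-encoded string.

```php
<?php
// Hypothetical helper: removes a UTF-8 BOM only when it is the very first thing in the string.
function remove_leading_bom(string $data): string
{
    // ^ anchors the match to the start of the subject; \xEF\xBB\xBF are the raw BOM bytes.
    // preg_replace() returns null on error, in which case the input is returned unchanged.
    return preg_replace('/^\xEF\xBB\xBF/', '', $data) ?? $data;
}
```

Because of the anchor, the same three bytes appearing later in the string are left alone.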
$data = str_replace('%EF%BB%BF', '', $data);
You probably shouldn't be using stripslashes. Unless the API returns backslashed data (and there is a 99.99% chance it doesn't), take that call out.
You could use substr to only get the rest without the UTF-8 BOM:
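A rough sketch of the substr approach, assuming the mark may arrive either as three raw bytes or as the nine-character percent-encoded form. The helper name skip_bom is hypothetical.

```php
<?php
// Hypothetical helper: skips a leading UTF-8 BOM with substr(), in raw or percent-encoded form.
function skip_bom(string $data): string
{
    // Raw BOM: three bytes.
    if (strncmp($data, "\xEF\xBB\xBF", 3) === 0) {
        return (string) substr($data, 3);
    }
    // Percent-encoded BOM: nine characters, compared case-insensitively.
    if (strncasecmp($data, '%EF%BB%BF', 9) === 0) {
        return (string) substr($data, 9);
    }
    // No BOM found: leave the string untouched.
    return $data;
}
```

Checking for the prefix first avoids cutting off real content if the service ever stops sending the mark.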