How to remove %EF%BB%BF from a PHP string

Posted 2024-09-30 01:41:29, 462 characters, 8 views, 0 comments

I am trying to use the Microsoft Bing API.

$data = file_get_contents("http://api.microsofttranslator.com/V2/Ajax.svc/Speak?appId=APPID&text={$text}&language=ja&format=audio/wav");
$data = stripslashes(trim($data));

The returned string starts with what looks like a ' ' character. It is not a space, because I trimmed the data before returning it.

The ' ' character turned out to be %EF%BB%BF.

I wonder why this happens. Is it perhaps a bug on Microsoft's side?

How can I remove this %EF%BB%BF in PHP?

≈。彩虹 2024-10-07 01:41:29

You should not simply discard the BOM unless you are 100% sure that the stream will (a) always be UTF-8 and (b) always have a UTF-8 BOM.

The reasons:

In UTF-8, a BOM is optional. If the service stops sending it at some point, you will throw away the first three bytes of your response instead.
The whole purpose of the BOM is to identify unambiguously which UTF encoding the stream uses (UTF-8, UTF-16, or UTF-32) and to indicate the byte order ("endianness") of the encoded data. If you just throw it away, you are assuming that you always get UTF-8, which may not be a safe assumption.
Not all BOMs are three bytes long; only the UTF-8 one is. The UTF-16 BOM is two bytes and the UTF-32 BOM is four bytes. So if the service switches to a wider UTF encoding in the future, your code will break.

I think a more appropriate way to handle this would be something like:

/* Detect the encoding, then convert from detected encoding to ASCII */
$enc = mb_detect_encoding($data);
$data = mb_convert_encoding($data, "ASCII", $enc);
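
The point about BOM lengths can also be handled by checking for the known BOM signatures directly. The following is only a sketch under stated assumptions: the function name bomToUtf8 is hypothetical, the mbstring extension is assumed to be loaded, input without a BOM is assumed to be UTF-8 already, and the target is UTF-8 rather than ASCII so that non-ASCII text (such as Japanese) is preserved.

```php
<?php
/**
 * Hypothetical helper: strip a leading BOM (if any) and return UTF-8 text.
 * Assumption: data without a BOM is already UTF-8 and is returned as-is.
 */
function bomToUtf8(string $data): string
{
    // Longer signatures come first, because the UTF-32LE BOM
    // begins with the same two bytes as the UTF-16LE BOM.
    $signatures = [
        "\x00\x00\xFE\xFF" => 'UTF-32BE',
        "\xFF\xFE\x00\x00" => 'UTF-32LE',
        "\xEF\xBB\xBF"     => 'UTF-8',
        "\xFE\xFF"         => 'UTF-16BE',
        "\xFF\xFE"         => 'UTF-16LE',
    ];

    foreach ($signatures as $bom => $encoding) {
        if (strncmp($data, $bom, strlen($bom)) === 0) {
            $body = substr($data, strlen($bom));
            // UTF-8 needs no conversion once the BOM is gone.
            return $encoding === 'UTF-8'
                ? $body
                : mb_convert_encoding($body, 'UTF-8', $encoding);
        }
    }

    return $data; // no BOM found: leave the data untouched
}
```

Because nothing is removed unless a signature actually matches, the response stays intact if the service ever stops sending a BOM.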

裂开嘴轻声笑有多痛 2024-10-07 01:41:29

$data = file_get_contents("http://api.microsofttranslator.com/V2/Ajax.svc/Speak?appId=APPID&text={$text}&language=ja&format=audio/wav");
$data = stripslashes(trim($data));

// Remove the UTF-8 BOM only if the data actually starts with it
if (substr($data, 0, 3) === "\xef\xbb\xbf") {
    $data = substr($data, 3);
}

花开浅夏 2024-10-07 01:41:29

It is a byte order mark (BOM), indicating that the response is encoded as UTF-8. You can safely remove it, but you should parse the remainder as UTF-8.
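
A minimal sketch of that advice, assuming the mbstring extension is available; the function name stripUtf8Bom and the sample string are made up for illustration. It drops a leading UTF-8 BOM and then checks that what remains really is valid UTF-8 before treating it as text.

```php
<?php
/**
 * Hypothetical helper: drop a leading UTF-8 BOM, then confirm that the
 * remainder is valid UTF-8 before it is used as text.
 */
function stripUtf8Bom(string $data): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($data, $bom, 3) === 0) {
        $data = substr($data, 3);
    }
    if (!mb_check_encoding($data, 'UTF-8')) {
        throw new UnexpectedValueException('Response is not valid UTF-8');
    }
    return $data;
}

// Made-up sample: a BOM followed by three Japanese characters.
$text = stripUtf8Bom("\xEF\xBB\xBF日本語");
echo mb_strlen($text, 'UTF-8'), "\n"; // prints 3
```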

oО清风挽发oО 2024-10-07 01:41:29

I ran into the same problem today and fixed it by making sure the string was set to UTF-8:

http://php.net/manual/en/function.utf8-encode.php

$content = utf8_encode($content);
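
One caveat worth checking before relying on this: utf8_encode() converts from ISO-8859-1 to UTF-8 and is deprecated as of PHP 8.2. If the input is already UTF-8 with a BOM, that conversion re-encodes the three BOM bytes instead of removing them. The sketch below performs the same ISO-8859-1 to UTF-8 conversion with mb_convert_encoding() (assuming mbstring is loaded) on a made-up sample string, so the effect is visible.

```php
<?php
// Sample input (made up): a UTF-8 BOM followed by ASCII text.
$content = "\xEF\xBB\xBF" . 'abc';

// The same conversion utf8_encode() performs: ISO-8859-1 -> UTF-8.
$converted = mb_convert_encoding($content, 'UTF-8', 'ISO-8859-1');

// Each BOM byte is above 0x7F, so each one becomes a two-byte sequence.
echo strlen($content), ' -> ', strlen($converted), "\n"; // prints "6 -> 9"
echo bin2hex(substr($converted, 0, 6)), "\n";            // prints "c3afc2bbc2bf"
```

So this approach seems to fit only when the data really is ISO-8859-1; for data that is already UTF-8, stripping the BOM explicitly is the safer route.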

忘东忘西忘不掉你 2024-10-07 01:41:29

To remove it from the beginning of the string (only):

$data = preg_replace('/^%EF%BB%BF/', '', $data);
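
Note that this pattern matches the literal text %EF%BB%BF, which is how the BOM looks after URL-encoding. If $data holds the raw bytes instead, the pattern needs byte escapes. A sketch that covers both cases, with a hypothetical function name:

```php
<?php
/**
 * Hypothetical helper: remove a UTF-8 BOM from the start of a string,
 * whether it appears as raw bytes or as percent-encoded text.
 */
function removeLeadingBom(string $data): string
{
    // In a single-quoted PHP string, \xEF reaches PCRE unchanged and is
    // matched there as the raw byte 0xEF. The inline (?i) makes only the
    // percent-encoded alternative case-insensitive (%ef%bb%bf also matches).
    $result = preg_replace('/^(?:\xEF\xBB\xBF|(?i)%EF%BB%BF)/', '', $data);

    // preg_replace() returns null on error; fall back to the input then.
    return $result ?? $data;
}
```

The ^ anchor keeps the replacement limited to the start of the string, so the same byte sequence appearing later in the data is left alone.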

甜扑 2024-10-07 01:41:29

$data = str_replace('%EF%BB%BF', '', $data);

You probably shouldn't use stripslashes. Unless the API returns backslash-escaped data (and there is a 99.99% chance it doesn't), take that call out.
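
To illustrate why an unnecessary stripslashes() call can do harm, here is a small example with a made-up value that contains a genuine backslash:

```php
<?php
// Made-up value with one literal backslash
// (inside single quotes, \t is not an escape sequence).
$raw = 'C:\temp';

// stripslashes() drops the backslash even though nothing had escaped the data.
$stripped = stripslashes($raw);

echo $stripped, "\n"; // prints "C:temp"
```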

时光病人 2024-10-07 01:41:29

You could use substr to get only the rest, without the UTF-8 BOM:

// if it's binary UTF-8
$data = substr($data, 3);
// if it's percent-encoded UTF-8
$data = substr($data, 9);

About the author

影子是时光的心
