Specifying blob encoding in Google Chrome

Posted on 2024-11-19 21:29:03

The following code (vendor normalized) works perfectly fine and displays "➀➁➂ Test" in Firefox 8, but displays "âž€âžâž‚ Test" in Google Chrome. Is there any way to preserve the encoding of blobs in Google Chrome, short of writing a file to a temporary filesystem using the filesystem API?

var b = new Blob(["➀➁➂ Test"], {type: "text/plain;charset=UTF-8"});
var url = URL.createObjectURL(b);
open(url);

Comments (3)

倦话 2024-11-26 21:29:03

new Blob(["➀➁➂ Test"]) will generate a Blob representing that text encoded as UTF-8.

That browsers assume text files should be read as ISO-8859-1 is a weird choice, in my opinion.
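
To see why a wrong charset guess produces the garbled output, here is a minimal sketch. It assumes an environment with TextEncoder and TextDecoder (modern browsers, Node.js). decodeAsLatin1 is a hypothetical helper, not a browser API; it maps each byte to one character, the way a single-byte ISO-8859-1 reading would.

```javascript
const text = "➀➁➂ Test";

// UTF-8 bytes of the sample text: each circled digit takes three bytes.
const utf8Bytes = new TextEncoder().encode(text);

// Hypothetical helper: read every byte as one character (single-byte decoding).
function decodeAsLatin1(bytes) {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

// Same bytes, two interpretations.
const misread = decodeAsLatin1(utf8Bytes);                 // garbled: one char per byte
const reread = new TextDecoder("utf-8").decode(utf8Bytes); // the original text

console.log(utf8Bytes.length, misread.length, reread === text);
```

The bytes are identical in both cases; only the decoder differs. The ASCII part " Test" survives either way, which matches the output described in the question.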

Passing { type: "text/plain;charset=utf8" } should generate the proper Content-Type header when browsers serve the Blob through a blob URI. That Chrome doesn't do so with open() sounds like a bug.

For now, you can work around this by prepending a BOM sequence at the beginning of your text file, so that Chrome detects it as UTF-8 even without Content-Type info:

var BOM = new Uint8Array([0xEF,0xBB,0xBF]);
var b = new Blob([ BOM, "➀➁➂ Test" ]);
var url = URL.createObjectURL(b);
open(url);

var BOM = new Uint8Array([0xEF,0xBB,0xBF]);

var blob_BOM = new Blob([ BOM, "➀➁➂ Test" ]);
var url_BOM = URL.createObjectURL(blob_BOM);
// for demo we also create one version without BOM
var blob_noBOM = new Blob([ "➀➁➂ Test" ]);
var url_noBOM = URL.createObjectURL(blob_noBOM);

document.querySelector('.BOM').href = url_BOM;
document.querySelector('.no-BOM').href = url_noBOM;

// to check whether they contain the same data, apart from the BOM
(async() => {
  const buf_BOM = await blob_BOM.slice(3).arrayBuffer(); // remove BOM sequence
  const buf_noBOM = await blob_noBOM.arrayBuffer();
  
  console.log( 'with BOM text data:' );
  console.log( JSON.stringify( [...new Uint8Array( buf_BOM )] ) );
  console.log( 'without BOM text data:' );
  console.log( JSON.stringify( [...new Uint8Array( buf_noBOM )] ) );

})();

HTML used by the script above:
<a class="BOM">open file with BOM</a><br>
<a class="no-BOM">open file without BOM</a>
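
The BOM workaround above can also be wrapped in a small helper. This is only a sketch: utf8WithBom and openUtf8Text are hypothetical names, and openUtf8Text assumes a browser with Blob, URL.createObjectURL and window.open, so it is defined here but not called.

```javascript
// Hypothetical helper: UTF-8 bytes of `text`, preceded by the UTF-8 BOM (EF BB BF).
function utf8WithBom(text) {
  const body = new TextEncoder().encode(text);
  const out = new Uint8Array(3 + body.length);
  out.set([0xef, 0xbb, 0xbf], 0); // BOM first
  out.set(body, 3);               // then the encoded text
  return out;
}

// Browser-only sketch: open the text in a new window through a blob URL.
function openUtf8Text(text) {
  const blob = new Blob([utf8WithBom(text)], { type: "text/plain;charset=utf-8" });
  return window.open(URL.createObjectURL(blob));
}

const bytes = utf8WithBom("➀➁➂ Test");
// TextDecoder strips a leading UTF-8 BOM by default, so the text round-trips.
const decoded = new TextDecoder("utf-8").decode(bytes);
console.log(bytes.length, decoded === "➀➁➂ Test");
```

Keeping the charset in the type as well costs nothing and lets browsers that honor it skip the BOM sniffing.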

装迷糊 2024-11-26 21:29:03

Gecko (Firefox), WebKit (Safari, Chrome) and Opera support the btoa function (non-standard when this was written, since standardized in HTML) for encoding a string in Base64. To get a Base64 string containing a string encoded as UTF-8, you need to use the encodeURIComponent-unescape trick. encodeURIComponent percent-encodes the UTF-8 bytes of the string, and unescape then decodes each %xx as a single character, which leaves one character per UTF-8 byte. btoa expects exactly such a binary string, in whatever encoding you want.

var data = "➀➁➂ Test"; // the text to display
var base64 = btoa(unescape(encodeURIComponent(data)));
window.open("data:text/plain;charset=UTF-8;base64,"+base64,"UTF-8 Text");

Of course this does not work in IE, but I think IE 10 will support the Blob-API. Who knows how it will handle encodings.

PS: IE does not seem to be able to window.open data: URLs, and would have a ridiculously small URL length limit anyway.
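
As a sketch of what the encodeURIComponent-unescape trick above does, the same Base64 string can be built from explicit UTF-8 bytes. This assumes an environment with btoa, atob, TextEncoder and TextDecoder (modern browsers, recent Node.js); utf8ToBase64 and base64ToUtf8 are hypothetical helper names.

```javascript
const data = "➀➁➂ Test";

// The legacy trick from the answer: one character per UTF-8 byte, then Base64.
const legacy = btoa(unescape(encodeURIComponent(data)));

// Hypothetical helper: the same result, built from explicit UTF-8 bytes.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return btoa(binary);
}

// Hypothetical helper: reverse direction, Base64 back to the original text.
function base64ToUtf8(base64) {
  const binary = atob(base64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder("utf-8").decode(bytes);
}

const dataUrl = "data:text/plain;charset=UTF-8;base64," + utf8ToBase64(data);
console.log(utf8ToBase64(data) === legacy, base64ToUtf8(legacy) === data);
```

Both routes produce the same Base64 text; the explicit version just avoids the deprecated unescape function.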

PPS: This works for me in Chrome:

var b = new Blob(["➀➁➂ Test"],{encoding:"UTF-8",type:"text/plain;charset=UTF-8"});
var url = URL.createObjectURL(b);
window.open(url,"_blank","");

眼睛会笑 2024-11-26 21:29:03

The problem is the default page encoding for new tabs in Chrome. When the new window opens (after window.open(url)), choose View > Encoding > Unicode from the Chrome menu. For me, in Chrome 13, this changed the displayed text from "âž€âžâž‚ Test" to "➀➁➂ Test".

If you want a solution that lets you open blobs in new windows regardless of the default encoding, you can rely on the fact that a document in an iframe inherits the parent document's encoding when it doesn't explicitly specify its own. So you can open a window with a blank HTML document served with a Content-Type: text/html; charset=utf-8 header, then append an iframe to the body with the src attribute set to the blob URL.
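
The iframe approach described above might look roughly like the sketch below. Everything named here is an assumption for illustration: attachBlobFrame and openBlobInUtf8Window are hypothetical helpers, and wrapperUrl stands for a same-origin blank page that your server sends with Content-Type: text/html; charset=utf-8. Whether the iframe actually inherits the encoding depends on the browser.

```javascript
// Hypothetical helper: append an iframe that points at blobUrl to the given document.
function attachBlobFrame(doc, blobUrl) {
  const frame = doc.createElement("iframe");
  frame.src = blobUrl;
  doc.body.appendChild(frame);
  return frame;
}

// Browser-only sketch, defined but not called here.
// wrapperUrl is assumed to be a blank, same-origin page served as UTF-8.
function openBlobInUtf8Window(blob, wrapperUrl) {
  const blobUrl = URL.createObjectURL(blob);
  const win = window.open(wrapperUrl, "_blank");
  if (!win) return null; // the popup may have been blocked
  // Wait for the wrapper page to load before touching its body.
  win.addEventListener("load", () => attachBlobFrame(win.document, blobUrl));
  return win;
}
```

Passing the document in as a parameter keeps attachBlobFrame independent of any particular window, so it can be reused or exercised with a stand-in document object.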
