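Converting between strings and ArrayBuffers
Is there a commonly accepted technique for efficiently converting JavaScript strings to ArrayBuffers and vice versa? Specifically, I'd like to be able to write the contents of an ArrayBuffer to localStorage and then read it back.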
Comments (29)
Yes:
After playing with mangini's solution for converting from ArrayBuffer to String, ab2str (which is the most elegant and useful one I have found, thanks!), I had some issues when handling large arrays. More specifically, calling String.fromCharCode.apply(null, new Uint16Array(buf)); throws an error: arguments array passed to Function.prototype.apply is too large.
To work around it, I decided to handle the input ArrayBuffer in chunks. So the modified solution is:
The chunk size is set to 2^16 because this was the size I found to work in my development environment. Setting a higher value caused the same error to recur. It can be altered by setting the CHUNK_SIZE variable to a different value. It is important that the chunk size is an even number.
Note on performance: I did not run any performance tests for this solution. However, since it is based on the previous solution and can handle large arrays, I see no reason not to use it.
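The chunked approach described above can be sketched roughly as follows. This is not the answer's original code; the function body is an assumption built from the description (a CHUNK_SIZE of 2^16 bytes, each chunk decoded through String.fromCharCode.apply). Note that in JavaScript the expression 2^16 is a bitwise XOR, so the sketch writes the power as 2 ** 16.

```javascript
// Bytes per chunk. Must be even, because each UTF-16 code unit takes 2 bytes.
const CHUNK_SIZE = 2 ** 16;

// Sketch: decode an ArrayBuffer holding UTF-16 code units, one chunk at a time,
// so that no single apply() call receives too many arguments.
// Assumes buf.byteLength is even.
function ab2str(buf) {
  let result = "";
  for (let offset = 0; offset < buf.byteLength; offset += CHUNK_SIZE) {
    const bytesLeft = buf.byteLength - offset;
    const unitCount = Math.min(CHUNK_SIZE, bytesLeft) / 2;
    // View over this chunk only; no bytes are copied.
    const codeUnits = new Uint16Array(buf, offset, unitCount);
    result += String.fromCharCode.apply(null, codeUnits);
  }
  return result;
}
```

The argument limit differs between engines, so a chunk size that works in one environment may still be too large in another.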
The "native" binary string that atob() returns is a 1-byte-per-character array.
So we shouldn't store 2 bytes in one character.
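To illustrate the one-byte-per-character idea, here is a minimal sketch that goes through btoa() and atob(). The helper names bytesToBinaryString and binaryStringToBytes are made up for this example and are not from the answer.

```javascript
// Each character of a binary string carries exactly one byte (char code 0-255).
function bytesToBinaryString(bytes) {
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return binary;
}

function binaryStringToBytes(binary) {
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}

const bytes = new Uint8Array([0, 72, 105, 200, 255]);
// Base64 text is plain ASCII, so it can be stored as an ordinary string.
const base64 = btoa(bytesToBinaryString(bytes));
// atob() gives back a binary string with one character per original byte.
const restored = binaryStringToBytes(atob(base64));
```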
ArrayBuffer -> Buffer -> String (Base64)
Change the ArrayBuffer to a Buffer, and then to a String.
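A minimal sketch of the ArrayBuffer -> Buffer -> Base64 string route, assuming a Node.js environment where Buffer is a global (a browser would need a polyfill). The sample bytes are arbitrary.

```javascript
// Node.js: Buffer is available as a global.
const arrayBuffer = new Uint8Array([1, 2, 3, 250]).buffer;

// ArrayBuffer -> Buffer -> Base64 string
const base64 = Buffer.from(arrayBuffer).toString("base64");

// Base64 string -> Buffer -> ArrayBuffer
const decoded = Buffer.from(base64, "base64");
// A Buffer may be a view into a larger pooled allocation, so copy out
// only the bytes that belong to it.
const restored = decoded.buffer.slice(
  decoded.byteOffset,
  decoded.byteOffset + decoded.byteLength
);
```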
Use splat unpacking instead of loops:
For substrings, arrbuf.slice() can be employed.
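The splat (spread) idea might look like the sketch below. It assumes the buffer holds one byte per character. Spreading passes every byte as a separate argument, so very large buffers can still hit the engine's argument limit.

```javascript
const arrbuf = new Uint8Array([72, 101, 108, 108, 111]).buffer;

// Spread the bytes as individual arguments instead of looping over them.
const str = String.fromCharCode(...new Uint8Array(arrbuf));

// slice() copies a byte range of the buffer, which gives a substring.
const sub = String.fromCharCode(...new Uint8Array(arrbuf.slice(1, 4)));
```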
I'd recommend NOT using deprecated APIs like BlobBuilder.
BlobBuilder has long been deprecated in favor of the Blob object. Compare the code in Dennis' answer, where BlobBuilder is used, with the code below:
Note how much cleaner and less bloated this is compared to the deprecated method. This is definitely something to consider here.
See https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder/decode
From emscripten:
Use like:
Update 2016: five years on, there are now new methods in the specs (see support below) to convert between strings and typed arrays using proper encoding.
TextEncoder
The TextEncoder represents:
Change note since the above was written: (ibid.)
*) Updated specs (W3) and here (whatwg).
After creating an instance of the TextEncoder, it will take a string and encode it using a given encoding parameter:
You can then of course use the .buffer property on the resulting Uint8Array to convert the underlying ArrayBuffer to a different view if needed.
Just make sure that the characters in the string adhere to the encoding scheme; for example, characters outside the ASCII range are encoded in UTF-8 as two or more bytes instead of one.
For general use you would use UTF-16 encoding for things like localStorage.
TextDecoder
Likewise, the opposite process uses the TextDecoder:
All available decoding types can be found here.
The MDN StringView library
An alternative to these is to use the StringView library (licensed as lgpl-3.0), whose goal is:
giving much more flexibility. However, it would require us to link to or embed this library, while TextEncoder/TextDecoder is built into modern browsers.
Support
As of July 2018:
TextEncoder (Experimental, On Standard Track)
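As a rough illustration of the TextEncoder/TextDecoder round trip described in this answer, here is a minimal sketch. It assumes an environment where both classes are globals (modern browsers, recent Node.js); the sample string is arbitrary.

```javascript
const encoder = new TextEncoder();        // encodes to UTF-8
const decoder = new TextDecoder("utf-8");

// "\u00e9" (e with acute accent) is outside ASCII, so it takes two bytes in UTF-8.
const bytes = encoder.encode("h\u00e9llo"); // Uint8Array

// Copy out exactly the encoded bytes as a standalone ArrayBuffer.
const arrayBuffer = bytes.buffer.slice(
  bytes.byteOffset,
  bytes.byteOffset + bytes.byteLength
);

// decode() accepts an ArrayBuffer or a typed array view.
const text = decoder.decode(arrayBuffer);
```

Here bytes.length is 6 for a 5-character string, which shows why byte counts and string lengths should not be assumed equal.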
Although Dennis' and gengkev's solutions using Blob/FileReader work, I wouldn't suggest taking that approach. It is an asynchronous approach to a simple problem, and it is much slower than a direct solution. I've made a post on html5rocks with a simpler and much faster solution:
http://updates.html5rocks.com/2012/06/How-to-convert-ArrayBuffer-to-and-from-String
And the solution is:
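The code that followed here is missing from this copy. A direct, synchronous conversion in the spirit of this answer could be sketched as below; it assumes two bytes per character (UTF-16 code units), matching the Uint16Array call quoted elsewhere in this thread, and may differ from the article's exact code.

```javascript
// ArrayBuffer (UTF-16 code units) -> string
function ab2str(buf) {
  return String.fromCharCode.apply(null, new Uint16Array(buf));
}

// string -> ArrayBuffer, 2 bytes per character
function str2ab(str) {
  const buf = new ArrayBuffer(str.length * 2);
  const bufView = new Uint16Array(buf);
  for (let i = 0; i < str.length; i++) {
    bufView[i] = str.charCodeAt(i);
  }
  return buf;
}
```

Because apply() receives one argument per code unit, this form can fail on large buffers, as other answers in the thread point out.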
EDIT:
The Encoding API helps solve the string conversion problem. Check out the response from Jeff Posnik on Html5Rocks.com to the original article above.
Excerpt:
You can use TextEncoder and TextDecoder from the Encoding standard, which is polyfilled by the stringencoding library, to convert strings to and from ArrayBuffers:
Blob is much slower than String.fromCharCode.apply(null, array); but that fails if the array buffer gets too big. The best solution I have found is to use String.fromCharCode.apply(null, array); and split it up into operations that won't blow the stack, but are faster than a single char at a time.
The best solution for a large array buffer is:
I found this to be about 20 times faster than using Blob. It also works for large strings of over 100 MB.
In case you have binary data in a string (obtained from nodejs + readFile(..., 'binary'), or cypress + cy.fixture(..., 'binary'), etc.), you can't use TextEncoder. It supports only utf8. Bytes with values >= 128 are each turned into 2 bytes.
ES2015:
Uint8Array(33) [2, 134, 140, 186, 82, 70, 108, 182, 233, 40, 143, 247, 29, 76, 245, 206, 29, 87, 48, 160, 78, 225, 242, 56, 236, 201, 80, 80, 152, 118, 92, 144, 48]
"ºRFl¶é(÷LõÎW0 Náò8ìÉPPv\0"
Based on gengkev's answer, I created functions for both directions, because BlobBuilder can handle both String and ArrayBuffer:
and
A simple test:
Just
Or
Why are you all making this so complicated?
All of the following is about getting binary strings from array buffers.
I'd recommend not to use
because it
Maximum call stack size exceeded error on a 120000-byte buffer (Chrome 29)
If you really need a synchronous solution, use something like
It is as slow as the previous one but works correctly. At the time of writing there seems to be no really fast synchronous solution to this problem (all libraries mentioned in this topic use the same approach for their synchronous features).
But what I really recommend is using the Blob + FileReader approach.
The only disadvantage (not for everyone) is that it is asynchronous. And it is about 8-10 times faster than the previous solutions! (Some details: the synchronous solution in my environment took 950-1050 ms for a 2.4 MB buffer, but the FileReader solution took about 100-120 ms for the same amount of data. I also tested both synchronous solutions on a 100 KB buffer and they took almost the same time, so the loop is not much slower than using 'apply'.)
BTW, in How to convert ArrayBuffer to and from String the author compares the same two approaches and gets completely opposite results (his test code is here). Why such different results? Probably because his test string is only 1 KB long (he called it "veryLongStr"). My buffer was a really big JPEG image of 2.4 MB.
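The synchronous loop mentioned in this answer is not preserved in this copy. One plausible form, given here only as a sketch with an invented function name, builds the binary string one byte at a time, which avoids passing a huge argument list to apply():

```javascript
// Sketch of a synchronous ArrayBuffer -> binary string conversion.
// One character is appended per byte, so buffer size does not affect
// the call stack or the argument count.
function arrayBufferToBinaryString(arrayBuffer) {
  const bytes = new Uint8Array(arrayBuffer);
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return binary;
}
```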
(Update: please see the second half of this answer, where I have (hopefully) provided a more complete solution.)
I also ran into this issue; the following works for me in FF 6 (for one direction):
Unfortunately, of course, you end up with ASCII text representations of the values in the array, rather than characters. It still should be much more efficient than a loop, though.
E.g., for the example above, the result is 0004000000, rather than several null chars and a chr(4).
Edit:
After looking on MDC here, you may create an ArrayBuffer from an Array as follows:
To answer your original question, this allows you to convert ArrayBuffer <-> String as follows:
For convenience, here is a function for converting a raw Unicode String to an ArrayBuffer (will only work with ASCII/one-byte characters):
The above allows you to go from ArrayBuffer -> String and back to ArrayBuffer again, where the string may be stored in e.g. localStorage :)
Hope this helps,
Dan
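The steps in this answer can be sketched as follows. The snippets from the original post are not preserved here, so the function name asciiToArrayBuffer and the Map standing in for localStorage are assumptions made for illustration (the Map lets the sketch run outside a browser).

```javascript
// ArrayBuffer created from a plain Array of byte values.
const buf = new Uint8Array([72, 105, 33]).buffer;

// ArrayBuffer -> String (one character per byte).
const str = String.fromCharCode.apply(null, new Uint8Array(buf));

// String -> ArrayBuffer. Only valid for ASCII/one-byte characters:
// char codes above 255 would be truncated by the Uint8Array.
function asciiToArrayBuffer(s) {
  const bytes = new Uint8Array(s.length);
  for (let i = 0; i < s.length; i++) bytes[i] = s.charCodeAt(i);
  return bytes.buffer;
}

// Stand-in for localStorage; in a browser, setItem/getItem play this role.
const storage = new Map();
storage.set("key", str);
const roundTripped = asciiToArrayBuffer(storage.get("key"));
```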
Unlike the solutions here, I needed to convert to/from UTF-8 data. For this purpose, I coded the following two functions, using the (un)escape/(en)decodeURIComponent trick. They're pretty wasteful of memory, allocating 9 times the length of the encoded UTF-8 string, though that should be reclaimed by the GC. Just don't use them for 100 MB of text.
Checking that it works:
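The (un)escape/(en)decodeURIComponent trick mentioned above can be sketched like this. The two function names are invented for the example and the answer's own functions may differ. escape() and unescape() are legacy functions, but they are still available in browsers and Node.js.

```javascript
// String -> UTF-8 bytes.
// encodeURIComponent percent-encodes the UTF-8 bytes of the string;
// unescape turns every %XX back into one character with that byte value.
function stringToUtf8Bytes(str) {
  const binary = unescape(encodeURIComponent(str));
  return Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
}

// UTF-8 bytes -> String: the same steps in reverse.
function utf8BytesToString(bytes) {
  let binary = "";
  for (const b of bytes) binary += String.fromCharCode(b);
  return decodeURIComponent(escape(binary));
}
```

decodeURIComponent throws a URIError when the bytes are not valid UTF-8, so callers may want to wrap utf8BytesToString in a try/catch.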
For node.js and also for browsers using https://github.com/feross/buffer
Note: The solutions here didn't work for me. I need to support node.js and browsers and just serialize a Uint8Array to a string. I could serialize it as a number[], but that takes up unnecessary space. With this solution I don't need to worry about encodings since it's base64. Just in case other people struggle with the same problem... My two cents.
I found I had problems with this approach, basically because I was trying to write the output to a file and it was not encoded properly. Since JS seems to use UCS-2 encoding (source, source), we need to stretch this solution a step further; here's my enhanced solution that works for me.
I had no difficulties with generic text, but when it came to Arabic or Korean, the output file didn't have all the characters and showed error characters instead.
File output:
","10k unit":"",Follow:"Õ©íüY‹","Follow %{screen_name}":"%{screen_name}U“’Õ©íü",Tweet:"ĤüÈ","Tweet %{hashtag}":"%{hashtag} ’ĤüÈY‹","Tweet to %{name}":"%{name}U“xĤüÈY‹"},ko:{"%{followers_count} followers":"%{followers_count}…X \Ì","100K+":"100Ì tÁ","10k unit":"Ì è",Follow:"\°","Follow %{screen_name}":"%{screen_name} Ø \°X0",K:"œ",M:"1Ì",Tweet:"¸","Tweet %{hashtag}":"%{hashtag}
Original:
","10k unit":"万",Follow:"フォローする","Follow %{screen_name}":"%{screen_name}さんをフォロー",Tweet:"ツイート","Tweet %{hashtag}":"%{hashtag} をツイートする","Tweet to %{name}":"%{name}さんへツイートする"},ko:{"%{followers_count} followers":"%{followers_count}명의 팔로워","100K+":"100만 이상","10k unit":"만 단위",Follow:"팔로우","Follow %{screen_name}":"%{screen_name} 님 팔로우하기",K:"천",M:"백만",Tweet:"트윗","Tweet %{hashtag}":"%{hashtag}
I took the information from dennis' solution and this post I found.
Here's my code:
This allows me to save the content to a file without encoding problems.
How it works:
It basically takes the single 8-bit bytes that make up a UTF-8 character and saves each of them as a single character (so a UTF-8 character built this way can consist of 1-4 of these characters).
UTF-8 encodes characters in a format that varies from 1 to 4 bytes in length. What we do here is encode the string as a URI component and then translate that component into the corresponding 8-bit characters. This way we don't lose the information carried by UTF-8 characters that are more than 1 byte long.
If you are working with a huge array, for example arr.length=1000000, you can use this code to avoid call stack problems:
Reverse function, from mangini's answer at the top:
The following is a working TypeScript implementation:
I've used this for numerous operations while working with crypto.subtle.
Well, here's a somewhat convoluted way of doing the same thing:
Edit: BlobBuilder has long been deprecated in favor of the Blob constructor, which did not exist when I first wrote this post. Here's an updated version. (And yes, this has always been a very silly way to do the conversion, but it was just for fun!)
Recently I also needed to do this for one of my projects, so I did some research and found a result from Google's developer community that states this in a simple manner:
For ArrayBuffer to String
For String to ArrayBuffer
For a more detailed reference, you can refer to this blog post by Google.
For me this worked well.
Let's say you have an ArrayBuffer binaryStr:
and then you assign the text to the state.
See here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Typed_arrays/StringView
(a C-like interface for strings based upon the JavaScript ArrayBuffer interface)
I used this and it works for me.