JavaScript string compression with localStorage
I am using localStorage in a project, and it will need to store lots of data, mostly of type int, bool and string. I know that JavaScript strings are Unicode, but when stored in localStorage, do they stay Unicode? If so, is there a way I could compress the string to use all of the data in a Unicode byte, or should I just use base64 and have less compression? All of the data will be stored as one large string.
EDIT: Now that I think about it, base64 wouldn't do much compression at all. The data is already in base 64: a-zA-Z0-9 ;: is 65 characters.
Comments (5)
"when stored in localStorage, do they stay unicode?"
The Web Storage working draft defines local storage values as DOMString. DOMStrings are defined as sequences of 16-bit units using the UTF-16 encoding. So yes, they stay Unicode.
"is there a way I could compress the string to use all of the data in a unicode byte...?"
"Base32k" encoding should give you 15 bits per character. A base32k-type encoding takes advantage of the full 16 bits in UTF-16 characters, but loses a bit to avoid tripping on double-word characters. If your original data is base64 encoded, it only uses 6 bits per character. Encoding those 6 bits into base32k should compress it to 6/15 = 40% of its original size. See http://lists.xml.org/archives/xml-dev/200307/msg00505.html and http://lists.xml.org/archives/xml-dev/200307/msg00507.html.
For even further reduction in size, you can decode your base64 strings into their full 8-bit binary, compress them with some known compression algorithm (e.g. see a JavaScript implementation of gzip), and then base32k encode the compressed output.
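To make the "15 bits per character" idea concrete, here is a minimal sketch. It is not the Base32k scheme from the linked posts: it simply packs bytes 15 bits at a time into UTF-16 code units starting at an assumed offset of 0x4E00, which keeps every output unit below the surrogate range (0xD800 to 0xDFFF). The names encode15 and decode15 and the decimal length prefix are hypothetical choices for illustration.

```javascript
// Assumed offset: code units 0x4E00..0xCDFF never touch the surrogate range.
const OFFSET = 0x4E00;

// Pack a Uint8Array into a string carrying 15 payload bits per character.
function encode15(bytes) {
  let out = "";
  let acc = 0;   // bit accumulator
  let nbits = 0; // number of valid bits currently in acc
  for (const b of bytes) {
    acc = (acc << 8) | b;
    nbits += 8;
    if (nbits >= 15) {
      nbits -= 15;
      out += String.fromCharCode(OFFSET + ((acc >>> nbits) & 0x7FFF));
      acc &= (1 << nbits) - 1; // keep only the bits not yet emitted
    }
  }
  if (nbits > 0) {
    // Pad the final partial group with zero bits on the right.
    out += String.fromCharCode(OFFSET + ((acc << (15 - nbits)) & 0x7FFF));
  }
  // Prefix the byte count so the decoder can drop the padding.
  return bytes.length + ":" + out;
}

// Reverse of encode15: returns a Uint8Array.
function decode15(text) {
  const sep = text.indexOf(":");
  const length = parseInt(text.slice(0, sep), 10);
  const bytes = new Uint8Array(length);
  let acc = 0;
  let nbits = 0;
  let pos = 0;
  for (let i = sep + 1; i < text.length && pos < length; i++) {
    acc = (acc << 15) | (text.charCodeAt(i) - OFFSET);
    nbits += 15;
    while (nbits >= 8 && pos < length) {
      nbits -= 8;
      bytes[pos++] = (acc >>> nbits) & 0xFF;
    }
    acc &= (1 << nbits) - 1;
  }
  return bytes;
}
```

With this layout, 256 bytes become 137 payload characters plus the short length prefix, compared with 344 characters for the same bytes in base64.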
You could encode to Base64 and then implement a simple lossless compression algorithm, such as run-length encoding or Golomb encoding. This shouldn't be too hard to do and might give you a bit of compression.
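As a rough illustration of the run-length option, here is a minimal sketch. It assumes a naive pair format (one character followed by one code unit holding the run length), which is a choice made for this example and not something the answer specifies. The function names and the MAX_RUN cap are hypothetical. Note that this format doubles the size of data with no repeated characters, so it only pays off when runs are common.

```javascript
// Cap runs below the surrogate range so the count is always a valid lone code unit.
const MAX_RUN = 0xD7FF;

// Encode each run as: the character, then a code unit whose value is the run length.
function rleEncode(text) {
  let out = "";
  let i = 0;
  while (i < text.length) {
    const ch = text[i];
    let run = 1;
    while (i + run < text.length && text[i + run] === ch && run < MAX_RUN) {
      run++;
    }
    out += ch + String.fromCharCode(run);
    i += run;
  }
  return out;
}

// Decode pairs of (character, run length) back into the original string.
function rleDecode(encoded) {
  let out = "";
  for (let i = 0; i < encoded.length; i += 2) {
    out += encoded[i].repeat(encoded.charCodeAt(i + 1));
  }
  return out;
}
```

For example, rleDecode(rleEncode("aaaabbb;;;;;;0")) gives back the original 14-character string, and the encoded form is 8 code units long.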
Golomb encoding
I also found JsZip. I guess you could check the code and only use the algorithm, if it is compatible.
Hope this helps.
http://jszip.stuartk.co.uk/
I recently had to save huge JSON objects in localStorage.
Firstly, yeah, they do stay unicode. But don't try to save something like an object straight to local storage. It needs to be a string.
Here are some compression techniques I used (that seemed to work well in my case), before converting my object to a string:
Any numbers can be converted from a base of 10 to a base of 36 by doing something like (+num).toString(36). For example the number 48346942 will then be "ss8qm" which is (including the quotes) 1 character less. It is possible that the addition of the quotes will actually add to the character count. So the larger the number the better the payoff. To convert it back you would do something like parseInt("ss8qm", 36).
If you are storing an object with any key that will repeat, it's best to create a lookup object where you assign a shortened key to the original, and then store the data under the shortened keys.
Again, this pays off with size and repetition. In my case it worked really well, but it depends on the subject.
All of these require a function to shrink and one to expand back out.
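A minimal sketch of such a shrink/expand pair, combining the base-36 numbers and the key lookup described above. The field names (firstName, lastName, accountId), the short aliases, and the choice of which fields hold numbers are all hypothetical, since the original examples are not shown. It assumes the numeric fields hold integers, because parseInt would drop any fractional part.

```javascript
// Hypothetical lookup: original key -> shortened key.
const KEY_MAP = new Map([
  ["firstName", "f"],
  ["lastName", "l"],
  ["accountId", "a"],
]);
// Reverse lookup: shortened key -> original key.
const REVERSE_MAP = new Map([...KEY_MAP].map(([long, short]) => [short, long]));
// Hypothetical list of fields whose integer values are stored in base 36.
const NUMERIC_KEYS = new Set(["accountId"]);

// Shorten keys and convert the chosen numbers to base 36 before JSON.stringify.
function shrink(records) {
  return records.map((record) => {
    const out = {};
    for (const [key, value] of Object.entries(record)) {
      const shortKey = KEY_MAP.has(key) ? KEY_MAP.get(key) : key;
      out[shortKey] = NUMERIC_KEYS.has(key) ? (+value).toString(36) : value;
    }
    return out;
  });
}

// Restore the original keys and numbers after JSON.parse.
function expand(records) {
  return records.map((record) => {
    const out = {};
    for (const [shortKey, value] of Object.entries(record)) {
      const key = REVERSE_MAP.has(shortKey) ? REVERSE_MAP.get(shortKey) : shortKey;
      out[key] = NUMERIC_KEYS.has(key) ? parseInt(value, 36) : value;
    }
    return out;
  });
}
```

With the number from the example above, shrink([{ firstName: "Ada", lastName: "Lovelace", accountId: 48346942 }]) serializes to [{"f":"Ada","l":"Lovelace","a":"ss8qm"}], and expand restores the original record.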
Also, I would recommend creating a class that is used to store and retrieve data from local storage. I ran into there not being enough space, so the writes would fail. Other sites may also write to local storage, which can take away some of that space. See this post for more details.
What I did, in the class I built, was first attempt to remove any item with the given key, then attempt the setItem. These two lines are wrapped in a try/catch. If that fails, it assumes the storage is full and clears everything in localStorage in an attempt to make room. After the clear, it attempts the setItem again. This, too, is wrapped in a try/catch, since it may fail if the string itself is larger than what localStorage can handle.
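The write strategy described above could be sketched roughly as follows. This is not the author's class: the name SafeStore and its methods are hypothetical, and the storage object is passed in (for example window.localStorage in a browser) so the logic can be exercised against any object with the same getItem, setItem, removeItem and clear methods.

```javascript
class SafeStore {
  // storage: any object with the Web Storage methods, e.g. window.localStorage.
  constructor(storage) {
    this.storage = storage;
  }

  // Returns true if the value was written, false if it could not be stored at all.
  set(key, value) {
    try {
      // First attempt: drop any old value for this key, then write the new one.
      this.storage.removeItem(key);
      this.storage.setItem(key, value);
      return true;
    } catch (err) {
      // Assume the storage is full and fall through to the second attempt.
    }
    try {
      // Second attempt: clear everything to make room, then write again.
      this.storage.clear();
      this.storage.setItem(key, value);
      return true;
    } catch (err) {
      // The value itself is probably larger than the storage can hold.
      return false;
    }
  }

  // Returns the stored string, or null if the key is absent.
  get(key) {
    return this.storage.getItem(key);
  }
}
```

Usage in a browser would look like new SafeStore(window.localStorage).set("data", bigString). Keep in mind that the second attempt wipes every other key in the storage area, which is the trade-off described above.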
EDIT: Also, you will come across the LZW compression a lot of people mention. I had implemented that, and it worked for small strings. But with large strings it would begin using invalid characters, which resulted in corrupt data. So just be careful, and if you go in that direction: test, test, test.
This Stack Overflow question has an answer that may help. There is a link to a JavaScript compression library.
Base64 compression for JavaScript is very well explained at this blog. An implementation is also available here (https://github.com/venalis/VNLS) when using the entire framework.