Can integer keys/values be stored in LevelDB?
I have searched for key-value stores that support integer keys and integer values. LevelDB seems like a good option, though I can't find any information on whether integer keys/values are supported.
You can store pretty much anything in LevelDB. You pass opaque slices of data into LevelDB via the Slice structure, and that's pretty much it!

However, one thing to note is that while it's generally fine to store integers in LevelDB (as both keys and values), they will be ordered by the BytewiseComparator, so your keys have to support bytewise comparison. This also means that if you rely on a specific ordering of the keys, you have to be mindful of the endianness of your system.

You can also write your own comparator via the Comparator interface, which allows you to replace the default BytewiseComparator.
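To make the endianness point concrete, here is a minimal sketch, not LevelDB code. It uses std::string as a stand-in for Slice, and EncodeLittleEndian, DecodeLittleEndian and BytewiseCompare are hypothetical helpers written for this illustration. The comparison (memcmp over the common prefix, then the shorter key first) is my understanding of how the default bytewise ordering behaves.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: serialize v least significant byte first, which is
// the in-memory layout of an integer on a little-endian host.
std::string EncodeLittleEndian(std::uint32_t v) {
  std::string out(4, '\0');
  for (int i = 0; i < 4; ++i) {
    out[i] = static_cast<char>((v >> (8 * i)) & 0xFF);
  }
  return out;
}

// Hypothetical helper: inverse of EncodeLittleEndian.
std::uint32_t DecodeLittleEndian(const std::string& key) {
  assert(key.size() == 4);
  std::uint32_t v = 0;
  for (int i = 0; i < 4; ++i) {
    v |= static_cast<std::uint32_t>(static_cast<unsigned char>(key[i])) << (8 * i);
  }
  return v;
}

// Stand-in for bytewise ordering: compare the common prefix as unsigned
// bytes, then let the shorter key sort first. Returns -1, 0 or 1.
int BytewiseCompare(const std::string& a, const std::string& b) {
  const std::size_t n = a.size() < b.size() ? a.size() : b.size();
  const int r = std::memcmp(a.data(), b.data(), n);
  if (r != 0) return r < 0 ? -1 : 1;
  if (a.size() == b.size()) return 0;
  return a.size() < b.size() ? -1 : 1;
}
```

With this encoding, 256 becomes the bytes 00 01 00 00 and 1 becomes 01 00 00 00, so BytewiseCompare places the key for 256 before the key for 1. That is the kind of surprise you get from writing raw little-endian integers as keys.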
In many cases a more elaborate encoding scheme for integer keys is a better choice. Packing an int into its two's-complement representation in a char* (as suggested in another answer to this question) is one option; varint encoding is another (it saves space for small integers and can store arbitrary numbers without an upper bound).
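As a rough illustration of the varint idea, here is a sketch of one common variant: base-128, seven payload bits per byte, least significant group first. AppendVarint and ReadVarint are hypothetical names, and the sketch is capped at 64-bit values for simplicity even though the scheme itself has no upper bound.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append v as a base-128 varint: 7 payload bits per byte, least significant
// group first, high bit set on every byte except the last one.
void AppendVarint(std::string* out, std::uint64_t v) {
  assert(out != nullptr);
  while (v >= 0x80) {
    out->push_back(static_cast<char>((v & 0x7F) | 0x80));
    v >>= 7;
  }
  out->push_back(static_cast<char>(v));
}

// Decode a varint from the start of `in`. Returns the number of bytes
// consumed, or 0 if the input is truncated or longer than 10 bytes.
std::size_t ReadVarint(const std::string& in, std::uint64_t* value) {
  assert(value != nullptr);
  std::uint64_t result = 0;
  for (std::size_t i = 0; i < in.size() && i < 10; ++i) {
    const std::uint64_t byte = static_cast<unsigned char>(in[i]);
    result |= (byte & 0x7F) << (7 * i);
    if ((byte & 0x80) == 0) {
      *value = result;
      return i + 1;
    }
  }
  return 0;
}
```

Values below 128 take a single byte, and 300 takes two (AC 02). One caveat for keys: this particular layout does not sort numerically under bytewise comparison (255 encodes as FF 01, 256 as 80 02), so it fits values better than keys unless you also supply a matching comparator.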
To enlarge on Link's answer, partly because I've just been playing with this exact thing as part of the book I'm writing, you can see the BytewiseComparator results he/she talks about below.
Another approach is to flip your binary integers to big-endian format so they sort correctly with the default comparator. This also makes it easier to compose keys.
uint32_t flippedI = htonl(i);
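A small sketch of how the flipped integer might be used to compose a key, assuming a POSIX system for htonl/ntohl. MakeKey, IdFromKey and the "u:" prefix in the note below are made up for illustration, and std::string again stands in for the bytes you would hand to LevelDB.

```cpp
#include <arpa/inet.h>  // htonl, ntohl (POSIX)
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: build "<prefix><4-byte big-endian id>".
std::string MakeKey(const std::string& prefix, std::uint32_t id) {
  const std::uint32_t big = htonl(id);  // most significant byte first
  std::string key = prefix;
  key.append(reinterpret_cast<const char*>(&big), sizeof(big));
  return key;
}

// Hypothetical helper: recover the id from a key built by MakeKey.
std::uint32_t IdFromKey(const std::string& key, std::size_t prefix_len) {
  assert(key.size() == prefix_len + sizeof(std::uint32_t));
  std::uint32_t big;
  std::memcpy(&big, key.data() + prefix_len, sizeof(big));
  return ntohl(big);
}
```

Because the most significant byte comes first, MakeKey("u:", 255) compares below MakeKey("u:", 256) byte by byte, so keys that share a prefix should come back in numeric order under the default comparator. This only holds for unsigned values; negative numbers would need extra handling.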
Note that LevelDB is very fast. I've done tests on an iPhone 4 with 50,000 text-keyed records with secondary keys, so about 100,000 key/value pairs, and it screams along.
It is very easy to write a custom Comparator which is used by your database forevermore and still uses BytewiseComparator for keys other than your numbers. The biggest issue is deciding which keys are covered by your custom rules.
A trivial way would be to say that all non-integer keys are more than 4 characters long, so you can assume a 4-byte key is an integer. That means you just need to add trailing spaces or something else to push shorter keys past that length. It's all very arbitrary and up to you, but remember that the only two pieces of information you have are the key content and its length. There's no other metadata for a given key.
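The 4-byte rule could be sketched like this. It is only the comparison logic, written against std::string rather than the real Comparator interface; IntKey, KeyToInt and MixedCompare are hypothetical names, and putting integer keys ahead of all other keys is my own assumption, since some total order has to be picked. A real LevelDB comparator would subclass leveldb::Comparator and, as far as I know, also has to provide a name and the key-shortening hooks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: the raw native-endian bytes of v as a 4-byte key.
std::string IntKey(std::uint32_t v) {
  return std::string(reinterpret_cast<const char*>(&v), sizeof(v));
}

// Hypothetical helper: read a 4-byte key back as a native integer.
std::uint32_t KeyToInt(const std::string& key) {
  assert(key.size() == sizeof(std::uint32_t));
  std::uint32_t v;
  std::memcpy(&v, key.data(), sizeof(v));
  return v;
}

// Three-way compare (-1, 0, 1). Keys of exactly 4 bytes are treated as
// integers and compared numerically; everything else is compared bytewise.
int MixedCompare(const std::string& a, const std::string& b) {
  const bool a_int = a.size() == sizeof(std::uint32_t);
  const bool b_int = b.size() == sizeof(std::uint32_t);
  if (a_int && b_int) {
    const std::uint32_t x = KeyToInt(a);
    const std::uint32_t y = KeyToInt(b);
    return x < y ? -1 : (x > y ? 1 : 0);
  }
  // Assumption for this sketch: integer keys sort before all other keys.
  if (a_int != b_int) return a_int ? -1 : 1;
  // Neither key is an integer: plain bytewise order, shorter key first on a tie.
  const std::size_t n = std::min(a.size(), b.size());
  const int r = std::memcmp(a.data(), b.data(), n);
  if (r != 0) return r < 0 ? -1 : 1;
  if (a.size() == b.size()) return 0;
  return a.size() < b.size() ? -1 : 1;
}
```

Since the integer branch compares decoded values, the keys can stay in native byte order and 1 still sorts before 256. The catch is the one already mentioned: a text key that happens to be exactly 4 bytes long would be misread as a number, so such keys need padding.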
Part of the results from a sample for standard comparator with int keys starting at 1 and going up by 1 to 1000, using a database with standard BytewiseComparator
LMDB has explicit support for integer keys (and values, if you're using sorted duplicates). http://symas.com/mdb
When a DB is configured for integer keys, the key comparison functions are also much faster since they can compare word-at-a-time instead of just byte-at-a-time as the default string-oriented compare does.
Disclaimer: I am the author of LMDB. Of course, that doesn't make the facts any different.