Key-value database that only serves reads, and does so very fast?

Posted on 2024-10-19 07:52:16

We are working on a project in Scala, and I need a key-value database (like a map) that mainly serves read operations and does so really fast:

  • No exotic query support or complex retrieval logic of any kind: just give the key and get the value, like a map. No conditions, no joins, nothing. Key -> Value.
  • The value, by the way, is itself a map of lists of strings or something like that, meaning it is a little lengthy (if that matters at all).
  • We use it just for reading. No writing, except for the initial population of the db and some very rare updates, which could perhaps be handled outside of the db.

I've been directed towards MongoDB and MemcacheDB, but MongoDB is good at queries (which adds no value for me) and MemcacheDB is all about distribution (not a concern in my project). So far I'm thinking of using an RDBMS (e.g. MySQL), but perhaps there are better options in the land of NoSQL?

Comments (7)

朕就是辣么酷 2024-10-26 07:52:16

An alternative would be to just use a flat file: the data sounds relatively simple, and you don't have to write to the file often. There also seems to be an open-source Scala implementation of memcached, which would make access very fast: https://github.com/victori/smemcached
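The flat-file idea could be sketched roughly as follows, assuming Scala 2.13 or later and a file with one `key<TAB>value` record per line. The object name `FlatFileStore`, the tab-separated layout, and the UTF-8 encoding are illustrative assumptions, not something specified in the question.

```scala
import scala.io.Source
import scala.util.Using

object FlatFileStore {
  // Parses lines of the form "key<TAB>value".
  // Lines without a tab are skipped; only the first tab separates key from value.
  def parse(lines: Iterator[String]): Map[String, String] =
    lines.flatMap { line =>
      line.split("\t", 2) match {
        case Array(k, v) => Some(k -> v)
        case _           => None
      }
    }.toMap

  // Reads the whole file once (e.g. at startup) and closes it afterwards.
  // Every later lookup is a plain in-memory immutable Map read.
  def load(path: String): Map[String, String] =
    Using.resource(Source.fromFile(path, "UTF-8"))(src => parse(src.getLines()))
}
```

This only makes sense if the whole data set fits comfortably in memory; a rare update would then amount to rewriting the file and calling `load` again.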

完美的未来在梦里 2024-10-26 07:52:16

I would suggest SQLite or Berkeley DB (which has a SQLite-compatible SQL API). Both are simple, embedded database libraries -- they link into your application, so there is no requirement for a separate server. They are both very fast at running queries. Berkeley DB has better scalability for very large databases. If you're interested in using a key-value pair API (NoSQL), Berkeley DB has that API as well.

Good luck in your search.

甚是思念 2024-10-26 07:52:16

I would suggest you take a look at Kyoto Cabinet. I'm in the process of writing some Scala wrappers around it, allowing you to access it as a plain old vanilla Scala Map. I haven't run a benchmark myself yet, but according to the benchmarks out there, it's faster than Berkeley DB. (However, it may be too early to tell, since there is no documentation on the overhead of the Java integration.)

Check the JavaDoc APIs here. I have been toying with it on the REPL, and it worked fine.

Here's some proof from the REPL that it works:

$ scala -Djava.library.path=/usr/local/lib
Welcome to Scala version 2.8.0.final (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_15).
Type in expressions to have them evaluated.
Type :help for more information.

scala> :cp /Users/wilfred/.m2/repository/com/fallabs/kyotocabinet/1.15/kyotocabinet-1.15.jar
Added '/Users/wilfred/.m2/repository/com/fallabs/kyotocabinet/1.15/kyotocabinet-1.15.jar'.  Your new classpath is:
.:/Users/wilfred/.m2/repository/com/fallabs/kyotocabinet/1.15/kyotocabinet-1.15.jar

scala> import kyotocabinet._                                                                
import kyotocabinet._

scala> val db = new DB()                                                                    
db: kyotocabinet.DB = (null): -1: -1

scala> db.open("casket.kch", DB.OWRITER | DB.OCREATE)
res0: Boolean = true

scala> db.set("foo", "bar")
res1: Boolean = true

scala> db.get("foo")
res2: java.lang.String = bar
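
A map-like, read-only view over such a store might look like the sketch below. It is not the wrapper mentioned above: `ReadOnlyStore` and `rawGet` are hypothetical names, and the sketch assumes the underlying Java-style lookup returns null when a key is missing.

```scala
// Hypothetical read-only view over a Java-style lookup function.
// `rawGet` stands in for a call such as the db.get shown in the session above
// and is assumed to return null on a miss.
final class ReadOnlyStore(rawGet: String => String) {
  // Option(...) turns a null result into None.
  def get(key: String): Option[String] = Option(rawGet(key))

  def getOrElse(key: String, default: => String): String =
    get(key).getOrElse(default)

  def contains(key: String): Boolean = get(key).isDefined
}
```

With the `db` value from the session, one would presumably construct it as `new ReadOnlyStore(key => db.get(key))`. Keeping the wrapper free of any write methods matches the read-only usage described in the question.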

吾性傲以野 2024-10-26 07:52:16

Chronicle Map is a pure Java embeddable, persistent key-value store.

PalDB is a write-once, embeddable, persistent key-value store for Java.

对你再特殊 2024-10-26 07:52:16

MongoDB would probably be an easy solution for this.

http://www.mongodb.org/display/DOCS/Benchmarks

峩卟喜欢 2024-10-26 07:52:16

MemcacheDB sounds like the right tool for the job, even if you do not need the distributed networking part (it requires no extra effort to simply leave that part unused).

Even better, Redis is supposed to be very fast and also has native support for storing data structures like lists or sets.

镜花水月 2024-10-26 07:52:16

I'd recommend CDB (Constant Data Base). It has a few advantages:

  • Fast lookups: A successful lookup in a large database normally takes just two disk accesses. An unsuccessful lookup takes only one.
  • Low overhead: A database uses 2048 bytes, plus 24 bytes per record, plus the space for keys and data.
  • No random limits: cdb can handle any database up to 4 gigabytes. There are no other restrictions; records don't even have to fit into memory. Databases are stored in a machine-independent format.
  • Fast atomic database replacement: cdbmake can rewrite an entire database two orders of magnitude faster than other hashing packages.
  • Fast database dumps: cdbdump prints the contents of a database in cdbmake-compatible format.

The only problem is that it is limited to database sizes of 4 GB.
If you need more data, there are 64-bit versions (cdb64 in Go, or python-pure-cdb in Python) that can read database files of up to 16 exabytes.
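
To check whether a data set would stay under that limit, the overhead figures quoted above (2048 bytes, plus 24 bytes per record, plus key and data) can be turned into a rough estimate. The sketch below is only that arithmetic; the names are made up for illustration, and the 4 GB ceiling is taken here as 4 GiB.

```scala
object CdbSizing {
  val HeaderBytes: Long     = 2048L                   // fixed overhead quoted above
  val PerRecordBytes: Long  = 24L                     // per-record overhead quoted above
  val MaxClassicBytes: Long = 4L * 1024 * 1024 * 1024 // 4 GB ceiling, assumed to mean 4 GiB

  // Estimated file size: header + per-record overhead + key and data bytes.
  def estimateBytes(records: Long, avgKeyBytes: Long, avgValueBytes: Long): Long =
    HeaderBytes + records * (PerRecordBytes + avgKeyBytes + avgValueBytes)

  // True if the estimate stays within the classic (32-bit) cdb limit.
  def fitsClassicCdb(records: Long, avgKeyBytes: Long, avgValueBytes: Long): Boolean =
    estimateBytes(records, avgKeyBytes, avgValueBytes) <= MaxClassicBytes
}
```

For example, one million records with 16-byte keys and 200-byte values come to 2048 + 1,000,000 × 240 = 240,002,048 bytes, well inside the limit.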
