Opinions on my data storage problem (database / home-brew solution)
I have very simply structured data which is currently stored in a home-brew file format, but I am wondering whether we should migrate to something more modern. The data is simply a table of doubles, indexed by a double column. The things I need to perform are:
- Iterating through the table.
- Insertion and deletion of arbitrary records.
- Selecting a given number of rows before and after a given key value (where the key might not be in the database).
The requirements are:
- The storage must be file-based without a server.
- It should not be necessary to read the whole file into memory.
- The resulting file should be portable between different architectures (wrt endian-ness...)
- Must be a very stable project (the data is highly critical).
- Must run on Solaris/SPARC and preferably also on Linux/x64.
- Access times should be as fast as possible.
- Must be available as a C++ library. Bonus points for Fortran and Python bindings :)
- Optional higher precision number representation than double precision would be a bonus.
- Relatively compact storage size would also be a bonus.
From my limited experience, SQLite would be an interesting choice, or perhaps MySQL in a non-server mode if SQLite is not fast enough. But perhaps a full-fledged SQL database is overkill?
What do you suggest?
Comments (1)
SQLite meets nearly all of your requirements, and it's not that hard to use. Give it a try!
It's file-based, and the entire database is a single file.
It does not need to read the entire file into memory. Database size might be limited; you should check SQLite's documented limits to see whether they will be a problem in your situation.
The format is cross-platform: "SQLite databases are portable across 32-bit and 64-bit machines and between big-endian and little-endian architectures."
It's been around for a long time, is used in many places, and is generally considered mature and stable.
It's very portable and runs on Solaris/SPARC and Linux/x64.
It's reportedly faster than MySQL or other such database servers (take the published comparisons with a grain of salt, though), because only one client needs to be taken into account.
There is a C/C++ API, a Python binding, and a Fortran wrapper.
There is no arbitrary-precision column type, but a NUMERIC column will silently store a value as text if it cannot be represented exactly: "For conversions between TEXT and REAL storage classes, SQLite considers the conversion to be lossless and reversible if the first 15 significant decimal digits of the number are preserved. If the lossless conversion of TEXT to INTEGER or REAL is not possible then the value is stored using the storage class TEXT."
As for compact storage of the database, I'm not sure. But I've never heard any claims that SQLite would be particularly wasteful.
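To make "give it a try" a bit more concrete, here is a minimal sketch of the three operations from the question, using Python's built-in sqlite3 module (the same SQL should work through the C/C++ interface). The table name `samples`, the columns `x` and `y`, and the helper `neighbours` are made up for illustration; a real table would presumably have one column per double in each record.

```python
import sqlite3


def open_table(path):
    """Open (or create) the database and a hypothetical two-column table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " x REAL PRIMARY KEY,"   # the double used as the lookup key
        " y REAL NOT NULL)"      # one data column, for illustration only
    )
    return conn


def neighbours(conn, key, n_before, n_after):
    """Return up to n_before rows with x < key and up to n_after rows
    with x >= key, in ascending order. The key need not be in the table."""
    before = conn.execute(
        "SELECT x, y FROM samples WHERE x < ? ORDER BY x DESC LIMIT ?",
        (key, n_before),
    ).fetchall()
    after = conn.execute(
        "SELECT x, y FROM samples WHERE x >= ? ORDER BY x ASC LIMIT ?",
        (key, n_after),
    ).fetchall()
    # 'before' came back in descending order, so reverse it.
    return before[::-1] + after


# ":memory:" keeps the demo self-contained; pass a file path for real use.
conn = open_table(":memory:")

with conn:  # one transaction, committed on success
    conn.executemany(
        "INSERT INTO samples VALUES (?, ?)",
        [(float(k), k * 10.0) for k in range(1, 8)],
    )
    conn.execute("DELETE FROM samples WHERE x = ?", (4.0,))

# Iterate through the whole table in key order.
all_rows = conn.execute("SELECT x, y FROM samples ORDER BY x").fetchall()

# Two rows before and two rows after a key that is not stored.
window = neighbours(conn, 4.5, 2, 2)
print(window)
```

Declaring `x` as the primary key gives SQLite an index on it, so both range queries can start from the key's position in the index rather than scanning the table. Whether a row whose key equals the search value counts as "after" (as it does here, via `>=`) is a choice you would adapt to your needs.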