Structure for a large volume of semi-persistent data?

Posted on 2024-11-18 09:25:24

I need to track a large volume of inotify messages for a set of files that, during their lifetime, will move between several specific directories with their inodes intact. I need to track the movement of these inodes, as well as file creation, deletion, and changes to file content. There will be many hundreds of changes per second.

Because of limited resources, I can't store it all in RAM (or on disk, or in a database).

Luckily, most of these files will be deleted in short order; their content history and movement history just need to be stored for later analysis. The files that are not deleted immediately will end up staying in a particular directory for a known period of time.

So it seems to me that I need a data structure that is partially stored in RAM, and partially saved to disk; part of the portion saved to disk will need to be recalled (the files not deleted), but most will not. I will not need to query the data, only access it by an identifier (the file name, which is [A-Z0-9]{8}). It would be helpful to be able to configure when the file data is flushed to disk.

Does such a beast exist?

Edit: I've asked a related question.

Comments (1)

冷情 2024-11-25 09:25:24

Why not a database? Say SQLite.

While SQLite isn't the most efficient storage mechanism in terms of space, it has a number of advantages -- first and foremost, it is an SQL RDBMS. The amount of memory SQLite uses (to temporarily cache data) can be configured through the cache_size pragma.
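
As a rough illustration of the SQLite route, here is a minimal Python sketch using the standard sqlite3 module. The table layout, the column names, and the helper names (open_store, record, history, forget) are my own assumptions, not anything from the question; only the cache_size pragma is the part referred to above.

```python
import sqlite3


def open_store(path, cache_kib=2048):
    """Open (or create) the event store with a bounded page cache."""
    conn = sqlite3.connect(path)
    # A negative cache_size is interpreted by SQLite as a size in KiB
    # rather than as a number of pages.
    conn.execute(f"PRAGMA cache_size = -{int(cache_kib)}")
    # Hypothetical schema: one row per inotify event, keyed by file name.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " name TEXT NOT NULL,"
        " ts REAL NOT NULL,"
        " kind TEXT NOT NULL,"
        " detail TEXT)"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS events_by_name ON events(name, ts)"
    )
    return conn


def record(conn, name, ts, kind, detail=None):
    """Append one event; nothing is durable until the caller commits."""
    conn.execute(
        "INSERT INTO events VALUES (?, ?, ?, ?)", (name, ts, kind, detail)
    )


def history(conn, name):
    """Return all events for one file, oldest first."""
    return conn.execute(
        "SELECT ts, kind, detail FROM events WHERE name = ? ORDER BY ts",
        (name,),
    ).fetchall()


def forget(conn, name):
    """Drop the history of a file that has been deleted."""
    conn.execute("DELETE FROM events WHERE name = ?", (name,))
```

Calling conn.commit() every N events or every few seconds is one way to control when data is flushed to disk; the right batch size would have to be measured against the actual event rate.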

If SQLite isn't an option, what about one of the "key value stores"? They range from distributed client/server in-memory stores (e.g. memcached) to local embedded disk-based stores (e.g. BDB) to memory-with-a-persistent-backing-for-overflow, and anywhere in between. They do not have the SQL DDL/DQL (although some might allow relationships), but are efficient at what they do -- store keys and values.
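
To give a feel for the embedded key-value option, here is a small sketch using Python's standard dbm module as a stand-in for something like BDB (it picks whichever backend is available on the system, so this is not BDB itself). Storing each file's history as a JSON list, and the helper names append_event, load_events and drop, are assumptions made for the example.

```python
import dbm
import json


def append_event(db, name, event):
    """Append one event to the JSON-encoded history stored under `name`."""
    key = name.encode("ascii")
    raw = db.get(key)
    events = json.loads(raw) if raw is not None else []
    events.append(event)
    db[key] = json.dumps(events).encode("utf-8")


def load_events(db, name):
    """Return the stored history for `name`, or an empty list."""
    raw = db.get(name.encode("ascii"))
    return json.loads(raw) if raw is not None else []


def drop(db, name):
    """Remove the history of a file that has been deleted."""
    key = name.encode("ascii")
    if key in db:
        del db[key]
```

Usage would look something like `with dbm.open("events", "c") as db: append_event(db, "AB12CD34", {"kind": "create"})`. Rewriting the whole list on every event is wasteful for long histories, so this only makes sense if most files are short-lived, as the question describes.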

Of course, one could always implement an LRU structure (say a basic sorted list with a limit) with overflow to a simple extensible disk-based hash implementation... but... consider the options above first :) [There may also be some micro-KV libraries/source out there].
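
For what the roll-your-own route might look like, here is a minimal sketch. It uses collections.OrderedDict instead of the sorted list mentioned above, and it spills to any dict-like backing object rather than a hand-written disk hash; the class name SpillingLRU and its methods are hypothetical.

```python
from collections import OrderedDict


class SpillingLRU:
    """Keep the most recently used entries in RAM, spill the rest."""

    def __init__(self, backing, max_items=1000):
        self.hot = OrderedDict()    # in-memory entries, oldest first
        self.backing = backing      # any dict-like store, e.g. on disk
        self.max_items = max_items

    def put(self, key, value):
        self.hot[key] = value
        self.hot.move_to_end(key)   # mark as most recently used
        while len(self.hot) > self.max_items:
            old_key, old_value = self.hot.popitem(last=False)
            self.backing[old_key] = old_value   # overflow to the backing store

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        value = self.backing[key]   # raises KeyError if the key is unknown
        del self.backing[key]
        self.put(key, value)        # recalled entries become hot again
        return value

    def delete(self, key):
        self.hot.pop(key, None)
        if key in self.backing:
            del self.backing[key]
```

With `shelve.open("overflow")` as the backing object this gives a RAM-plus-disk structure keyed by file name; a plain dict works for experimenting. The max_items limit is the knob that decides when data leaves RAM.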

Happy coding.
