用于可变长度记录存储和仅在主键上搜索的磁盘查找的数据结构/算法

发布于 2024-07-22 05:32:24 字数 649 浏览 5 评论 0原文

我正在寻找一种适用于基于大型块的设备(例如机械硬盘驱动器)的算法/数据结构,该结构针对插入、获取、更新和删除进行了优化,其中搜索始终使用数据的 id 以及数据的位置进行任何 ID 的字段都有可变长度。

B-Tree 似乎是一种常用的结构,但主要用于固定长度的记录。 我还期望获取和更新比插入和删除要多得多。 我可以摆脱 B 树的 O(log m) 查找吗?

我很高兴它是一个组合系统,例如 ISAM 组合了 B 树和线性文件存储,看起来它可以作为一种方法与可变长度记录一起使用。 还有更好的吗?

一些进一步的限制:

1)ID 可能是稀疏的,但它们可以以线性数字块的形式出现 - 但范围很大(64 位)

2)我不想使用 DBMS,我的特定问题的性能还没有提高事实证明不是很好。 我不需要完整的 DBMS 使用的任何操作,我不需要搜索。 我需要一些可以轻松调整和优化的东西。 称之为学术好奇心,如果 MySQL 无法胜任,那么我会使用它,但我必须尝试更快。

3)数据集比内存大,但是如果索引像键、偏移量一样简单,那么索引可能很适合内存。 我当然会在存储中查看大约 10 亿个或更多的实体。

4) 理想情况下,删除记录时应恢复空间。 这可能是通过压缩实现的,但我有兴趣看看是否有更好的方法(例如,B 树可以轻松恢复空间)。

I am looking for an algorithm / data structure that works well for large block based devices (eg a mechanical hard drive) which is optimised for insert, get, update and delete where searches are always done using the id of the data and where the data fields for any ID have a variable length.

The B-Tree seems to be a commonly quoted structure but mainly for fixed length records. I also expect dramatically more gets and updates than I do inserts and deletes. Can I get rid of the O(log m) lookup of the B-tree?

I am quite happy for it to be a combined system, for instance ISAM combines a B-tree and linear file storage which looks like it can be made to work with variable length records as an approach. Is there something better?

Some further constraints:

1) IDs are potentially sparse but they can be made to come in blocks of linear numbers - but in a large range (64 bit)

2) I don't want to use a DBMS, performance for my particular problem hasn't proved very good. I don't need any of the operations a full DBMS uses, I don't need search. I need something I can tweak and optimise easily. Call it academic curiosity, if it is out performed by MySQL then I'll use that but I have to try and go faster.

3) The dataset is larger than can fit in memory, the index however may well fit into memory if its as simple as key, offset. I am certainly looking at something like 1 billion entities or more in storage.

4) Ideally space should be recovered when a record is deleted. That may be via compaction but I am interested to see if there is a better way (A B-tree for instance recovers space easily).

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

旧城烟雨 2024-07-29 05:32:24

简单的方法:使用 Berkeley DB 之类的东西。 它为任意字节字符串提供键值存储,并为您完成所有艰苦的工作。 如果您需要的话,它甚至还提供用于索引的“辅助数据库”。

DIY 方式:使用 Protocol Buffers(或您选择的二进制格式)来定义 B 树节点和数据项结构。 对数据库使用仅附加文件。 要写入新记录或修改现有记录,只需将记录本身写入文件末尾,然后写入任何修改的 B-Tree 节点(例如,记录的父节点、父节点) ,依此类推直到根)。 然后,将树的新根的位置写入文件开头的标头块。 要读取该文件,您只需找到最近的根节点并像读取任何其他文件一样读取 B 树。 这种方法有几个优点:

  • 由于写入的数据永远不会被修改,因此读取器不需要锁定,并根据开始读取时的根节点获得数据库的“快照”视图。
  • 通过将“先前版本”字段添加到节点和记录中,您基本上可以免费访问数据库的先前版本。
  • 与大多数支持修改的磁盘文件格式相比,它确实很容易实现和调试。
  • 压缩数据库只需读出最新版本的数据和 B-Tree 并将其写入新文件。

The easy way: Use something like Berkeley DB. It provides a key-value store for arbitrary byte strings, and does all the hard work for you. It even provides 'secondary databases' for indexing, if you want it.

The do-it-yourself way: Use Protocol Buffers (or the binary format of your choice) to define B-Tree node and data item structures. Use an append-only file for your database. To write a new record or modify an existing record, you simply write the record itself to the end of the file, then write any modified B-Tree nodes (eg, the record's parent node, its parent node, and so forth up to the root). Then, write the location of the new root of the tree to the header block at the beginning of the file. To read the file, you simply find the most recent root node and read the B-Tree like you would in any other file. This approach has several advantages:

  • Since written data is never modified, readers don't need to take locks, and get a 'snapshot' view of the DB based on the root node at the time they started reading.
  • By adding 'previous version' fields to your nodes and records, you get the ability to access previous versions of the DB essentially for free.
  • It's really easy to implement and debug compared to most on-disk file formats that support modification.
  • Compacting the database consists of simply reading out the latest version of the data and B-Tree and writing it to a new file.
地狱即天堂 2024-07-29 05:32:24

如果您的 ID 是数字并且不是很稀疏,一种选择是在一个文件中使用一个简单的(偏移量、长度)表,引用另一个文件中的数据。 这将使您的查找时间复杂度为 O(1),并且更新/插入/删除仅受您的自由空间跟踪机制的约束。

If your IDs are numbers and not very sparse, one option would be to use a simple table of (offset, length) in one file, referencing the data in another file. This would get you O(1) lookup, and updates/inserts/deletes bound only by your free-space tracking mechanism.

淡写薰衣草的香 2024-07-29 05:32:24

最好的办法是使用商业数据库引擎。

您可以通过将索引(即,{“逻辑 ID”映射到“物理位置”}值对)存储在哈希映射中(对逻辑 ID 进行哈希)来消除 B 树的任何 O(log m) 查找...或者,甚至按照 bdonlan 的建议,如果 ID 值不稀疏,则将索引存储在连续向量中(将逻辑 ID 用作偏移值向量的索引)。

一个重要的实现细节可能是您用来访问索引的 API:是否将其存储在 RAM(操作系统通过系统页面文件支持)中并使用指针在进程内访问它,和/或将其存储在磁盘(O/S 缓存在文件系统缓存中)并使用文件 I/O API 访问它。

Best might be to use a commercial database engine.

You might get rid of any O(log m) lookup of a B-tree by storing the index, i.e. the {"logical ID" maps to "physical location"} value pairs, in a hash map (hashing on the logical ID) ... or, even, storing the index in a contiguous vector (with the logical ID used as an index into the vector of offset values), as bdonlan suggested, if the ID values aren't sparse.

An impotant implementation detail might be what the API you use to access the index: whether you store it in RAM (which the O/S backs with the system page file) and access it in-process using pointers, and/or store it on disk (which the O/S caches in the file system cache) and access it using file I/O APIs.

萝莉病 2024-07-29 05:32:24

如果数据库对您来说太重,请考虑键值存储。

如果您确实要自己实现它,请使用基于磁盘的哈希表或 B 树。 为了避免可变长度值的问题,请将值存储在单独的文件中并使用 B 树作为数据文件的索引。 删除值后的空间回收将很棘手,但这是可能的(例如,通过数据文件中释放空间的位集)。

If a database is to heavy weight for you, consider a key-value store.

If you really what to implement it yourself, use a disk-based hash table or a B-tree. To avoid the problems with the variable-length values, store the values in an separate file and use the B-tree as index for the data file. Space-reclaimation after deletion of values will be tricky, but it is possible (e.g. by a bitset for freed space in the data file).

如果没有 2024-07-29 05:32:24

索引可变长度记录文件。

索引可变长度记录文件乍一看可能是一项艰巨的任务,但一旦您确定了哪些是移动部件,它实际上就非常简单了。

为此,您应该以固定大小的块(即 128、256 或 512 等)读取文件。 您的记录还应该具有易于识别的记录结束字符。 这意味着该字符不能作为常规字符出现在您的记录中。

接下来,您扫描文件,查找每条记录的开头,创建具有以下结构的索引文件:

key, 0, 0
........
........
key, block, offset

这里是您要为文件建立索引的键(字段) on(可以是复合的)。 block 是记录开始的(0-based)块号,offset 是(0-based) >) 记录开头相对于块开头的偏移量。 根据您使用的块大小,您的记录可能跨越多个块。因此,一旦找到记录的开头,您就需要根据需要获取尽可能多的连续块检索整个记录。

如果您需要搜索不同的条件,完全可以同时创建多个索引文件。

创建此索引文件后,下一步是按关键字段对其进行排序。 或者,您可以设置插入排序机制来保留索引文件按照创建时的顺序排序

使用类似文件查找检索按该键排序的记录 函数,在索引文件上查找键并检索其记录偏移量对。 二分搜索在这种情况下似乎表现得很好,但你可以使用任何类型。

这种结构允许您的数据库接受记录添加删除更新。 在文件末尾进行添加,将其密钥添加到索引文件中。 要删除记录,只需将记录的第一个字符更改为唯一字符(例如0x0),然后从索引文件中删除该条目即可。 可以通过删除然后在文件末尾添加更新的记录来实现更新。

如果您打算处理非常大的文件,则使用 B-Tree 类似的索引结构可能会有很大帮助,因为 B-Tree 索引不需要在内存中完整加载。 B-Tree 算法进一步将索引文件划分为 页面然后根据需要加载到内存中。

Indexing variable-length-record files.

Indexing variable-length record files may look at first like a daunting task but it is really pretty straightforward once you identify which are the moving parts.

To do this you should read your file in blocks of fixed size (ie. 128, 256 or 512 etc). Your records also should have an easily identifiable end-of-record character. Meaning this character cannot appear as a regular character inside your records.

Next thing is you scan your file looking for the beginning of each record creating an index file with the following structure:

key, 0, 0
........
........
key, block, offset

Here key is the key (field) you are indexing your file on (can be a composite one). block is the (0-based) block number the record starts at, and offset is the (0-based) offset of the beginning of the record from the beginning of the block. Depending of the block size you use, your records may span more than one block. So once you locate the beginning of a record you need to fetch as many consecutive blocks as necessary to retrieve the whole record.

It is perfectly possible to create multiple index files at the same time if you need to search for different criteria.

Once you have created this index file, the next step is to sort it by the key field. Alternatively you can device an insertion sort mechanism which keeps your index file sorted as it is been created.

Retrieve your record sorted by that key using a file seek-like function, looking up the key on the index file and retrieving its record-offset pair. Binary search seems to perform pretty well in this scenario but you can use any type.

This structure allows your database to accept record additions, deletions and updates. Additions are made at the end of the file, adding its key to the index file. To delete a record, just change the first character of the record with a unique character like 0x0 and delete the entry from the index file. Updates can be achieved by deleting and then adding the updated record at the end of the file.

If you plan to deal with very large files then using a B-Tree like structure for your indexes may be of a great help as B-Trees indexes do not need to be loaded complete in memory. The B-Tree algorithm further divides the index file into pages which are then loaded in memory as needed.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文