How do I implement a B+ tree for a file system?
I have a text file that contains extent information for all the files in a file system, like this:
C:\Program Files\abcd.txt
12345 100
23456 200
C:\Program Files\bcde.txt
56789 50
26746 300
...
Now I have another binary that needs to look up the extents for all of these files.
Currently I use a linear search over the text file above to find the extent information for each file, which is time-consuming. Is there a better way to code this, such as implementing a suitable data structure like a B-tree? If I use a B+ tree, what key and branching factor should I use?
Comments (2)
Use a database.
The key points in implementing a tree in a file are to have fixed record lengths and to use file offsets instead of pointers.
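To make the point about fixed record lengths and file offsets concrete, here is a minimal sketch. It is deliberately a plain binary search tree rather than a B+ tree, so that it stays short. The record layout (`key`, `value`, `left`, `right`, all 64-bit integers), the `NIL` marker, and the function names are assumptions made for illustration, and `io.BytesIO` stands in for a real file opened in binary mode. String keys such as paths would need padding or hashing to fit a fixed-length slot, which this sketch does not cover.

```python
import io
import struct

# Hypothetical fixed-length record: key, value, left child offset, right child offset.
NODE = struct.Struct("<qqqq")
NIL = -1  # plays the role of a null pointer: no child here


def read_node(f, offset):
    """Read the record that starts at the given byte offset."""
    f.seek(offset)
    return NODE.unpack(f.read(NODE.size))


def append_node(f, key, value):
    """Append a leaf record and return the offset it was written at."""
    f.seek(0, io.SEEK_END)
    offset = f.tell()
    f.write(NODE.pack(key, value, NIL, NIL))
    return offset


def insert(f, key, value):
    f.seek(0, io.SEEK_END)
    if f.tell() == 0:  # empty file: the root always lives at offset 0
        append_node(f, key, value)
        return
    offset = 0
    while True:
        k, v, left, right = read_node(f, offset)
        if key == k:  # existing key: overwrite the value in place
            f.seek(offset)
            f.write(NODE.pack(k, value, left, right))
            return
        child = left if key < k else right
        if child == NIL:  # free slot: append the new node, then link it by offset
            new_offset = append_node(f, key, value)
            if key < k:
                left = new_offset
            else:
                right = new_offset
            f.seek(offset)
            f.write(NODE.pack(k, v, left, right))
            return
        offset = child


def lookup(f, key):
    f.seek(0, io.SEEK_END)
    if f.tell() == 0:
        return None
    offset = 0
    while offset != NIL:
        k, v, left, right = read_node(f, offset)
        if key == k:
            return v
        offset = left if key < k else right
    return None


f = io.BytesIO()  # any seekable binary file object would work the same way
for key, value in [(50, 500), (20, 200), (70, 700), (60, 600)]:
    insert(f, key, value)

print(lookup(f, 60))  # 600
print(lookup(f, 99))  # None
```

Because every record has the same size, a child "pointer" is just the byte offset of another record, and following it costs one seek and one read.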
Use a database. Hmmm, SQLite. Another point to consider with files is that reading chunks of data is faster than reading individual items (regardless of whether the hard disk or the OS has a cache). I implemented a B+ tree that uses pages as its nodes.
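As a rough illustration of reading in chunks, the sketch below fetches one whole page with a single read and unpacks every entry in it. The 4096-byte page size, the `(key, value)` entry layout, and the names `ENTRY` and `read_page` are assumptions for the example, not details of the implementation mentioned above.

```python
import io
import struct

PAGE_SIZE = 4096               # assumed page size; match it to your disk or OS block size
ENTRY = struct.Struct("<qq")   # hypothetical fixed-length entry: (key, value)
ENTRIES_PER_PAGE = PAGE_SIZE // ENTRY.size


def read_page(f, page_no):
    """Fetch one whole page with a single read and unpack every entry in it."""
    f.seek(page_no * PAGE_SIZE)
    page = f.read(PAGE_SIZE)
    return list(ENTRY.iter_unpack(page))


# Build a two-page file of sorted entries for the demonstration.
f = io.BytesIO()
for key in range(2 * ENTRIES_PER_PAGE):
    f.write(ENTRY.pack(key, key * 10))

entries = read_page(f, 1)      # one seek and one read for a full page of entries
print(len(entries), entries[0])
```

With this layout, the number of entries that fit in a page is one plausible starting point for a node's branching factor, since each node visit then costs a single page read.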
Use a database. Databases have already been written and tested.
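For example, a minimal sketch with Python's built-in `sqlite3` module might look like the following. The table name, column names, and index name are made up for illustration, and the reading of the two numbers on each extent line as a start and a length is an assumption, since the question does not say what they mean. SQLite stores its indexes as B-trees, so the index on `path` gives you the tree lookup without writing one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # pass a file name instead to keep the index on disk
conn.execute(
    "CREATE TABLE extents ("
    " path TEXT NOT NULL,"
    " ext_start INTEGER NOT NULL,"   # assumed meaning of the first number
    " ext_length INTEGER NOT NULL)"  # assumed meaning of the second number
)
conn.execute("CREATE INDEX idx_extents_path ON extents (path)")

rows = [
    (r"C:\Program Files\abcd.txt", 12345, 100),
    (r"C:\Program Files\abcd.txt", 23456, 200),
    (r"C:\Program Files\bcde.txt", 56789, 50),
    (r"C:\Program Files\bcde.txt", 26746, 300),
]
conn.executemany("INSERT INTO extents VALUES (?, ?, ?)", rows)
conn.commit()


def lookup(path):
    """Return the extents recorded for one file, in insertion order."""
    cur = conn.execute(
        "SELECT ext_start, ext_length FROM extents WHERE path = ? ORDER BY rowid",
        (path,),
    )
    return cur.fetchall()


print(lookup(r"C:\Program Files\abcd.txt"))  # [(12345, 100), (23456, 200)]
```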
A more efficient design is to keep the initial node in memory. This reduces the number of fetches from the file. If your program has the space, keeping the first couple of levels in memory may also speed up execution.
Use a database.
I gave up writing a B-tree implementation for my application because I wanted to concentrate on the program's other functionality. I later learned that in the real world (the world where programs need to be finished on schedule), time should be spent on the 'core' of the application rather than on accessories that have already been written and tested (a.k.a. off-the-shelf).
It depends on how you want to search your file. I assume you want to look up the info for a given file name. In that case, a hash table or a trie would be a good data structure to use.
A B-tree is possible, but it is not the most convenient choice given that your keys are strings.
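The hash-table suggestion can be sketched as below, assuming the whole listing fits in memory. The function name `parse_extents` is hypothetical, and the rule used to tell the two kinds of line apart (a line made of exactly two integers is an extent, anything else is a file name) is an assumption based on the sample in the question.

```python
from collections import defaultdict


def parse_extents(lines):
    """Build {path: [(first, second), ...]} from the listing format in the question."""
    index = defaultdict(list)
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        parts = line.split()
        # Assumption: an extent line is exactly two integers; anything else is a path.
        if len(parts) == 2 and all(p.isdecimal() for p in parts):
            if current is None:
                raise ValueError("extent line found before any file name")
            index[current].append((int(parts[0]), int(parts[1])))
        else:
            current = line
            index[current]  # touch the key so files without extents are still listed
    return dict(index)


sample = [
    r"C:\Program Files\abcd.txt",
    "12345 100",
    "23456 200",
    r"C:\Program Files\bcde.txt",
    "56789 50",
    "26746 300",
]
extents = parse_extents(sample)
print(extents[r"C:\Program Files\abcd.txt"])  # [(12345, 100), (23456, 200)]
```

The file is parsed once, and each later lookup is a single dictionary access on average instead of a scan of the whole text file.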