FILESYSTEM vs. SQLITE, while storing up to 10M files

Posted on 2024-09-25 02:32:40

I would like to store up to 10M files on a 2TB storage unit. The only properties I need are the file names and their contents (data).

The maximum file size is 100MB, and most files are smaller than 1MB. I need to be able to delete files, and both write and read speed are priorities; storage efficiency, recovery, and integrity mechanisms are not needed.

I considered NTFS, but most of its features are unnecessary for me, cannot be disabled, and look like overhead: creation date, modification date, attributes, the journal and, of course, permissions.

Since I don't need a filesystem's native features, would you suggest using SQLite for this requirement? Or is there an obvious disadvantage I should be aware of? (One would guess that removing files would be a complicated task?)

(SQLite would be used via the C API.)

My goal is to gain performance by using a better-suited solution. Thanks in advance - Doori Bar

鲜肉鲜肉永远不皱 2024-10-02 02:32:40

The official SQLite site includes a page documenting the performance benefits of using a database over a native filesystem on various operating systems. When storing files of about 10 KiB, SQLite is approximately 35% faster:

SQLite reads and writes small blobs (for example, thumbnail images)
35% faster¹ than the same blobs can be read from or written to
individual files on disk using fread() or fwrite().

Furthermore, a single SQLite database holding 10-kilobyte blobs uses
about 20% less disk space than storing the blobs in individual files.

The performance difference arises (we believe) because when working
from an SQLite database, the open() and close() system calls are
invoked only once, whereas open() and close() are invoked once for
each blob when using blobs stored in individual files. It appears that
the overhead of calling open() and close() is greater than the
overhead of using the database. The size reduction arises from the
fact that individual files are padded out to the next multiple of the
filesystem block size, whereas the blobs are packed more tightly into
an SQLite database.

The measurements in this article were made during the week of
2017-06-05 using a version of SQLite in between 3.19.2 and 3.20.0. You
may expect future versions of SQLite to perform even better.

You may see different results with larger files. The SQLite site includes a link to kvtest, which you can use to reproduce these results on your own hardware and operating system.
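For the question's requirements (a file name plus its contents, with deletion), a minimal sketch of a one-table blob store might look like the following. It uses Python's built-in sqlite3 module for brevity rather than the C API the question mentions; the SQL statements would be the same through sqlite3_prepare_v2(), sqlite3_bind_blob() and sqlite3_step(). The table name `files` and the helper names are assumptions made for illustration, not anything from the SQLite page.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a one-table blob store keyed by file name."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " name TEXT PRIMARY KEY,"
        " data BLOB NOT NULL)"
    )
    return conn


def put_file(conn, name, data):
    """Insert or overwrite one file; 'with conn' commits the transaction."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO files (name, data) VALUES (?, ?)",
            (name, data),
        )


def get_file(conn, name):
    """Return the stored bytes, or None if the name is unknown."""
    row = conn.execute(
        "SELECT data FROM files WHERE name = ?", (name,)
    ).fetchone()
    return None if row is None else row[0]


def delete_file(conn, name):
    """Delete one file; returns True if a row was actually removed."""
    with conn:
        cur = conn.execute("DELETE FROM files WHERE name = ?", (name,))
    return cur.rowcount > 0
```

Deleting is a single statement here, so it is not the complicated task the question feared. The pages it frees are reused by later inserts, but the database file itself does not shrink unless VACUUM is run or auto_vacuum is enabled. SQLite's default limit on a single BLOB is 1,000,000,000 bytes, so 100MB files fit, although the 35% figure quoted above was measured on blobs of about 10 KB only.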

梦毁影碎の 2024-10-02 02:32:40

If your main requirement is performance, go with the native file system. DBMSs are not well suited to handling large BLOBs, so SQLite is not an option for you at all (I don't even know why everybody considers SQLite a plug for every hole).

To improve the performance of NTFS (or whichever file system you choose), don't put all the files into a single folder; instead, group them by the first N characters of their file names, or by extension.

There are also other file systems on the market, and some of them may let you disable features you don't use. You can check the comparison on Wikipedia.

Correction: I ran some tests (not very extensive, though) that showed no performance benefit from grouping files into subdirectories for most types of operations; NTFS handled 26^4 empty files named AAAA to ZZZZ in a single directory quite efficiently. So you need to check the efficiency of your particular file system.
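The grouping idea can be sketched roughly as follows, in Python. The function names and the default of two prefix characters are assumptions made for illustration, and the code assumes plain file names with no path separators in them. Note that the answer's own correction reports no measurable benefit from this layout on NTFS.

```python
from pathlib import Path


def sharded_path(root, filename, prefix_len=2):
    """Map a file name to root/<first prefix_len chars, lower-cased>/<filename>."""
    prefix = filename[:prefix_len].lower() or "_"
    return Path(root) / prefix / filename


def store(root, filename, data, prefix_len=2):
    """Write data under its prefix folder, creating the folder on first use."""
    path = sharded_path(root, filename, prefix_len)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return path
```

With prefix_len=2, a file named ABCD.dat lands in the subfolder ab. A larger N gives more, smaller folders.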
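One way to run that kind of check yourself is a small timing harness like the sketch below. It is not the test the answer's author ran; the function names are hypothetical, and it times file creation only, so listing, reading and deleting would need their own loops.

```python
import itertools
import string
import tempfile
import time
from pathlib import Path


def make_names(length):
    """All upper-case names of the given length, e.g. AA..ZZ for length 2."""
    letters = string.ascii_uppercase
    return ["".join(t) for t in itertools.product(letters, repeat=length)]


def time_create(root, names, prefix_len):
    """Create one empty file per name; prefix_len=0 keeps everything in root."""
    root = Path(root)
    start = time.perf_counter()
    for name in names:
        folder = root / name[:prefix_len] if prefix_len else root
        folder.mkdir(exist_ok=True)  # called in both layouts, so costs match
        (folder / name).touch()
    return time.perf_counter() - start


def compare(length=3, prefix_len=1):
    """Time a flat layout against a sharded one, each in a fresh temp folder."""
    names = make_names(length)
    with tempfile.TemporaryDirectory() as flat, \
            tempfile.TemporaryDirectory() as sharded:
        return {
            "files": len(names),
            "flat_seconds": time_create(flat, names, 0),
            "sharded_seconds": time_create(sharded, names, prefix_len),
        }
```

Calling compare(length=4) uses the same 26^4 names AAAA to ZZZZ mentioned above. The temporary folders are created wherever the operating system puts them by default, so to test a specific volume you would need to point them at it.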
