Storing a large number of attachments in a single CouchDB document

Posted on 2024-12-18 13:06:20

tl;dr: Should I store directories in CouchDB as a list of attachments, or as a single tarball?

I've been using CouchDB to store project documents. I just create documents via Futon and upload files directly from there. I've also written a script to bulk-upload directories. I am using it as a basic content repository. I replicate it, so other people on my team have a copy of the repository.

I noticed that saving directories as a series of files seems to have a lot of storage overhead, so instead I upload a .tar.gz file containing the directory. This does significantly reduce the size of the document, but now any change to the directory requires replicating the entire tarball.

I am looking for thoughts or perspective on the matter.

Comments (1)

桃扇骨 2024-12-25 13:06:20

It really depends on what you want to achieve. I will try to provide some options for you to consider.

Storing one tar.gz will save you space, but it does make it harder to work with. If you are simply archiving it may work for you.
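To make the archive option concrete, here is a minimal Python sketch of packing a directory into an in-memory .tar.gz that could then be uploaded as a single attachment. The function name `directory_to_targz` is hypothetical, and the upload step is left out.

```python
import io
import tarfile
from pathlib import Path


def directory_to_targz(root):
    """Pack the directory `root` into a gzip-compressed tar archive, returned as bytes."""
    root = Path(root)
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        # arcname keeps only the directory's own name inside the archive,
        # not the absolute path on the machine that built it.
        tar.add(str(root), arcname=root.name)
    return buf.getvalue()
```

Any change to a file inside the directory yields a new archive, which is why the whole tarball has to be replicated again.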

Storing all the attachments on one document works well for couchapps. The workflow is that you work on the attachments until you are ready to release the application, so replication adds little overhead because it usually happens only once. It is nice that they are on one document, because they all move and replicate as one bundle. The downsides of this approach for a content management system are that you can accumulate a lot of history baggage that you have to compact on your local couch. You will also get a lot of conflicts during replication between couches, and couch will keep conflicts around for you to resolve. Therefore, if you choose this model, you should compact frequently to reduce disk size.
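If you go with this model, the housekeeping could look roughly like the sketch below. It builds a request for the database's `_compact` endpoint and a URL that asks for a document together with its `_conflicts` list. The helper names and the database URL in the usage note are assumptions for illustration, and authentication is not handled.

```python
import urllib.parse
import urllib.request


def compaction_request(db_url):
    """Build (but do not send) a POST to the database's _compact endpoint."""
    return urllib.request.Request(
        db_url.rstrip("/") + "/_compact",
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def conflicts_url(db_url, doc_id):
    """URL that fetches a document along with its _conflicts list, if it has one."""
    quoted_id = urllib.parse.quote(doc_id, safe="")
    return db_url.rstrip("/") + "/" + quoted_id + "?conflicts=true"
```

Sending it would be something like `urllib.request.urlopen(compaction_request("http://localhost:5984/projects"))`, where the URL is a placeholder. Compaction normally requires admin credentials.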

For a content management system, I might recommend using one document per attachment. That would give you fewer conflicts. There will be a slight overhead, as each doc has some space allocated for the doc itself, but the savings from not having to do frequent compaction and/or conflict resolution will outweigh it.
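As a rough sketch of the one-document-per-attachment layout, the Python below walks a directory and builds one document per file, each carrying that file as a single inline (base64) attachment, ready to send to `_bulk_docs`. The `file:` ID prefix, the `path` field, and the function names are my own assumptions, not anything CouchDB requires. Authentication is not handled.

```python
import base64
import json
import mimetypes
import urllib.request
from pathlib import Path


def docs_for_directory(root):
    """Build one CouchDB document per file under `root`, each with one inline attachment."""
    root = Path(root)
    docs = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        docs.append({
            "_id": "file:" + rel,  # hypothetical ID scheme: prefix + relative path
            "path": rel,
            "_attachments": {
                path.name: {
                    "content_type": content_type,
                    "data": base64.b64encode(path.read_bytes()).decode("ascii"),
                }
            },
        })
    return docs


def bulk_upload(db_url, docs):
    """POST the documents to the database's _bulk_docs endpoint and return the parsed reply."""
    body = json.dumps({"docs": docs}).encode("utf-8")
    req = urllib.request.Request(
        db_url.rstrip("/") + "/_bulk_docs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With this layout, a change to one file touches only that file's document, so replication moves just that document rather than the whole bundle. Updating an existing document would also need its current `_rev`, which this sketch does not track.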

Hope that gives you some options to weigh.
