Database for storing large documents
Can anyone suggest a database solution for storing large documents which will have multiple branched revisions? Partial edits of content should be possible without having to update the entire document.
I was looking at XML databases and wondering about the suitability of them, or maybe even using a DVCS (like Mercurial).
It should preferably have Python bindings.
Comments (2)
Try Fossil: it has a good delta-encoding algorithm and keeps every version. It's backed by a single SQLite database and has both a web-based and a command-line UI.
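To make the idea concrete, here is a rough sketch of delta-encoded, branched revisions kept in a single SQLite file. It is not Fossil's actual delta algorithm or schema; the `revisions` table, the `RevisionStore` class, and the copy/insert delta format are all made up for illustration, using only Python's standard `sqlite3` and `difflib` modules.

```python
import json
import sqlite3
from difflib import SequenceMatcher


def make_delta(old: str, new: str) -> str:
    """Encode `new` as copy/insert operations against `old` (illustrative format)."""
    ops = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(["copy", i1, i2])        # reuse a slice of the parent text
        elif tag in ("replace", "insert"):
            ops.append(["insert", new[j1:j2]])  # store only the new characters
        # "delete" needs no operation: the old slice is simply not copied
    return json.dumps(ops)


def apply_delta(old: str, delta: str) -> str:
    """Rebuild the child text from the parent text and a delta."""
    parts = []
    for op in json.loads(delta):
        if op[0] == "copy":
            parts.append(old[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


class RevisionStore:
    """Hypothetical store: root revisions hold full text, children hold deltas."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS revisions ("
            " id INTEGER PRIMARY KEY,"
            " parent INTEGER REFERENCES revisions(id),"
            " payload TEXT NOT NULL)"
        )

    def commit(self, text: str, parent=None) -> int:
        """Add a revision; committing twice on the same parent creates a branch."""
        if parent is None:
            payload = text
        else:
            payload = make_delta(self.checkout(parent), text)
        cur = self.db.execute(
            "INSERT INTO revisions (parent, payload) VALUES (?, ?)",
            (parent, payload),
        )
        self.db.commit()
        return cur.lastrowid

    def checkout(self, rev_id: int) -> str:
        """Walk up to the root, then replay deltas back down to `rev_id`."""
        chain = []
        while rev_id is not None:
            row = self.db.execute(
                "SELECT parent, payload FROM revisions WHERE id = ?", (rev_id,)
            ).fetchone()
            if row is None:
                raise KeyError(rev_id)
            chain.append(row[1])
            rev_id = row[0]
        text = chain.pop()  # the root stores the full document
        for delta in reversed(chain):
            text = apply_delta(text, delta)
        return text


store = RevisionStore()
root = store.commit("chapter one\nchapter two\n")
branch_a = store.commit("chapter one\nchapter 2 (edited)\n", parent=root)
branch_b = store.commit("chapter one\nchapter two\nchapter three\n", parent=root)
print(store.checkout(branch_a))
print(store.checkout(branch_b))
```

Each child row holds only what changed relative to its parent, so a partial edit does not rewrite the whole document, and two commits on the same parent give you two branches. A character-level `SequenceMatcher` gets slow on very large texts, so a real system would diff at line or block granularity.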
This depends on your storage behavior and use case. If you plan to store a massive number of document revisions, keep the historical versions, and can live with a write-once-read-many pattern, look into something like Hadoop HDFS. It takes a lot of (cheap) infrastructure to run the cluster, but you can keep adding revisions and data over time and look them up with MapReduce jobs.
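As a rough illustration of the lookup side, the sketch below runs the map, shuffle, and reduce steps in plain Python to find the newest revision on each branch of each document. It does not use any Hadoop API; the record layout (`doc_id`, `branch`, `revision`, `location`) and the sample paths are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical append-only revision log: one immutable record per stored revision.
RECORDS = [
    ("report", "main", 1, "/revisions/report/main/1"),
    ("report", "main", 2, "/revisions/report/main/2"),
    ("report", "draft", 1, "/revisions/report/draft/1"),
    ("manual", "main", 1, "/revisions/manual/main/1"),
]


def map_phase(record):
    """Emit one (key, value) pair per record, keyed by document and branch."""
    doc_id, branch, revision, location = record
    yield (doc_id, branch), (revision, location)


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Keep the value with the highest revision number for each key."""
    return key, max(values)


def latest_revisions(records):
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


for (doc_id, branch), (revision, location) in sorted(latest_revisions(RECORDS).items()):
    print(doc_id, branch, revision, location)
```

Because records are only ever appended, an edit shows up as a new revision record rather than a change to an existing one, which is what the write-once-read-many pattern implies.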