What is the most efficient way to store multiple revisions of binary files in PostgreSQL?

Posted on 2024-10-30 05:37:31

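I'm looking for a limited form of version control inside a database:

Many revisions should take up as little space as possible (I'm not looking for compression, because the data is already compressed)
Size is of greatest importance; other requirements on the same file are secondary
I should be able to fetch the current revision of a document as fast as possible (fetching older revisions is not time critical)

Basically, an answer should contain at least two things:

What binary diff algorithm would you use?
How would you structure this system in a PostgreSQL-specific way?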


负佳期 2024-11-06 05:37:31

"Size is of greatest importance": how about calling an external binary diff tool (such as bsdiff) from inside the database, for example through PL/sh (http://plsh.projects.postgresql.org/)?

"I should be able to fetch the current revision of the document as fast as possible": In which case you will want to do your diff the 'wrong' way round, so each revision would involve:

replace 'previous revision' with diff between 'new revision' and 'previous revision'
add 'new revision'

To get back to an old revision would then require iteratively applying previous diffs as patches until you get to the revision you need.
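
The reverse-delta layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `make_delta`, `apply_delta` and `RevisionStore` are hypothetical names, the delta format is a simple copy/literal list built with `difflib.SequenceMatcher` as a stand-in for bsdiff, and an in-memory list stands in for a PostgreSQL table of `bytea` rows.

```python
from difflib import SequenceMatcher


def make_delta(source: bytes, target: bytes) -> list:
    """Describe `target` as copies from `source` plus literal bytes.

    Stand-in for a real binary diff such as bsdiff (assumption).
    """
    ops = []
    matcher = SequenceMatcher(None, source, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))          # reuse source[i1:i2]
        elif j2 > j1:                             # 'replace' or 'insert'
            ops.append(("data", target[j1:j2]))   # bytes only in target
        # 'delete' contributes nothing to the target
    return ops


def apply_delta(source: bytes, ops: list) -> bytes:
    """Rebuild the target from `source` and a delta from make_delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += source[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


class RevisionStore:
    """Keeps the newest revision in full and older ones as reverse deltas."""

    def __init__(self):
        self.current = None        # full bytes of the newest revision
        self.reverse_deltas = []   # entry k rebuilds revision k from k + 1

    def add(self, data: bytes) -> int:
        """Store a new revision and return its revision number."""
        if self.current is not None:
            # Replace the old full copy with a delta: new -> previous.
            self.reverse_deltas.append(make_delta(data, self.current))
        self.current = data
        return len(self.reverse_deltas)

    def get(self, revision: int) -> bytes:
        """Fetch a revision; the newest needs no patching at all."""
        head = len(self.reverse_deltas)
        if self.current is None or not 0 <= revision <= head:
            raise KeyError(revision)
        data = self.current
        for k in range(head - 1, revision - 1, -1):
            data = apply_delta(data, self.reverse_deltas[k])
        return data


store = RevisionStore()
r0 = store.add(b"hello world")
r1 = store.add(b"hello brave world")
r2 = store.add(b"goodbye brave world!")
```

Fetching `store.get(r2)` returns the stored bytes directly, while `store.get(r0)` walks back through two patches, which matches the trade-off in the question: the current revision is fast, older ones cost more.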

Whatever you do, I think you will need to uncompress the data first before using the diff tool. Here's why:

dd if=/dev/urandom of=myfile.1 bs=1024 count=10
cp myfile.1 tmp; cat tmp >> myfile.1
cp myfile.1 tmp; cat tmp >> myfile.1
cp myfile.1 tmp; cat tmp >> myfile.1
cp myfile.1 tmp; cat tmp >> myfile.1
dd if=/dev/urandom of=myfile.2 bs=1024 count=10
cp myfile.2 tmp; cat tmp >> myfile.2
cp myfile.2 tmp; cat tmp >> myfile.2
cp myfile.2 tmp; cat tmp >> myfile.2
cp myfile.2 tmp; cat tmp >> myfile.2
cat myfile.1 >> myfile.2
bsdiff myfile.1 myfile.2 diff
gzip -c myfile.1 > myfile.1.gz
gzip -c myfile.2 > myfile.2.gz
bsdiff myfile.1.gz myfile.2.gz gz.diff
rm tmp
ls -l

-rw-r--r-- 1 root root  17115 2011-04-05 10:54 diff
-rw-r--r-- 1 root root  21580 2011-04-05 10:54 gz.diff
-rw-r--r-- 1 root root 163840 2011-04-05 10:54 myfile.1
-rw-r--r-- 1 root root  11709 2011-04-05 10:54 myfile.1.gz
-rw-r--r-- 1 root root 327680 2011-04-05 10:54 myfile.2
-rw-r--r-- 1 root root  23399 2011-04-05 10:54 myfile.2.gz

Note that gz.diff (21580 bytes) is larger than diff (17115 bytes). If you try this with real files, I expect the difference to be even larger.
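
The same effect can be reproduced without bsdiff, as a rough sketch. The assumption here is that deflating the new file with the old file as a zlib preset dictionary is an acceptable crude proxy for a delta; it is not bsdiff and the numbers are not comparable to the listing above. The helper names `delta_size` and `rand_block` are made up for this illustration, and the files are scaled down so the old file fits in zlib's 32 KiB window.

```python
import random
import zlib


def delta_size(old: bytes, new: bytes) -> int:
    """Crude delta proxy: size of `new` deflated with `old` as dictionary."""
    comp = zlib.compressobj(level=9, zdict=old)
    return len(comp.compress(new) + comp.flush())


rng = random.Random(0)  # fixed seed so repeated runs use the same data


def rand_block(n: int) -> bytes:
    """Return n pseudo-random (incompressible) bytes."""
    return bytes(rng.getrandbits(8) for _ in range(n))


# Same shape as the shell experiment: a random block repeated, and a
# second file made of a different repeated block followed by the first file.
file1 = rand_block(4096) * 4
file2 = rand_block(4096) * 4 + file1

gz1 = zlib.compress(file1)
gz2 = zlib.compress(file2)

raw_delta = delta_size(file1, file2)   # delta between uncompressed files
gz_delta = delta_size(gz1, gz2)        # delta between compressed files

print("raw delta:", raw_delta, "bytes")
print("gz delta: ", gz_delta, "bytes")
```

The delta between the uncompressed files can refer back to the shared content, whereas the compressed streams share almost no byte sequences, so the second delta ends up close to the size of the whole compressed file.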

半步萧音过轻尘 2024-11-06 05:37:31

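I really dislike reinventing the wheel. People far smarter than me have already worked out solutions for optimizing storage space, and where possible I would rather build on their hard work. That said, I would consider storing my files in a version control system such as Mercurial or Git, once I understood how they store binary data. Once you have decided which one to use, you can look at writing some stored functions in PL/Perl or a similar language that interact with the version control system and bridge the gap between the relational data in PostgreSQL and the binary files.

My only problem with this approach is that I really don't like taking a transactional system and introducing an external system (Mercurial/Git) into it. On top of that, a database backup will not back up my Mercurial or Git repository. But there will always be trade-offs, so just figure out which ones you can live with.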

About the author

许仙没带伞
