Tips for using CVS or Subversion as a backup framework for office documents
I've long been using Subversion (and before that CVS) to store not only source files but, later, the LaTeX files for my research and eventually some Word files and other materials.
I like the fact that I can work with multiple computers and synchronize the latest things from each, while still being able to maintain some hierarchy of my backups and projects.
I'm sure I can't be the only one doing it.
I am now thinking of using CVS or Subversion as the primary backup mechanism for a family computer that holds many frequently changing office documents. Is this a good or a bad idea? The main issue I can think of is that the files are treated as binary, so the repository on the server will bloat a little.
However, I would like to hear about other things I should be aware of or careful about.
In addition, where can I find good examples of scripts that automate check-ins?
Comments (8)
Actually, I don't really recommend doing this, having been down this path before. First of all, I think that it goes without saying that if you use an SCM repository for such a task, use SVN instead of CVS! This goes double for a situation like this where it is almost guaranteed that you'd be storing binary data, which is a huge pain with CVS.
Anyway, I used to store a lot of non-programming-related stuff in SVN repositories myself, but now I only use Time Machine to back up the files that I care about, plus a small web-based repo for my dotfiles and such. I think the key thing that gets in the way is that you don't really have the same attitude toward normal data files as you do toward source code. It's very unlikely, in most cases, that you're interested in diffing two versions of a report you wrote, or reverting your working copy to some draft you wrote two weeks ago. With such documents, you generally only care about the latest version, and the tools and safeguards that SCM provides tend to be more annoying than helpful in this regard, especially when it comes to check-in comments, merging, and so on.
Also, I highly un-recommend (is that a word? ;) ) making non-programmers use SCM. The amount of explanation needed is too great for the tool to be of benefit for them, especially when applied to a task which the tool was not originally intended for. I've done this in a few environments where we thought it wouldn't be a problem, since the individuals in question were not stupid, and they were dealing with artifacts related to the software. But inevitably merge conflicts and other SCM "gotchas" resulted in confusion, and ultimately, phone calls to me during the evening hours.
I'd say you should look into document-sharing portals like SharePoint for collaborating on office documents and such. They are better designed for dealing with these types of things without causing a lot of headaches for non-technical folks, and they can gracefully deal with version history, binary data, etc. This might be overkill for your family, but setting up a little portal to hold important data shouldn't be much of a problem -- you just need to look around a bit and find something that fits your needs.
You should investigate Subversion's autoversioning option (the SVNAutoversioning directive of mod_dav_svn, which applies when the repository is served over WebDAV). It allows you to, for example, set up a network share that any computer can see and write to, but which automatically performs the necessary commit whenever data is written. If someone accidentally deletes their document, it can be recovered with the usual Subversion commands.
Actually, this is not a bad idea. Subversion uses xdelta to store differences even for binary documents, so the repository does not have to hold a full copy of every version.
As for the scripts, just run svn update when you sit down at a machine, and don't forget to run svn commit when you leave. I've been doing this for my own data for several years with no problems at all.
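The update-on-arrival, commit-on-departure routine could be wrapped in a small helper. This is only a sketch: the name svn_session and the injectable run parameter are made up for illustration, and it assumes the svn command-line client is on the PATH and the directory is already a working copy.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def svn_session(working_copy, message, run=subprocess.check_call):
    """Update the working copy on entry and commit it on a clean exit.

    `run` is a hypothetical hook (not part of svn) so the commands can be
    inspected without touching a real repository.
    """
    # Pull in whatever was committed from the other machines.
    run(["svn", "update", working_copy])
    # Hand control back to the caller; if the body raises, the commit
    # below is skipped and nothing half-finished is checked in.
    yield working_copy
    # Send this session's changes to the repository.
    run(["svn", "commit", "-m", message, working_copy])
```

Used as `with svn_session("/path/to/docs", "evening edits"): ...`, it only commits files that are already under version control; new files would still need an svn add first.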
I use Mercurial for saving office documents. It does everything I need -
... and much more. Most of what it does is useful to me as a programmer, but it also keeps things simple enough for document management.
Oh, and one major benefit - on the Mac, many "documents" are actually folders that have a bunch of files in them. With SVN, you would quickly clutter these folders with files that the application deleted but SVN wants to keep around. With Mercurial, when you delete a file, it gets "deleted" for that revision, but then if you check out a previous revision, it comes back! The perfect solution for Mac apps!
http://wiki.documentfoundation.org/Libreoffice_and_subversion
The key appears to be using the .fodt (flat XML) format rather than .odt. Without compression, the traditional diff/patch system works well, and you don't have to treat all your documents as binaries!
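To see why an uncompressed XML file suits line-based tools, here is a rough illustration using Python's difflib rather than svn itself. The two document fragments are simplified, made-up stand-ins for real .fodt content, and the function name is hypothetical.

```python
import difflib


def diff_flat_documents(old_text, new_text):
    """Return a unified diff of two flat XML documents as a list of lines."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile="old.fodt",
        tofile="new.fodt",
        lineterm="",
    ))


# Simplified, illustrative fragments; a real .fodt file has far more markup.
old = "<text:p>Budget for March</text:p>\n<text:p>Rent: 900</text:p>"
new = "<text:p>Budget for March</text:p>\n<text:p>Rent: 950</text:p>"

for line in diff_flat_documents(old, new):
    print(line)
```

Only the edited paragraph shows up as a removed and an added line, which is the kind of change a text-oriented version control system can store and display compactly.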
Yes and no.
Assuming that you use version control (CVS or any other system) properly, it is a good way to keep track of old versions of files, and you are much more likely to be able to find the version of a file from just before or just after a particular change.
However, version control does not protect you against disasters, such as actual loss of your data (a power surge eating your machine with all its disks). So you need to back up your repository regularly to some safe media. Depending on how critical it is, just writing it to CDs/DVDs and to another portable disk stored somewhere else may be enough.
Also, bad things can happen to your version control repository. Due to bugs in the version control software, crashes, collisions, or the like, part of it may become corrupted. Worse, you may not actually notice until you find out one day that some versions cannot be recovered. So have some procedure for consistency-checking your repository before backups. And don't overwrite all your backups at once.
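For a Subversion repository, one possible shape for that procedure is to run svnadmin verify first and then svnadmin hotcopy into one of several rotating destinations. The sketch below assumes both commands are available on the machine hosting the repository; the slot naming, the number of slots, and the injectable run parameter are illustrative choices, not anything prescribed by Subversion.

```python
import shutil
import subprocess
from pathlib import Path


def backup_slot(backup_root, day_number, slots=3):
    """Pick one of several rotating destinations (hypothetical naming scheme)."""
    return Path(backup_root) / f"slot-{day_number % slots}"


def verify_then_backup(repo, backup_root, day_number, run=subprocess.check_call):
    """Check the repository, then copy it into a single backup slot."""
    destination = backup_slot(backup_root, day_number)
    # Fails (raises with the default runner) if any revision is unreadable,
    # so a corrupted repository never replaces a good backup.
    run(["svnadmin", "verify", "--quiet", str(repo)])
    # Only this one slot is cleared; the other slots keep older copies.
    if destination.exists():
        shutil.rmtree(destination)
    run(["svnadmin", "hotcopy", str(repo), str(destination)])
    return destination
```

Passing something like the day of the year as day_number means each run replaces only the oldest of the copies, so a problem noticed late can still be traced back to an earlier backup.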
Why not just use a backup tool for your documents? If you don't need revisions and only need the latest one, then a backup tool/scheduled backups is probably best.
If you want to get at revisions, then go ahead with your plan. The only thing I could suggest in that case is a scheduled task that does a commit at the root of all your files. The commit comment could be "automated commit [date/time]" and I would have a different user name for the tasks.
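A scheduled task along those lines might look roughly like the following. It is a minimal sketch that assumes the svn client is installed and the documents already live in a working copy; the backup-bot user name and the function names are invented for the example, and files deleted on disk are not handled (they would still need an svn delete).

```python
import subprocess
import sys
from datetime import datetime


def commit_message(now):
    """Build the 'automated commit [date/time]' comment."""
    return "automated commit " + now.strftime("%Y-%m-%d %H:%M")


def build_commands(working_copy, now, username="backup-bot"):
    """Return the svn invocations for one unattended check-in."""
    return [
        # Schedule any new, unversioned files under the root for addition.
        ["svn", "add", "--force", "--quiet", working_copy],
        # Commit everything under the root as the dedicated (made-up) user.
        ["svn", "commit", "--non-interactive", "--username", username,
         "-m", commit_message(now), working_copy],
    ]


def run_checkin(working_copy, username="backup-bot"):
    """Run the commands in order, stopping at the first failure."""
    for command in build_commands(working_copy, datetime.now(), username):
        subprocess.run(command, check=True)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Intended to be started by cron or the Windows Task Scheduler with
    # the working copy root as the only argument.
    run_checkin(sys.argv[1])
```

Keeping a separate user name for the task makes it easy to tell the automated revisions apart from manual ones in the log.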
I tried using TortoiseSVN on a domestic NAS drive, but it didn't work - I think because the disk was FAT32. I use SyncBack to maintain multiple copies of my files at home, since I'm not bothered about maintaining a revision history.