Document management, SCM?
This might not be a hard-core programming question, but I suspect it's related to some of the tools programmers use.
So we're a bunch of people, each with a bunch of documents and a bunch of different computers running a bunch of operating systems (well, only two: Linux and Windows). Ideally these documents would be available offline (a laptop might not always be online) but also synchronized between all the machines. Having a server with extra-reliable storage act as a "base repository" seems like a good idea to me.
An SCM comes to mind, and I've tried Subversion. Its centralized repository seems like a good thing, but:
- When checking out, the total size of the checkout is roughly double the original size.
- Big files or big repositories seem to slow it down.
I've also tried rsync, which might work, but it's a bit rough when it comes to handling potential conflicts.
Finally, I've tried Unison (which is a wrapper around rsync, I think). It works, but it becomes horribly slow for the big directories we have here, since it has to scan everything.
So the question is: is there an SCM tool out there that is actually practical to use for a big bunch of both small and big files?
If the answer is no, does anyone know of other tools that do this job?
Thanks for reading :)
Comments (2)
You can try one of the distributed version control systems, like Mercurial, Git or Bazaar. It seems that one of those would be perfect for what you are trying to accomplish.
Joel Spolsky has a great little Mercurial tutorial here: hginit.com. Thanks camainc.
Some details will allow us to provide a more meaningful answer. For instance:
What types of documents? Are you dealing with images, Word documents, text files? All or none of the above?
Subversion (and any source control system worth its salt) works by saving only the deltas for check-ins. That is, when you check in a file, only the differences between that file and the previous version are saved, which saves space in the repository. Checking in a 1MB Photoshop file that has a few pixels changed will take up less repository space than an entirely new document. This is typically file-type agnostic (i.e., it works for binaries as well as text).
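To make the delta idea concrete, here is a minimal Python sketch. It is not Subversion's actual delta format or algorithm; it only illustrates the principle that a new version can be described as "copy these ranges from the old version, plus a few literal bytes". The function names (`make_delta`, `apply_delta`, `literal_bytes`) and the 2 KB random "document" are made up for the illustration.

```python
import random
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list:
    """Describe `new` as copies from `old` plus literal inserted bytes."""
    delta = []
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged range: only the offsets into `old` are stored.
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Changed or new range: the literal bytes must be stored.
            delta.append(("insert", new[j1:j2]))
        # "delete" needs no entry: those old bytes are simply never copied.
    return delta


def apply_delta(old: bytes, delta: list) -> bytes:
    """Rebuild the new version from the old version and a delta."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


def literal_bytes(delta: list) -> int:
    """Count how many bytes of actual content the delta has to carry."""
    return sum(len(op[1]) for op in delta if op[0] == "insert")


# A stand-in for a binary document, and a copy with four bytes changed.
rng = random.Random(0)
old = bytes(rng.getrandbits(8) for _ in range(2048))
changed = bytearray(old)
changed[1000:1004] = b"\xff\xff\xff\xff"
new = bytes(changed)

delta = make_delta(old, new)
print(len(new), "bytes in the new version,", literal_bytes(delta), "literal bytes in the delta")
```

The delta carries only the handful of changed bytes plus some offsets, while `apply_delta(old, delta)` still reproduces the new version exactly. Real systems use far faster algorithms, and the saving is much smaller for formats that are compressed or rewritten wholesale on every save.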
If your checkouts are resulting in files that are larger than what was checked in, I'd say you have some sort of configuration or process problem. If you check in a 200KB file, you will receive a 200KB file on check out. Could you describe your checkout/modify/checkin process?
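One way to narrow this down is to measure where the space in the checkout actually goes. Subversion keeps an unmodified "pristine" copy of each checked-out file inside its `.svn` administrative directory, so a working copy can plausibly take about twice the space of the documents themselves even though each checked-out file is the size it was at check-in. The sketch below (the function name `checkout_sizes` is made up for this example) adds up the two parts separately.

```python
import os


def checkout_sizes(root: str) -> tuple:
    """Return (working_bytes, svn_admin_bytes) for the working copy at `root`."""
    working = admin = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        # Anything below a directory named ".svn" is Subversion bookkeeping.
        in_admin = ".svn" in os.path.relpath(dirpath, root).split(os.sep)
        for name in filenames:
            # lstat so that symlinks are counted as links, not followed.
            size = os.lstat(os.path.join(dirpath, name)).st_size
            if in_admin:
                admin += size
            else:
                working += size
    return working, admin
```

Calling `checkout_sizes("path/to/working/copy")` and comparing the two numbers shows whether the "double size" comes from the `.svn` data rather than from the documents having grown.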
SVN, TFS and others are used on very large scales in many different environments, and SVN in particular is an easy, free and very reliable solution. However, if your audience is predominantly non-programmers, a more user-friendly SCM may be a better choice.