Deleting a file and all of its history from an SVN repository
Is there any way to delete a file from an SVN repository, including all of its history? This comes up when I want to get rid of a large binary file residing in the repo.
I know of only one approach that might help in this situation:
- Dump the whole repo with the svnadmin utility.
- Filter the dump file with grep. Grep should match on the filename and write the result to another dump file.
- Import that last dump file with svnadmin.
But this is too complicated and unreliable. Maybe there is another solution?
This has recently become much more straightforward with the svndumpfilter command. Details are available in the Subversion documentation. Basically, to avoid conflicts, it takes a repo dump and redoes each commit, either including or excluding a given path prefix. The basic usage is to feed a dump file to svndumpfilter include or svndumpfilter exclude on standard input, followed by one or more path prefixes, and to write the filtered dump to standard output.
Exclude is probably what the question asker is looking for, but you can also use include to, say, extract a subtree of the repo so as to spin it off as its own repository.
The latest revision of Subversion in Subversion's own repository (very meta) can also take glob patterns through the --pattern option. I recently had to remove all PDFs from a repo, and it was very easily done that way.
Further usage information can be found by calling svndumpfilter help and svndumpfilter help exclude.
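The flow this answer describes can be sketched as a small shell function. This is a minimal sketch under stated assumptions, not a definitive procedure: the function name filter_repo and every path are hypothetical, and svnadmin and svndumpfilter are assumed to be on the PATH. It dumps the old repository, drops every node under the given prefixes, and loads the result into a freshly created repository.

```shell
#!/bin/sh
# Sketch only: filter_repo and all paths below are hypothetical.

# filter_repo OLD_REPO NEW_REPO [svndumpfilter options] PREFIX...
filter_repo() {
    old_repo=$1
    new_repo=$2
    shift 2                                   # the rest goes to svndumpfilter

    svnadmin create "$new_repo" || return 1   # empty target repository

    # dump -> drop the unwanted paths -> load into the new repository
    svnadmin dump --quiet "$old_repo" \
        | svndumpfilter exclude "$@" \
        | svnadmin load --quiet "$new_repo"
}

# Example calls (placeholder paths):
#   filter_repo /srv/svn/old /srv/svn/new trunk/assets/big.bin
# With Subversion 1.7 or later, a glob pattern should also work:
#   filter_repo /srv/svn/old /srv/svn/new --pattern '*.pdf'
```

The old repository is left untouched, so the result can be inspected (for example with svnlook) before anything is switched over. Note that the function's exit status is only that of the final svnadmin load, so it is worth reading the svndumpfilter messages as well.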
I don't see why this shouldn't be considered reliable. However, if you want to get rid of the file completely, history and all, no matter what the effect on the previous revisions it was part of, there is only one way to do so, and that way is indeed complicated. And rightly so. SVN is a tool with one single goal: never to lose any file, even after it has been deleted. Forcing it to do otherwise ought to be hard.
I was facing a similar issue, except that I needed to remove multiple files, not just one, and we are on Subversion 1.6, which doesn't support the --pattern option. The steps I followed:
- Back up the current SVN repository.
- Dump the repository.
- Create a new dump while excluding the first very large file.
- Create another new dump from that one while excluding the other very large file.
- Remove the old repository.
- Recreate the SVN directories.
- Recreate the SVN repository.
- Repopulate the fresh repository from the final dump.
- Copy the conf files from the saved copy into the new repository.
Now the repository should no longer contain the two large files, "file.csv" and "anotherFile.csv".
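The original commands for these steps are not shown, so the following is only a sketch of how they might look. The function name rebuild_without and all paths are hypothetical, and the backup is a plain directory copy. One deliberate difference from the steps: svndumpfilter exclude accepts several path prefixes at once, so both files are dropped in a single pass instead of two.

```shell
#!/bin/sh
# Sketch only: rebuild_without and all paths are hypothetical placeholders.

# rebuild_without REPO BACKUP_DIR PREFIX...
# BACKUP_DIR must not exist yet; it receives a full copy of REPO.
rebuild_without() {
    repo=$1
    backup=$2
    shift 2                                   # remaining args: prefixes to drop
    work=$(mktemp -d) || return 1

    cp -Rp "$repo" "$backup" || return 1      # back up the current repository
    svnadmin dump --quiet "$repo" > "$work/full.dump" || return 1
    svndumpfilter exclude "$@" \
        < "$work/full.dump" > "$work/filtered.dump" || return 1

    rm -rf "$repo"                            # remove the old repository
    svnadmin create "$repo" || return 1       # recreate it empty
    svnadmin load --quiet "$repo" < "$work/filtered.dump" || return 1

    cp -Rp "$backup/conf/." "$repo/conf/"     # restore the saved conf files
    rm -rf "$work"
}

# Example call (placeholder paths):
#   rebuild_without /srv/svn/project /srv/svn/project.bak \
#       trunk/data/file.csv trunk/data/anotherFile.csv
```

Because the old repository is deleted before the new one is loaded, the backup copy is the only safety net here; checking that it is complete before running anything like this seems prudent.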
I agree with McDowell's proposal, but would suggest that you consider replacing each large file with a text file that simply contains the hash of the removed file.
If you have a huge number of, for example, .o files from accidentally checking in a build directory, this may not be appropriate. But if you are removing a bunch of binary artifacts you don't want from a directory that also holds a bunch of binary artifacts you DO want, you are at high risk of making an expensive mistake. At a minimum, consider removing them from trunk and most branches, but leaving a feature branch full of placeholder text files with the hashes of the original binaries. That can at least be enough to figure out later what happened, to verify that a stray copy of something that shouldn't have been deleted is in fact the right file, and to put it back under revision control.
And, obviously, back the entire repo up to something read-only, like a couple of M-Discs, before you even think about doing any of this.
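The placeholder idea could look something like the sketch below. Everything here is an assumption for illustration: the function names, the ".placeholder.txt" suffix, the choice of SHA-256, and the availability of sha256sum from GNU coreutils (on other systems a tool such as shasum -a 256 would be the equivalent).

```shell
#!/bin/sh
# Sketch only: names, the suffix and the hash algorithm are assumptions.

# Write FILE.placeholder.txt recording the name and SHA-256 of FILE.
make_placeholder() {
    file=$1
    digest=$(sha256sum "$file" | cut -d' ' -f1) || return 1
    printf 'removed: %s\nsha256: %s\n' "$(basename "$file")" "$digest" \
        > "$file.placeholder.txt"
}

# Succeed only if CANDIDATE has the hash recorded in PLACEHOLDER.
verify_against_placeholder() {
    candidate=$1
    placeholder=$2
    expected=$(sed -n 's/^sha256: //p' "$placeholder")
    actual=$(sha256sum "$candidate" | cut -d' ' -f1)
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Example calls (placeholder paths):
#   make_placeholder trunk/assets/big.bin
#   verify_against_placeholder /tmp/stray-copy.bin trunk/assets/big.bin.placeholder.txt
```

The placeholder is a few dozen bytes, so keeping it under revision control costs almost nothing, and the second function covers the "is this stray copy the right file?" check mentioned above.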