Removing private information from old Git commits
I have a project versioned with Git that I'd like to make open source, but it has some private information in it that is specific to the environment in which it was originally used. I'm going to change the information in question to load from a config file which is not included in the repository. I realize I should have done this in the first place, but since the private information still exists in previous commits, how can I go about removing it from my history? Do I just have to start a new repository based on the latest commit and lose all my history or is there a way to salvage the current repository while removing any record of the private information?
Edit: To clarify, I don't want to completely remove the files that contain this private information, because they are still used. Rather, I want to remove/blank out/change the occurrence of certain strings within them.
Comments (2)
I'd recommend using the BFG Repo-Cleaner, a simpler, faster alternative to git-filter-branch specifically designed for removing private data from Git repos. The usage instructions give the steps in more detail, but the core bit is just: download the BFG's jar (needs Java 8 or above) and run it on your repository with a replacements file.
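The command itself did not survive in this copy of the answer. As a sketch based on BFG's documented `--replace-text` option, the invocation would look roughly like the following. The function name `scrub_history` and the file names in the usage line are placeholders chosen for illustration, not part of BFG.

```shell
#!/bin/sh
# Hypothetical wrapper around the BFG invocation (names are assumptions).
#   $1 = path to the downloaded BFG jar
#   $2 = replacements file, one substitution per line
#   $3 = the repository to rewrite (ideally a fresh clone)
scrub_history() {
    java -jar "$1" --replace-text "$2" "$3"
}

# Usage (placeholder file names):
#   scrub_history bfg.jar replacements.txt my-repo.git
```

BFG's usage instructions also describe working on a fresh `git clone --mirror` of the repository and, once the rewrite has finished, running `git reflog expire --expire=now --all && git gc --prune=now --aggressive` inside it so the old objects are actually dropped.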
The replacements.txt file should contain all the substitutions you want to make, one entry per line (any explanatory comments shown alongside example entries shouldn't be included in the real file).
Your entire repository history will be scanned, and all non-binary files (under 1MB in size) will have the substitutions performed: any matching string (that isn't in your latest commit) will be replaced.
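The example entries for replacements.txt were lost from this copy. As a sketch based on the format described in BFG's documentation, entries can take the forms below. The strings PASSWORD1, PASSWORD2, PASSWORD3, and examplePass are placeholders, and the trailing `#` comments are explanatory only and must be left out of the actual file.

```text
PASSWORD1                       # replace with '***REMOVED***' (the default)
PASSWORD2==>examplePass         # replace with 'examplePass' instead
PASSWORD3==>                    # replace with the empty string
regex:password=\w+==>password=  # replace using a regular expression
```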
Full disclosure: I'm the author of the BFG Repo-Cleaner.
I wrote a script for this a little while ago. You can find it here: https://gist.github.com/dound/76ea685c05c4a7895247457eb676fe69
(original writeup viewable from archive.org: https://web.archive.org/web/20160208235904/http://dound.com:80/2009/04/git-forever-remove-files-or-folders-from-history/)
The script builds on the git-filter-branch tool that comes with Git. If you're curious, you can read more about removing files from a Git repo, but using the script from the link above should be easy, and it is all you really need to remove that private information.
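The linked script is not reproduced here. As a rough illustration of the underlying git-filter-branch technique, applied to the string-replacement case from the question, the sketch below builds a throwaway repository, commits a file containing a made-up secret, and rewrites every commit with a `--tree-filter`. The file name `settings.conf`, the secret `hunter2`, and the replacement text `REMOVED` are all hypothetical.

```shell
#!/bin/sh
set -e

# Build a throwaway repository with a secret committed in its history.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

printf 'db_password = hunter2\n' > settings.conf
git add settings.conf
git commit -q -m "Add settings"
printf 'timeout = 30\n' >> settings.conf
git commit -q -a -m "Add timeout"

# Rewrite every commit on every ref: check out each tree, replace the
# secret in place, and let filter-branch record the modified tree.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -f --tree-filter '
    if [ -f settings.conf ]; then
        sed "s/hunter2/REMOVED/g" settings.conf > settings.conf.new &&
        mv settings.conf.new settings.conf
    fi
' -- --all >/dev/null 2>&1

# filter-branch keeps backups of the old refs under refs/original/.
# Delete them, then expire the reflog and prune unreachable objects.
git for-each-ref --format='%(refname)' refs/original/ |
while read -r ref; do git update-ref -d "$ref"; done
git reflog expire --expire=now --all
git gc -q --prune=now

# Count lines in the remaining history that still mention the secret.
leaks=$(git log -p --all | grep -c hunter2 || true)
echo "lines mentioning the secret in history: $leaks"
```

On a real repository this would be run on a fresh clone. Every rewritten commit gets a new hash, so existing clones have to be re-cloned or reset rather than merged.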