How to replace text in files in git history?
I've always used an interface based git client (smartGit) and thus don't have much experience with the git console.
However, I now face the need to substitute a string in all .txt files from history (so, not erasing the whole file but just substituting a string). I found the following command:
git filter-branch --tree-filter 'git ls-files -z "*.php" |xargs -0 perl -p -i -e "s#(PASSWORD1|PASSWORD2|PASSWORD3)#xXxXxXxXxXx#g"' -- --all
I tried this, and unfortunately noticed that while the password did get changed, all binary files got corrupted. Images, etc. would all be corrupted.
Is there a better way to do this that won't corrupt my binary files?
Thanks.
EDIT:
I got mixed up with something. The actual code that caused binary files to get corrupted was:
$ git filter-branch --tree-filter "find . -type f -exec sed -i -e 's/originalpassword/newpassword/g' {} \;"
The code at the top actually removed all files with my password strangely enough.
I'd recommend using the BFG Repo-Cleaner, a simpler, faster alternative to git-filter-branch specifically designed for rewriting files from Git history.
You should carefully follow these steps here: https://rtyley.github.io/bfg-repo-cleaner/#usage - but the core bit is just this: download the BFG's jar (requires Java 7 or above) and run this command (where my-repo.git is the folder name of the bare clone of your repo):
The replacements.txt file should contain all the substitutions you want to do, in a format like this (one entry per line - note the comments shouldn't be included):
Your entire repository history will be scanned, and .php files (under 1MB in size) will have the substitutions performed: any matching string (that isn't in your latest commit) will be replaced.
Full disclosure: I'm the author of the BFG Repo-Cleaner.
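The command and replacements file described above might look roughly like the sketch below. The jar name bfg.jar and the sample rules are assumptions for illustration; the rule syntax (a bare string, `==>`, and the `regex:` prefix) follows the BFG usage documentation, not anything confirmed in this thread.

```shell
# Write the replacement rules, one per line (sample values, not real secrets).
# No comments go inside the file itself.
cat > replacements.txt <<'EOF'
PASSWORD1
PASSWORD2==>examplePass
PASSWORD3==>
regex:password=\w+==>password=
EOF

# Run BFG only if the jar and the bare clone are actually present.
if [ -f bfg.jar ] && [ -d my-repo.git ]; then
    # -fi limits the replacement to files whose names match the glob.
    java -jar bfg.jar --replace-text replacements.txt -fi '*.php' my-repo.git
fi
```

In this sample, the first rule replaces PASSWORD1 with BFG's default `***REMOVED***`, the second substitutes a chosen string, the third replaces with the empty string, and the fourth uses a regex. The BFG docs also suggest running `git reflog expire --expire=now --all && git gc --prune=now --aggressive` inside the repo afterwards to drop the old objects.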
You can avoid touching undesired files by passing -name "pattern" to find.
This works for me:
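A minimal sketch of that idea, assuming GNU sed and search/replacement strings that contain no `/` or other sed metacharacters. The helper name scrub_matching is hypothetical and only there so the find call can be tried on a plain directory first.

```shell
# scrub_matching DIR PATTERN OLD NEW   (hypothetical helper; GNU sed assumed)
# Replaces OLD with NEW only in regular files whose name matches PATTERN,
# so images and other binaries with non-matching names are never opened.
scrub_matching() {
    find "$1" -type f -name "$2" -exec sed -i -e "s/$3/$4/g" {} +
}

# As a tree filter, the same find call would run once per commit, e.g.:
#   git filter-branch --tree-filter \
#     "find . -type f -name '*.php' -exec sed -i -e 's/originalpassword/newpassword/g' {} +" \
#     -- --all
```

Trying it on a scratch copy of the working tree before rewriting history makes it easy to confirm that only the intended files change.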
With Git 2.24 (Q4 2019), git filter-branch (and BFG) is deprecated.
newren/git-filter-repo does NOT do what you want.
It has an example that is ALMOST what you want in its example section:
with expressions.txt:
However, WARNING: As Hasturkun adds in the comments
Which makes sense, considering the --replace-text option is itself a blob callback.
Q1 2024, newren/git-filter-repo issue 74 proposes (from Daniil):
More info on git-filter-repo
https://stackoverflow.com/a/58252169/895245 gives the basics; here is some more info.
Install
As of git 2.5 at least, it is not shipped with mainline git, so: https://superuser.com/questions/1563034/how-do-you-install-git-filter-repo/1589985#1589985
Usage tips
Here is the more common approach I tend to use:
where:
Bash process substitution allows us to not create a file for simple replaces. If your shell does not support this feature, you just have to write it to a file instead:
HEAD makes it affect only the current branch
Modify only a range of commits
How to modify only a range of commits with git filter-repo instead of the entire branch history?
Replace using the Python API
For more complex replacements, you can use the Python API, see: How to use git filter-repo as a library with the Python module interface?
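For the usage tips above, a sketch of the "write it to a file" variant could look like this. The values my_password and xxxxxxxx are placeholders, and `--refs HEAD` is my assumption for how the rewrite is limited to the current branch; the guard is there so nothing runs unless git-filter-repo is installed and the current directory is a repository.

```shell
# Rules file for git filter-repo: one "old==>new" rule per line.
# 'my_password' and 'xxxxxxxx' are placeholder values.
printf '%s\n' 'my_password==>xxxxxxxx' > expressions.txt

# Rewrite only if git-filter-repo is installed and we are inside a repository.
if command -v git-filter-repo >/dev/null 2>&1 &&
   git rev-parse --git-dir >/dev/null 2>&1; then
    # --refs HEAD restricts the rewrite to the current branch.
    git filter-repo --replace-text expressions.txt --refs HEAD
fi
```

In Bash, the file can be skipped by using process substitution instead: `git filter-repo --replace-text <(echo 'my_password==>xxxxxxxx') --refs HEAD`. Note that git filter-repo normally refuses to run outside a fresh clone unless forced, which is a useful safety net here.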
I created a file at /usr/local/git/findsed.sh, with the following contents:
I ran the command:
Explanation of commands
When you run git filter-branch, it goes through each revision that you ever committed, one by one. --tree-filter runs the findsed.sh script on each committed revision, saves the result, then moves on to the next revision.
The find command finds a specific file or set of files and executes (-exec) the sed editor on each one. sed takes the regex after s/ and replaces each match with the string between the second / and /g (blank in my example). {} stands for the file path found by find; it is passed to sed so that sed knows which file to work on. \; just ends the -exec command.
Separating the shell script and the command into separate pieces reduces the complications with quotes ('' or "").
Peculiarities
I successfully implemented this on a Mac, and apparently the sed on Macs is a particular (older?) version. This matters, as it sometimes behaves differently. Make sure to use sed -i '', or else it appended "-e" to file names, thinking that was the suffix I wanted for my backup files. -i '' says not to make backup files and just edit the files in place.
Specifying -name 'filename.sh' helped me avoid another issue that I could not solve. There was another .sh file that ended without a newline character. sed, for some reason, would add a newline character to the end of it, even though 's/blah/blah/g' did not match anything in that file. So instead of figuring out that issue, I just told find to ignore all other files.
Additional commands that work
Additionally, I found these commands to work in the findsed.sh file (only one command at a time, not multiple, so comment out the others with #):
Enjoy!
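The script contents did not make it into this copy, so here is only a rough stand-in for what a findsed.sh along these lines might do. The function name findsed and its arguments are hypothetical; the one real point it illustrates is the GNU versus BSD/macOS difference in `sed -i` described under Peculiarities.

```shell
# Hypothetical stand-in for findsed.sh: strip a string from one named file.
# GNU sed takes "-i" alone; BSD/macOS sed needs an explicit empty suffix: -i ''.
findsed() {
    # $1 = directory, $2 = exact file name, $3 = string to remove
    if sed --version >/dev/null 2>&1; then
        # GNU sed understands --version
        find "$1" -type f -name "$2" -exec sed -i -e "s/$3//g" {} +
    else
        # BSD/macOS sed rejects --version, so use the -i '' form
        find "$1" -type f -name "$2" -exec sed -i '' -e "s/$3//g" {} +
    fi
}

# Saved as an executable script that calls: findsed . filename.sh originalpassword
# it could then be passed to filter-branch by absolute path, e.g.:
#   git filter-branch --tree-filter /usr/local/git/findsed.sh -- --all
```

Because -name matches a single file name, other files (including the one without a trailing newline) are left byte-for-byte untouched.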
Could be a shell expansion issue. If filter-branch is losing the quotes around "*.php" by the time it evaluates the command, it may be expanding to nothing, thus git ls-files -z listing all files.
You could check the filter-branch source or try different quoting tricks, but what I'd do is just make a one-line shell script that does your tree-filter and pass that script instead.
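A sketch of that wrapper-script approach, reusing the command from the question. The file name scrub-passwords.sh is a placeholder, and whether this actually resolves the behaviour reported in the question is not confirmed here.

```shell
# Write the tree filter into its own script so no quoting is lost in transit.
# The name scrub-passwords.sh is a placeholder.
cat > scrub-passwords.sh <<'EOF'
#!/bin/sh
# List tracked *.php files (NUL-separated) and rewrite the passwords in place.
git ls-files -z '*.php' |
    xargs -0 perl -p -i -e 's#(PASSWORD1|PASSWORD2|PASSWORD3)#xXxXxXxXxXx#g'
EOF
chmod +x scrub-passwords.sh

# Pass the script by absolute path, since the filter runs in a temporary dir:
#   git filter-branch --tree-filter "$PWD/scrub-passwords.sh" -- --all
```

Inside the script the glob sits in single quotes, so only git ls-files interprets it, regardless of how filter-branch evaluates the filter string.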
Since this comes up in Google for git replace text in history, and since using non-git tools is sometimes more trouble than it's worth, here's a command that will replace multi-line text all the way from ${COMMIT} onwards to HEAD.
Warning: This is NOT for beginners. It uses git filter-branch, so all of its caveats/pitfalls/etc. apply. Make sure you've committed/backed up everything you need to save, so you don't lose data.
With that said, create the alias in Bash as follows:
and you can then invoke it from Bash as follows:
Note that this performs literal text replacement, not regular expression replacement.
If you need regexes, you'll need to remove the \Q and \E in the Perl command (which perform escaping) and properly escape the strings as needed for the s/$q/$s/sgm command yourself.
And if you want to pretty-print the script, you can format it like this: