Someone took a version of Moodle (I don't know which one), applied many changes within a directory, and released it (tree here).
How can I determine which commit of the original project was most likely edited to form this tree?
This would allow me to create a branch at the appropriate commit and apply this patch there. It surely came from either the 1.8 or the 1.9 branch, probably from a release tag, but diffing against particular commits one at a time doesn't help me much.
Postmortem update: knittl's answer got me as close as I'm going to get. I first added my patch repo as the remote "foreign" (no commits in common, but that's OK), then ran diffs in loops with a couple of format options. The first used the --shortstat format:
for REV in $(git rev-list v1.9.0^..v1.9.5); do
    git diff --shortstat "$REV" f7f7ad53c8839b8ea4e7 -- mod/assignment >> ~/rdiffs.txt
    echo "$REV" >> ~/rdiffs.txt
done
The second just counted the lines of a unified diff generated with no context (-U0):
for REV in $(git rev-list v1.9.0^..v1.9.5); do
    git diff -U0 "$REV" f7f7ad53c8839b8ea4e7 -- mod/assignment | wc -l >> ~/rdiffs2.txt
    echo "$REV" >> ~/rdiffs2.txt
done
There were thousands of commits to dig through, but this one seems to be the closest match.
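To pick the closest revision out of ~/rdiffs2.txt without reading it by eye, the count and the revision that the second loop writes on alternating lines can be paired up and sorted. This is only a minimal sketch, assuming exactly that alternating layout. `closest_from_pairs` is a hypothetical helper name, not a git command. It is not suitable for ~/rdiffs.txt, because --shortstat prints nothing when there is no difference, so the lines there do not strictly alternate.

```shell
# Hypothetical helper: expects a file of alternating lines,
# a line count followed by the revision that count belongs to.
closest_from_pairs() {
    # paste joins each count/revision pair into one tab-separated line,
    # sort -n orders by the leading count, head keeps the smallest.
    paste - - < "$1" | sort -n | head -n 1
}

# Usage (file name taken from the loop above):
#   closest_from_pairs ~/rdiffs2.txt
```

Dropping the `head` shows the whole ranking, which helps when several revisions are nearly tied.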
You could write a script that diffs the given tree against a range of revisions in your repository.
Assume we first fetch the changed tree (without history) into our own repository.
We then output the diffstat (in short form) for each revision we want to match against.
Finally, look for the commit with the smallest number of changes (or use some sorting mechanism).
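The steps above could be sketched roughly as follows. This is a sketch under assumptions, not the answerer's original commands: the remote name `foreign` and the commit id come from the question's update, `<url-of-the-modified-tree>` is a placeholder, and `shortstat_total` and `rank_revisions` are hypothetical helper names. The score used here is simply insertions plus deletions as reported by --shortstat.

```shell
# Sum the insertions and deletions from `git diff --shortstat` output on stdin.
# Prints 0 for empty input (git prints nothing when the trees are identical).
shortstat_total() {
    awk '
        { for (i = 2; i <= NF; i++) if ($i ~ /^(insertion|deletion)/) total += $(i - 1) }
        END { print total + 0 }
    '
}

# Hypothetical helper: rank_revisions <candidate-tree-ish> <rev-range> [path]
# Prints "<changed-lines> <revision>" per line, smallest difference first.
# The body runs in a subshell so its variables do not leak.
rank_revisions() (
    candidate=$1
    range=$2
    path=${3:-.}   # default: everything under the current directory
    git rev-list "$range" | while read -r rev; do
        changed=$(git diff --shortstat "$rev" "$candidate" -- "$path" | shortstat_total)
        printf '%s %s\n' "$changed" "$rev"
    done | sort -n
)

# Usage, run from the top of the original repository:
#   git remote add foreign <url-of-the-modified-tree>
#   git fetch foreign
#   rank_revisions f7f7ad53c8839b8ea4e7 'v1.9.0^..v1.9.5' mod/assignment | head
```

Sorting numerically puts the best candidates first, so the top few lines are the revisions worth inspecting by hand.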
This was my solution:
Some really great solutions here!
I used something similar to try to find the closest source file revision (given a target file). It goes through all the commits in merge and reports, for the closest match to target.txt, the revision and the number of differing lines of text.
N.B. Perform this inside a new, throw-away branch: reset --hard is destructive (afaik).
The output tells you which revision was the closest match (i.e. the one with the fewest differing lines).
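A per-file comparison of this kind can also be done without touching the working tree, which avoids the reset --hard concern. The following is a minimal sketch of that alternative, not the approach described above: it reads each historical version with `git show <rev>:<path>` and counts the differing lines with a plain `diff`. `count_differing_lines` and `rank_file_revisions` are hypothetical helper names, and the path is assumed to be given relative to the repository root.

```shell
# Count the lines that differ between two files
# (the "<" and ">" lines of a plain diff). Prints 0 for identical files.
count_differing_lines() {
    diff "$1" "$2" | grep -c '^[<>]' || true
}

# Hypothetical helper: rank_file_revisions <path-from-repo-root> <target-file>
# Prints "<differing-lines> <revision>" per line, closest match first.
rank_file_revisions() (
    file=$1
    target=$2
    tmp=$(mktemp)
    # Only commits that touched the file can change the result.
    git log --format=%H -- "$file" | while read -r rev; do
        # Skip commits in which the file does not exist (e.g. it was deleted).
        if git show "$rev:$file" > "$tmp" 2>/dev/null; then
            printf '%s %s\n' "$(count_differing_lines "$tmp" "$target")" "$rev"
        fi
    done | sort -n
    rm -f "$tmp"
)

# Usage, run from the top of the repository (both paths are examples):
#   rank_file_revisions mod/assignment/lib.php /path/to/target.txt | head
```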
How about using git to create a patch from each version of 1.8 and 1.9 to this new release?
Then you could see which patch makes the most sense.
For example, if the patch removes many methods, then the base is probably not that release but an earlier one. If the patch has many sections that don't make sense as a single edit, then it probably isn't that release either.
And so on... Unfortunately, no algorithm can do this perfectly. It will have to be some heuristic.
How about using 'git blame'? It will show you, for each line, who last changed it and in which revision.