Can Git really track the movement of a single function from one file to another? If so, how?

Posted on 2024-10-16 09:51:58

Several times, I have come across the statement that, if you move a single function from one file to another file, Git can track it. For example, this entry says, "Linus says that if you move a function from one file to another, Git will tell you the history of that single function across the move."

But I know a little about Git's under-the-hood design, and I don't see how this is possible. So I'm wondering ... is this a correct statement? And if so, how is this possible?

My understanding is that Git stores each file's contents as a Blob, and each Blob has a globally unique identity which arises from the SHA hash of its contents and size. Git then represents folders as Trees. Any filename information belongs to the Tree, not to the Blob, so a file rename for example shows up as a change to a Tree, not to a Blob.
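The point that names live in Trees rather than Blobs can be checked directly. The following is a minimal sketch, assuming git is installed; the scratch directory and the file contents are made up for illustration. It asks Git for the blob ID of two differently named files with identical contents, and then again after one of them is edited.

```shell
# Scratch directory; everything in it is hypothetical demo data.
work=$(mktemp -d)
cd "$work" || exit 1

# Two files with different names but identical contents.
printf 'int answer(void) { return 42; }\n' > foo
cp foo bar

# git hash-object prints the blob ID Git would assign to each file.
id_foo=$(git hash-object foo)
id_bar=$(git hash-object bar)
echo "foo:            $id_foo"
echo "bar:            $id_bar"

# Editing the contents produces a different blob ID.
printf 'int answer(void) { return 43; }\n' > bar
id_changed=$(git hash-object bar)
echo "bar after edit: $id_changed"
```

The first two IDs should match, because the file name is not part of what gets hashed, while the third should differ.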

So if I have a file called "foo" with 20 functions in it, and a file called "bar" with 5 functions in it, and I move one of the functions from foo into bar (resulting in 19 and 6, respectively), how can Git detect that I moved that function from one file to another?

From my understanding, this would cause 2 new blobs to exist (one for the modified foo and one for the modified bar). I realize a diff could be calculated to show that the function was moved from one file to the other. But I don't see how history about the function could possibly become associated with bar instead of foo (not automatically, anyway).

If Git were to actually look inside of single files, and compute a blob per function (which would be crazy / infeasible, because you'd have to know how to parse any possible language), then I could see how this might be possible.

So ... is the statement correct or not? And if it is correct, then what is lacking in my understanding?

Comments (5)

鹊巢 2024-10-23 09:51:58

This functionality is provided through git blame -C <file>.

The -C option makes git try to match chunks of text added to or deleted from the file under review against the other files modified in the same commit. Repeating the option (-C -C, or -C -C -C) widens the search.

Try it yourself in a test repo with git blame -C and you'll see that the block of code you just moved is attributed to the original file it came from.

From the git help blame manual page:

The origin of lines is automatically followed across whole-file renames (currently there is no option to turn the rename-following off). To follow lines moved from one file to another, or to follow lines that were copied and pasted from another file, etc., see the -C and -M options.
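A throwaway repository makes this easy to try. The sketch below assumes git is installed; the file names foo and bar follow the question, while the function, the commit messages and the demo identity are invented for illustration. The function is kept fairly long on purpose, because blame's copy detection ignores very short chunks.

```shell
# Throwaway repository; all names and contents are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo User"

# Commit 1: the function lives in foo.
cat > foo <<'EOF'
int keep_in_foo(void) { return 1; }

int moved_function(int first_argument, int second_argument)
{
    int intermediate_result = first_argument * second_argument;
    return intermediate_result + first_argument + second_argument;
}
EOF
cat > bar <<'EOF'
int already_in_bar(void) { return 2; }
EOF
git add foo bar
git commit -q -m "add foo and bar"

# Commit 2: move the function from foo to bar.
cat > foo <<'EOF'
int keep_in_foo(void) { return 1; }
EOF
cat >> bar <<'EOF'

int moved_function(int first_argument, int second_argument)
{
    int intermediate_result = first_argument * second_argument;
    return intermediate_result + first_argument + second_argument;
}
EOF
git commit -q -am "move moved_function from foo to bar"

echo "--- plain blame ---"
git blame -s bar
echo "--- blame -C ---"
git blame -s -C bar

# Count the lines of bar that blame -C traces back to foo.
from_foo=$(git blame -C --line-porcelain bar | grep -c '^filename foo$')
echo "lines traced back to foo: $from_foo"
```

In the plain output every moved line is blamed on the second commit; with -C the moved lines should instead point at the first commit and name foo as their origin.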

檐上三寸雪 2024-10-23 09:51:58

As of Git 2.15, git diff now supports detection of moved lines with the --color-moved option. It works for moves across files.

It works, obviously, for colorized terminal output. As far as I can tell, there is no option to indicate moves in plain text patch format, but that makes sense.

For default behavior, try

git diff --color-moved

The option also takes a mode, which at the time of writing was one of no, default, plain, zebra and dimmed_zebra (newer Git releases spell the last one dimmed-zebra and add blocks; use git help diff to get the current modes and their descriptions). For example:

git diff --color-moved=zebra

As to how it is done, you can glean some understanding from this email exchange by the author of the functionality.
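To see the effect without touching a real project, a scratch repository is enough. This is a sketch under assumptions: git 2.15 or later is installed, and the file names, function and commit messages are invented for the demo. It diffs the same commit twice, once with move detection and once without, so the two outputs can be compared.

```shell
# Throwaway repository; all names and contents are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo User"

cat > foo <<'EOF'
int keep_in_foo(void) { return 1; }

int moved_function(int first_argument, int second_argument)
{
    int intermediate_result = first_argument * second_argument;
    return intermediate_result + first_argument + second_argument;
}
EOF
printf 'int already_in_bar(void) { return 2; }\n' > bar
git add foo bar
git commit -q -m "add foo and bar"

# Move the function: cut it out of foo and append it to bar.
tail -n +3 foo >> bar
head -n 1 foo > foo.tmp && mv foo.tmp foo
git commit -q -am "move moved_function from foo to bar"

# Same diff, with and without moved-line detection.
# --color=always forces color codes even when output is captured.
with_moved=$(git diff --color=always --color-moved=zebra HEAD~1 HEAD)
without_moved=$(git diff --color=always --color-moved=no HEAD~1 HEAD)

printf '%s\n' "$with_moved"
```

On a color terminal the moved lines should be painted differently from ordinary additions and deletions, in both foo and bar; the exact colors depend on the local color.diff settings.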

第几種人 2024-10-23 09:51:58

A bit of this functionality is in git gui blame (+ filename). It annotates each line of a file with when it was created and when it was last changed. For code moved across files, it shows the commit from the original file as the creation, and the commit where it was added to the current file as the last change. Try it.

What I really want is to give git log a line number range in addition to a file path, so that it shows the history of just that code block. I could not find such an option in the documentation I checked. Yes, from Linus' statement I too would think such a command should be readily available.
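Git 1.8.4 and later do provide a line-range form of this, git log -L <start>,<end>:<file>, which limits the log to commits that touched those lines of that file. The sketch below assumes git is installed; the file, functions and commit messages are invented for the demo. It only follows the range inside the named file, so it complements git blame -C rather than replacing it.

```shell
# Throwaway repository; all names and contents are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo User"

# Helper: write foo with the two return values given as arguments.
write_foo() {
    cat > foo <<EOF
int first(void)
{
    return $1;
}

int second(void)
{
    return $2;
}
EOF
}

write_foo 1 2
git add foo
git commit -q -m "add both functions"

write_foo 1 20          # touches line 8, inside second()
git commit -q -am "change second"

write_foo 10 20         # touches line 3, inside first()
git commit -q -am "change first"

# History of lines 6-9 of foo, i.e. the body of second().
# The custom format prefix makes the commit lines easy to pick out
# from the patch text that -L prints alongside them.
history=$(git log --format='commit:%s' -L 6,9:foo)
printf '%s\n' "$history" | grep '^commit:'
```

Only the commits that changed lines inside the chosen range should be listed; the commit that edited first() is left out.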

风为裳 2024-10-23 09:51:58

git doesn't actually track renames at all. A rename is just a delete and an add, that's all. Any tool that shows renames reconstructs them from this history information.

As such, tracking function renames is a simple matter of analyzing the diffs of all files in each commit after the fact. There's nothing particularly impossible about it; the existing rename tracking already handles 'fuzzy' renames, in which the file is changed as well as renamed, and this requires looking at the contents of the files. It would be a simple extension to look for function renames as well.
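The "delete plus add, reconstructed afterwards" behavior can be observed with git diff itself. This is a minimal sketch, assuming git is installed; the file names and contents are invented for the demo. The same commit is shown twice: once with rename detection switched off, and once with -M, which compares file contents to find renames even when the file was also edited.

```shell
# Throwaway repository; all names and contents are hypothetical.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo User"

cat > foo <<'EOF'
line one of the file
line two of the file
line three of the file
line four of the file
line five of the file
EOF
git add foo
git commit -q -m "add foo"

# Rename the file and edit it in the same commit (a "fuzzy" rename).
git mv foo baz
echo "line six, added during the rename" >> baz
git add baz
git commit -q -m "rename foo to baz and edit it"

# Without rename detection: one deleted path and one added path.
as_stored=$(git diff --no-renames --name-status HEAD~1 HEAD)
# With -M: git compares contents and reports a rename with a similarity score.
as_detected=$(git diff -M --name-status HEAD~1 HEAD)

echo "--- without rename detection ---"
printf '%s\n' "$as_stored"
echo "--- with -M ---"
printf '%s\n' "$as_detected"
```

The first listing should show foo as deleted and baz as added, while the second should collapse them into a single rename entry.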

I don't know if the base git tools actually do this however - they try to be language neutral, and function identification is very much not language neutral.

丘比特射中我 2024-10-23 09:51:58

There's git diff, which will show you that certain lines disappeared from foo and reappeared in bar. If there are no other changes to these files in the same commit, the move will be easy to spot.

An intelligent git client would be able to show you how lines moved from one file to another. A language-aware IDE would be able to map this change to a particular function.

A very similar thing happens when a file gets renamed. It just disappears under one name and reappears under another, but any reasonable tool is able to notice this and represent it as a rename.
