Git 中的行结尾混乱 - 如何在修复大量行结尾后跟踪另一个分支的更改?

发布于 2024-07-24 10:02:49 字数 419 浏览 8 评论 0原文

我们正在使用定期更新的第三方 PHP 引擎。 版本保存在 git 中的单独分支上,我们的分支是主分支。

这样我们就可以将新版本的引擎的补丁应用到我们的分支上。

我的问题是,在对我们的分支进行多次提交后,我意识到引擎的初始导入是用 CRLF 行结尾完成的。

我将每个文件都转换为 LF,但这做出了巨大的提交,删除了 100k 行并添加了 100k 行,这显然破坏了我们的初衷:轻松合并来自第 3 方引擎的工厂版本的补丁。

我能知道什么? 我怎样才能解决这个问题? 我已经在我们的分叉上进行了数百次提交。

最好的办法是在初始导入之后和分支我们自己的分支之前以某种方式进行行结尾修复提交,并在历史稍后删除那个巨大的行结尾提交。

但是我不知道如何在 Git 中执行此操作。

谢谢!

We are working with a 3rd party PHP engine that gets regular updates. The releases are kept on a separate branch in git, and our fork is the master branch.

This way we'll be able to apply patches to our fork from the new releases of the engine.

My problem is, after many commits to our branch, I realized that the initial import of the engine was done with CRLF line endings.

I converted every file to LF, but this made a huge commit, with 100k lines removed and 100k lines added, which obviously breaks what we intended to do: easily merge in patches from the factory releases of that 3rd party engine.

What whould I do know? How can I fix this? I already have hundreds of commits on our fork.

What would be good is to somehow do a line endings fix commit after the initial import and before branching our own fork, and removing that huge line ending commit later in history.

However I have no idea how to do this in Git.

Thanks!

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(5

自在安然 2024-07-31 10:02:49

我终于设法解决了它。

答案是:

git filter-branch --tree-filter '~/Scripts/fix-line-endings.sh' -- --all

fix-line-endings.sh 包含:

#!/bin/sh
find . -type f -a \( -name '*.tpl' -o -name '*.php' -o -name '*.js' -o -name '*.css' -o -name '*.sh' -o -name '*.txt' -iname '*.html' \) | xargs fromdos

在所有提交中的所有树中修复所有行结尾之后,我进行了交互式变基并删除了所有修复行结尾的提交。

现在我的存储库是干净和新鲜的,准备好推送了:)

访客请注意:如果您的存储库已被推送/克隆,请不要这样做,因为它会把事情搞得一团糟!

I finally managed to solve it.

The answer is:

git filter-branch --tree-filter '~/Scripts/fix-line-endings.sh' -- --all

fix-line-endings.sh contains:

#!/bin/sh
find . -type f -a \( -name '*.tpl' -o -name '*.php' -o -name '*.js' -o -name '*.css' -o -name '*.sh' -o -name '*.txt' -iname '*.html' \) | xargs fromdos

After all line endings were fixed in all trees in all commits, I did an interactive rebase and removed all commits that were fixing line endings.

Now my repo is clean and fresh, ready to be pushed :)

Note to visitors: do not do this if your repo has been pushed / cloned because it will mess things up badly!

桃扇骨 2024-07-31 10:02:49

你看过git rebase吗?

您将需要重新设置存储库的历史记录,如下所示:

  • 提交行终止符修复
  • 启动重新设置
  • 离开第三方导入 提交首先
  • 应用行终止符修复
  • 应用您的其他补丁

但您需要了解的是这将破坏所有下游存储库 - 那些从父存储库克隆的存储库。 理想情况下,您将从头开始。


更新:示例用法:

target=`git rev-list --max-count=3 HEAD | tail -n1`
get rebase -i $target

将为最后 3 次提交启动变基会话。

Did you look at git rebase?

You will need to re-base the history of your repository, as follows:

  • commit the line terminator fixes
  • start the rebase
  • leave the third-party import commit first
  • apply the line terminator fixes
  • apply your other patches

What you do need to understand though is that this will break all downstream repositories - those that are cloned from your parent repo. Ideally you will start from scratch with those.


Update: sample usage:

target=`git rev-list --max-count=3 HEAD | tail -n1`
get rebase -i $target

Will start a rebase session for the last 3 commits.

窗影残 2024-07-31 10:02:49

展望未来,请通过 core.autocrlf 设置避免此问题,记录在 git config --help

core.autocrlf

如果为 true,则使 git 在从文件系统读取时将文本文件行尾的 CRLF 转换为 LF,并在写入文件系统时反向转换。 该变量可以设置为 input,在这种情况下,仅在从文件系统读取时才会发生转换,但文件会在行尾使用 LF 写出。 根据文件的 crlf 属性,或者如果 crlf,文件被视为“文本”(autocrlf 机制的约束) 未指定,具体取决于文件的内容。 请参阅 gitattributes

Going forward, avoid this problem with the core.autocrlf setting, documented in git config --help:

core.autocrlf

If true, makes git convert CRLF at the end of lines in text files to LF when reading from the filesystem, and convert in reverse when writing to the filesystem. The variable can be set to input, in which case the conversion happens only while reading from the filesystem but files are written out with LF at the end of lines. A file is considered "text" (i.e. be subjected to the autocrlf mechanism) based on the file's crlf attribute, or if crlf is unspecified, based on the file's contents. See gitattributes.

亽野灬性zι浪 2024-07-31 10:02:49

我们将来会通过以下方式避免这个问题:

1)每个人都使用一个删除尾随空格的编辑器,并且我们用 LF 保存所有文件。

2) 如果 1) 失败(它可以 - 有人出于某种原因意外地将其保存在 CRLF 中),我们有一个预提交脚本来检查 CRLF 字符:

#!/bin/sh
#
# An example hook script to verify what is about to be committed.
# Called by git-commit with no arguments.  The hook should
# exit with non-zero status after issuing an appropriate message if
# it wants to stop the commit.
#
# To enable this hook, rename this file to "pre-commit" and set executable bit

# original by Junio C Hamano

# modified by Barnabas Debreceni to disallow CR characters in commits


if git rev-parse --verify HEAD 2>/dev/null
then
    against=HEAD
else
    # Initial commit: diff against an empty tree object
    against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
fi

crlf=0

IFS="
"
for FILE in `git diff-index --cached $against`
do
    fhash=`echo $FILE | cut -d' ' -f4`
    fname=`echo $FILE | cut -f2`

    if git show $fhash | grep -EUIlq 

该脚本使用 GNU grep,并且可以在 Mac OS X 上运行,但它应该是在其他平台上使用之前进行了测试(我们在使用 Cygwin 和 BSD grep 时遇到了问题)

3)如果我们发现任何空格错误,我们对错误文件使用以下脚本:

#!/usr/bin/env php
<?php

    // Remove various whitespace errors and convert to LF from CRLF line endings
    // written by Barnabas Debreceni
    // licensed under the terms of WFTPL (http://en.wikipedia.org/wiki/WTFPL)

    // handle no args
    if( $argc <2 ) die( "nothing to do" );


    // blacklist

    $bl = array( 'smarty' . DIRECTORY_SEPARATOR . 'templates_c' . DIRECTORY_SEPARATOR . '.*' );

    // whitelist

    $wl = array(    '\.tpl', '\.php', '\.inc', '\.js', '\.css', '\.sh', '\.html', '\.txt', '\.htc', '\.afm',
                    '\.cfm', '\.cfc', '\.asp', '\.aspx', '\.ascx' ,'\.lasso', '\.py', '\.afp', '\.xml',
                    '\.htm', '\.sql', '\.as', '\.mxml', '\.ini', '\.yaml', '\.yml'  );

    // remove $argv[0]
    array_shift( $argv );

    // make file list
    $files = getFileList( $argv );

    // sort files
    sort( $files );

    // filter them for blacklist and whitelist entries

    $filtered = preg_grep( '#(' . implode( '|', $wl ) . ')$#', $files );
    $filtered = preg_grep( '#(' . implode( '|', $bl ) . ')$#', $filtered, PREG_GREP_INVERT );

    // fix whitespace errors
    fix_whitespace_errors( $filtered );





    ///////////////////////////////////////////////////////////////////////////////////////////////
    ///////////////////////////////////////////////////////////////////////////////////////////////


    // whitespace error fixer
    function fix_whitespace_errors( $files ) {
        foreach( $files as $file ) {

            // read in file
            $rawlines = file_get_contents( $file );

            // remove \r
            $lines = preg_replace( "/(\r\n)|(\n\r)/m", "\n", $rawlines );
            $lines = preg_replace( "/\r/m", "\n", $lines );

            // remove spaces from before tabs
            $lines = preg_replace( "/\040+\t/m", "\t", $lines );

            // remove spaces from line endings
            $lines = preg_replace( "/[\040\t]+$/m", "", $lines );

            // remove tabs from line endings
            $lines = preg_replace( "/\t+$/m", "", $lines );

            // remove EOF newlines
            $lines = preg_replace( "/\n+$/", "", $lines );

            // write file if changed and set old permissions
            if( strlen( $lines ) != strlen( $rawlines )){

                $perms = fileperms( $file );

                // Uncomment to save original files

                //rename( $file, $file.".old" );
                file_put_contents( $file, $lines);
                chmod( $file, $perms );
                echo "${file}: FIXED\n";
            } else {
                echo "${file}: unchanged\n";
            }

        }
    }

    // get file list from argument array
    function getFileList( $argv ) {
        $files = array();
        foreach( $argv as $arg ) {
          // is a direcrtory
            if( is_dir( $arg ) )  {
                $files = array_merge( $files, getDirectoryTree( $arg ) );
            }
            // is a file
            if( is_file( $arg ) ) {
                $files[] = $arg;
            }
        }
        return $files;
    }

    // recursively scan directory
    function getDirectoryTree( $outerDir ){
        $outerDir = preg_replace( ':' . DIRECTORY_SEPARATOR . '$:', '', $outerDir );
        $dirs = array_diff( scandir( $outerDir ), array( ".", ".." ) );
        $dir_array = array();
        foreach( $dirs as $d ){
            if( is_dir( $outerDir . DIRECTORY_SEPARATOR . $d ) ) {
                $otherdir = getDirectoryTree( $outerDir . DIRECTORY_SEPARATOR . $d );
                $dir_array = array_merge( $dir_array, $otherdir );
            }
            else $dir_array[] = $outerDir . DIRECTORY_SEPARATOR . $d;
        }
        return $dir_array;
    }
?>
\r

该脚本使用 GNU grep,并且可以在 Mac OS X 上运行,但它应该是在其他平台上使用之前进行了测试(我们在使用 Cygwin 和 BSD grep 时遇到了问题)

3)如果我们发现任何空格错误,我们对错误文件使用以下脚本:



    then
        echo $fname contains CRLF characters
        crlf=1
    fi
done

if [ $crlf -eq 1 ]
then
    echo Some files have CRLF line endings. Please fix it to be LF and try committing again.
    exit 1
fi

exec git diff-index --check --cached $against --

该脚本使用 GNU grep,并且可以在 Mac OS X 上运行,但它应该是在其他平台上使用之前进行了测试(我们在使用 Cygwin 和 BSD grep 时遇到了问题)

3)如果我们发现任何空格错误,我们对错误文件使用以下脚本:

we are avoiding this problem in the future with:

1) everyone uses an editor which strips trailing whitespaces, and we save all files with LF.

2) if 1) fails (it can - someone accidentally saves it in CRLF for whatever reason) we have a pre-commit script that checks for CRLF chars:

#!/bin/sh
#
# An example hook script to verify what is about to be committed.
# Called by git-commit with no arguments.  The hook should
# exit with non-zero status after issuing an appropriate message if
# it wants to stop the commit.
#
# To enable this hook, rename this file to "pre-commit" and set executable bit

# original by Junio C Hamano

# modified by Barnabas Debreceni to disallow CR characters in commits


if git rev-parse --verify HEAD 2>/dev/null
then
    against=HEAD
else
    # Initial commit: diff against an empty tree object
    against=4b825dc642cb6eb9a060e54bf8d69288fbee4904
fi

crlf=0

IFS="
"
for FILE in `git diff-index --cached $against`
do
    fhash=`echo $FILE | cut -d' ' -f4`
    fname=`echo $FILE | cut -f2`

    if git show $fhash | grep -EUIlq 

This script uses GNU grep, and works on Mac OS X, however it should be tested before use on other platforms (we had problems with Cygwin and BSD grep)

3) In case we find any whitespace errors, we use the following script on erroneous files:

#!/usr/bin/env php
<?php

    // Remove various whitespace errors and convert to LF from CRLF line endings
    // written by Barnabas Debreceni
    // licensed under the terms of WFTPL (http://en.wikipedia.org/wiki/WTFPL)

    // handle no args
    if( $argc <2 ) die( "nothing to do" );


    // blacklist

    $bl = array( 'smarty' . DIRECTORY_SEPARATOR . 'templates_c' . DIRECTORY_SEPARATOR . '.*' );

    // whitelist

    $wl = array(    '\.tpl', '\.php', '\.inc', '\.js', '\.css', '\.sh', '\.html', '\.txt', '\.htc', '\.afm',
                    '\.cfm', '\.cfc', '\.asp', '\.aspx', '\.ascx' ,'\.lasso', '\.py', '\.afp', '\.xml',
                    '\.htm', '\.sql', '\.as', '\.mxml', '\.ini', '\.yaml', '\.yml'  );

    // remove $argv[0]
    array_shift( $argv );

    // make file list
    $files = getFileList( $argv );

    // sort files
    sort( $files );

    // filter them for blacklist and whitelist entries

    $filtered = preg_grep( '#(' . implode( '|', $wl ) . ')$#', $files );
    $filtered = preg_grep( '#(' . implode( '|', $bl ) . ')$#', $filtered, PREG_GREP_INVERT );

    // fix whitespace errors
    fix_whitespace_errors( $filtered );





    ///////////////////////////////////////////////////////////////////////////////////////////////
    ///////////////////////////////////////////////////////////////////////////////////////////////


    // whitespace error fixer
    function fix_whitespace_errors( $files ) {
        foreach( $files as $file ) {

            // read in file
            $rawlines = file_get_contents( $file );

            // remove \r
            $lines = preg_replace( "/(\r\n)|(\n\r)/m", "\n", $rawlines );
            $lines = preg_replace( "/\r/m", "\n", $lines );

            // remove spaces from before tabs
            $lines = preg_replace( "/\040+\t/m", "\t", $lines );

            // remove spaces from line endings
            $lines = preg_replace( "/[\040\t]+$/m", "", $lines );

            // remove tabs from line endings
            $lines = preg_replace( "/\t+$/m", "", $lines );

            // remove EOF newlines
            $lines = preg_replace( "/\n+$/", "", $lines );

            // write file if changed and set old permissions
            if( strlen( $lines ) != strlen( $rawlines )){

                $perms = fileperms( $file );

                // Uncomment to save original files

                //rename( $file, $file.".old" );
                file_put_contents( $file, $lines);
                chmod( $file, $perms );
                echo "${file}: FIXED\n";
            } else {
                echo "${file}: unchanged\n";
            }

        }
    }

    // get file list from argument array
    function getFileList( $argv ) {
        $files = array();
        foreach( $argv as $arg ) {
          // is a direcrtory
            if( is_dir( $arg ) )  {
                $files = array_merge( $files, getDirectoryTree( $arg ) );
            }
            // is a file
            if( is_file( $arg ) ) {
                $files[] = $arg;
            }
        }
        return $files;
    }

    // recursively scan directory
    function getDirectoryTree( $outerDir ){
        $outerDir = preg_replace( ':' . DIRECTORY_SEPARATOR . '$:', '', $outerDir );
        $dirs = array_diff( scandir( $outerDir ), array( ".", ".." ) );
        $dir_array = array();
        foreach( $dirs as $d ){
            if( is_dir( $outerDir . DIRECTORY_SEPARATOR . $d ) ) {
                $otherdir = getDirectoryTree( $outerDir . DIRECTORY_SEPARATOR . $d );
                $dir_array = array_merge( $dir_array, $otherdir );
            }
            else $dir_array[] = $outerDir . DIRECTORY_SEPARATOR . $d;
        }
        return $dir_array;
    }
?>
\r

This script uses GNU grep, and works on Mac OS X, however it should be tested before use on other platforms (we had problems with Cygwin and BSD grep)

3) In case we find any whitespace errors, we use the following script on erroneous files:



    then
        echo $fname contains CRLF characters
        crlf=1
    fi
done

if [ $crlf -eq 1 ]
then
    echo Some files have CRLF line endings. Please fix it to be LF and try committing again.
    exit 1
fi

exec git diff-index --check --cached $against --

This script uses GNU grep, and works on Mac OS X, however it should be tested before use on other platforms (we had problems with Cygwin and BSD grep)

3) In case we find any whitespace errors, we use the following script on erroneous files:

旧城空念 2024-07-31 10:02:49

One solution (not necessarily the best one) would be to use git-filter-branch to rewrite history to always use correct line endings. This should be better solution that interactive rebase, at least for larger number of commits; also it might be easier to deal with merges using git-filter-branch.

That is of course assuming that history was not published (repository was not cloned).

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文