git: How do I split a library off from a project? filter-branch, subtree?

Posted 2024-11-16 07:23:13 · 2374 characters · 6 views · 0 comments

I have a larger (closed-source) project, and in the context of this project I created a library that I think could also be useful elsewhere.

I now want to split the library off into its own project, which could be published as open source on GitHub or similar. Of course, the library (and its history there) should contain no traces of our project.

git-subtree seems like a solution here, but it does not completely fit.

My directory layout is something like this (since it is a Java project):

  • fencing-game (git workdir)
    • src
      • de
        • fencing_game
          • transport (my library)
            • protocol (part of the library)
            • fencing (part of the main project interfacing with the library)
            • client (part of the main project interfacing with the library)
            • server (part of the main project interfacing with the library)
          • client (part of the main project)
          • server (part of the main project)
          • ... (part of the main project)
    • other files and directories (build system, website and such - part of the main project)

After the split, I want the library's directory layout to look like this (including any files directly in the directories marked as part of the library):

  • my-library (name to be determined)
    • src
      • de
        • fencing_game
          • transport (my library)
            • protocol (part of the library)

The history should also contain just the part of the main project's history which touches this part of the repository.

A first look turned up git-subtree split --prefix=src/de/fencing_game/transport, but this will

  1. give me a tree rooted in transport (which will not compile) and
  2. include the transport/client, transport/server and transport/fencing directories.

The first point could be mitigated by using git subtree add --prefix=src/de/fencing_game/transport <commit> on the receiving side, but I don't think git-subtree can do much to avoid exporting these subdirectories as well. (Its whole idea is to share the complete tree.)

Do I have to use git filter-branch here?

After the split, I want to be able to import the library back into my main project, using either git-subtree or git-submodule, in a separate subdirectory rather than where it is now. I imagine the layout like this:

  • fencing-game (git workdir)
    • src
      • de
        • fencing_game
          • transport (empty)
            • fencing (part of the main project interfacing with the library)
            • client (part of the main project interfacing with the library)
            • server (part of the main project interfacing with the library)
          • client (part of the main project)
          • server (part of the main project)
          • ... (part of the main project)
    • my-library
      • src
        • de
          • fencing_game
            • transport (my library)
              • protocol (part of the library)
    • other files and directories (build system, website and such - part of the main project)

What would be the most pain-free way to do this? Are there other tools than git-subtree and git-filter-branch for this goal?

Comments (5)

岁月无声 2024-11-23 07:23:13

I think you've got some real spelunking to do. If you just want to split off "protocol", you can do that with "git subtree split ..." or "git filter-branch ...":

    git filter-branch --subdirectory-filter src/de/fencing_game/transport/protocol -- --all

But if you have files in transport as well as transport/protocol, it starts to get hairy.

I wrote some custom tools to do this for a project I was on. They're not published anywhere, but you can do something similar with reposurgeon.
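
To see what --subdirectory-filter actually produces before touching the real repository, it can help to try it on a throwaway repo first. The following is only a sketch: the demo repository, the file names and the committer identity are made up for illustration, and FILTER_BRANCH_SQUELCH_WARNING merely silences the advisory that newer Git versions print.

```shell
#!/bin/sh
# Sketch: try --subdirectory-filter on a throwaway repository.
# The repository layout and file names below are hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the filter-branch advisory in newer Git

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email demo@example.com
git config user.name "Demo"

mkdir -p src/de/fencing_game/transport/protocol src/de/fencing_game/client
echo lib  > src/de/fencing_game/transport/protocol/Message.java
echo main > src/de/fencing_game/client/Main.java
git add .
git commit -q -m "library and main project together"
echo more >> src/de/fencing_game/client/Main.java
git commit -q -am "main project only"

# Rewrite history so that protocol/ becomes the repository root.
git filter-branch --subdirectory-filter src/de/fencing_game/transport/protocol -- --all

git ls-files        # files that survived the rewrite
git log --oneline   # commits that touched protocol/
```

The result has protocol/ as its root, which is exactly the limitation described above: files that live directly in transport are not part of it.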

花伊自在美 2024-11-23 07:23:13

Splitting a subtree mixed with files from the parent project

This seems to be a common request, but I don't think there's a simple answer when the folders are mixed together like that.

The general method I suggest to split out the library mixed in with other folders is this:

  1. Make a branch with the new root for the library:

    git subtree split -P src/de/fencing_game -b temp-br
    git checkout temp-br
    
    # -or-, if you really want to keep the full path, stay at the repository root:
    
    git checkout -b temp-br
    
  2. Then use something to re-write history to remove the parts that aren't part of the library. I'm no expert on this, but I was able to experiment and found something like this to work. Run filter-branch from the top level of the working tree; the paths in the index filter are relative to the repository root (so prefix them with src/de/fencing_game/ if you kept the full path):

    git filter-branch --tag-name-filter cat --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch client server otherstuff' HEAD
    
    # also clear out stuff from the sub dir
    git filter-branch --tag-name-filter cat --prune-empty --index-filter 'git rm -rf --cached --ignore-unmatch transport/fencing transport/client transport/server' HEAD
    

    Note: You might need to delete the back-up made by filter-branch between successive commands.

    git update-ref -d refs/original/refs/heads/temp-br
    
  3. Lastly, just create a new repo for the library and pull in everything that's left:

    cd <new-lib-repo>
    git init
    git pull <original-repo> temp-br
    

I recommend that your final library path be more like /transport/protocol instead of the full parent project path since that seems kind of tied to the project.
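
If the full src/de/fencing_game/transport path should be kept, as in the layout the question asks for, the steps above can probably be folded into a single pass. The sketch below is an assumption-laden variant, not the method from this answer: it empties the index for every commit, restores only the transport directory with git reset, and then drops the main-project subdirectories inside it. The throwaway repository, the file names and the `base` variable are made up for illustration.

```shell
#!/bin/sh
# Sketch: keep transport/ (with its own files and protocol/) at its full path,
# drop everything else, in one filter-branch pass. Demo data is hypothetical.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the filter-branch advisory in newer Git

demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email demo@example.com
git config user.name "Demo"

base=src/de/fencing_game
mkdir -p $base/transport/protocol $base/transport/client $base/client
echo a > $base/transport/Transport.java
echo b > $base/transport/protocol/Message.java
echo c > $base/transport/client/Glue.java
echo d > $base/client/Main.java
echo e > build.xml
git add .
git commit -q -m "initial import"
echo f >> build.xml
git commit -q -am "build change only"
echo g >> $base/transport/protocol/Message.java
git commit -q -am "protocol change"

# For each commit: clear the index, restore only transport/ from the original
# commit, then remove the subdirectories that belong to the main project.
git filter-branch --prune-empty --index-filter "
    git rm -r -q --cached --ignore-unmatch -- . &&
    git reset -q \$GIT_COMMIT -- $base/transport &&
    git rm -r -q --cached --ignore-unmatch -- \
        $base/transport/fencing $base/transport/client $base/transport/server
" HEAD

git ls-files          # what is left of the tree
git log --format=%s   # commits that still change something
```

With --prune-empty, commits that only touched the main project disappear from the rewritten branch. As with any history rewrite, this is best done in a fresh clone that can be thrown away.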

羁拥 2024-11-23 07:23:13

The issue here is that there is no clean separation between what is and isn't part of your library. I would strongly suggest refactoring the solution first; then you can simply include the library as a submodule.

If the reuse of this library will be just in the same repo by other devs, just track those changes on a separate branch and don't bother with additional repos.

百合的盛世恋 2024-11-23 07:23:13

Will the history of the project be for your benefit only, or for the benefit of people on github?

If the history is for your benefit only, there is a simple way using grafts. Basically, just create a brand new repository for github, removing all proprietary code. Now you have an open source repo with only public code which you can push to github. In your local copy of the open source repo, you can graft the history from the proprietary repo onto the open source repo.

Doing it this way means you (or anyone with access to the proprietary repo) have the benefit of seeing the full history, but the general public will only see the code from the point you open sourced it.

What are .git/info/grafts for?
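
A minimal sketch of the grafting idea, with one substitution that should be stated plainly: instead of editing .git/info/grafts by hand, it uses git replace --graft, which records the same kind of local parent override as a replacement ref. The branch names `private` and `public` and the single demo repository are stand-ins; in practice the private history would be fetched into a local clone of the public repository.

```shell
#!/bin/sh
# Sketch: make a history-less public root commit appear to continue the
# private history, locally only. Branch names and files are hypothetical.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
git config user.email demo@example.com
git config user.name "Demo"

# Stand-in for the proprietary history.
echo v1 > Lib.java
git add Lib.java
git commit -q -m "private: first version"
echo v2 > Lib.java
git commit -q -am "private: second version"
git branch -m private

# Stand-in for the public repository: a fresh root commit without history.
git checkout -q --orphan public
git commit -q -m "Initial open-source release"

before=$(git rev-list --count public)
# Locally pretend the public root commit has the private tip as its parent.
git replace --graft public private
after=$(git rev-list --count public)
echo "visible commits: $before -> $after"   # prints "visible commits: 1 -> 3"
```

The replacement lives under refs/replace/, which an ordinary git push does not send, so the public repository keeps showing only the single root commit.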

自由如风 2024-11-23 07:23:13

I've done something similar, but splitting several directories of stuff into an entirely separate repo on an encrypted partition (/secure/tmp/newrepo), so that they were not available to a laptop thief. I wrote the shell script below and then ran:

    git filter-branch --tree-filter '~/bin/tryit /secure/tmp/newrepo personal private' -- 95768021ff00216855868d12556137115b2789610..HEAD

(the SHA range skips commits made before either directory came into existence)


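    #!/bin/sh
    # To be used with e.g.:
    #   git filter-branch --tree-filter '~/bin/tryit /secure/tmp/newrepo personal private'
    # Don't run it on any repository you can't repeatedly recreate with:
    #   rm -rf foo ; git clone /wherever/is/foo
    # when it breaks.
    SRC=$(pwd)      # work tree that filter-branch checked out for this commit
    DEST=$1         # destination repository; remaining arguments are directories to move
    shift
    MSG=/dev/shm/msg.txt
    TAR=/dev/shm/tmp.tar
    LIST=/dev/shm/list.txt
    LOG=/dev/shm/log
    DONE=''

    echo "$GIT_AUTHOR_DATE" >> "$LOG"
    # Save the "git show" output for the commit being rewritten; it is reused as the commit message below.
    git show --raw "$GIT_COMMIT" > "$MSG"

    for A in "$@"
    do
        if [ -d "$A" ]
        then
            DONE=${DONE}x
            # Copy the directory into DEST via tar, then delete it from the source tree.
            tar -cf "$TAR" "$A"
            tar -tf "$TAR" > "$LIST"
            cat "$LIST" >> "$LOG"
            rm -rf "$A"
            cd "$DEST" || exit 1
            tar -xf "$TAR"
        else
            echo "$A non-existent" >> "$LOG"
        fi
        cd "$SRC" || exit 1
    done

    if [ -z "$DONE" ]
    then
        echo Empty >> "$LOG"
    else
        cd "$DEST" || exit 1
        # filter-branch points these at the source repository; clear them so git operates on DEST.
        unset GIT_INDEX_FILE
        unset GIT_DIR
        unset GIT_COMMIT
        unset GIT_WORK_TREE
        touch foo
        git add .
        git commit -a -F "$MSG" >> "$LOG"
    fi
    # Exit 0 explicitly: filter-branch aborts if the filter returns a non-zero status.
    exit 0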

For your purposes you'd probably want to have a different spec for the tar (e.g. --exclude= ) and then use cat ${LIST} | xargs rm to only remove stuff in the tar, but getting that right is not too tricky, I hope.

The unset stuff and exit 0 are important, since filter-branch sets those to your source repo (not what you want!) and will die if sh passes on a non-zero exit code from the last command in your script.
