How do I extract a git subdirectory and make a submodule out of it?
I started a project some months ago and stored everything within a main directory.
In my main directory "Project" there are several subdirectories containing different things:
Project/paper contains a document written in LaTeX
Project/sourcecode/RailsApp contains my rails app.
"Project" is GITified and there have been a lot of commits in both "paper" and "RailsApp" directory. Now, as I'd like to use cruisecontrol.rb for my "RailsApp" I wonder if there is a way to make a submodule out of "RailsApp" without losing the history.
Nowadays there's a much easier way to do it than manually using git filter-branch: git subtree.
Installation
For older versions of git, git-subtree has to be installed from source, optionally together with the man pages.
NOTE: git-subtree is part of git as of 1.7.11 (if you install contrib), so you might already have it installed. You can check by running git subtree.
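One way to script that check is to look for the git-subtree helper, either in git's exec path or on PATH. This is only a minimal sketch; the variable name have_subtree is made up for illustration.

```shell
#!/bin/sh
# Look for the git-subtree helper where git keeps its subcommands,
# then fall back to PATH (where a from-source install may have put it).
if [ -x "$(git --exec-path)/git-subtree" ] || command -v git-subtree >/dev/null 2>&1; then
    have_subtree=yes
else
    have_subtree=no
fi
echo "git subtree available: $have_subtree"
```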
Usage
Split a larger repository into smaller chunks with git subtree split.
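As a rough illustration of the split, the sketch below builds a throwaway repository laid out like the one in the question and splits sourcecode/RailsApp into its own repository. The temporary paths, the placeholder identity, the file contents, and the branch name railsapp-only are all assumptions made for the demo, and it assumes git subtree is installed.

```shell
#!/bin/sh
set -eu

# Throwaway repository that mimics the layout from the question.
work=$(mktemp -d)
git init -q "$work/Project"
cd "$work/Project"
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

mkdir -p paper sourcecode/RailsApp
echo "draft" > paper/paper.tex
git add . && git commit -q -m "Add paper draft"
echo "puts 'hello'" > sourcecode/RailsApp/app.rb
git add . && git commit -q -m "Add Rails app skeleton"
echo "puts 'world'" >> sourcecode/RailsApp/app.rb
git commit -q -am "Extend Rails app"

# Create a branch that contains only the history of sourcecode/RailsApp,
# with that directory's contents moved to the top level.
git subtree split --prefix=sourcecode/RailsApp --branch=railsapp-only >/dev/null 2>&1

# Pull that branch into a fresh repository of its own.
git init -q "$work/RailsApp"
cd "$work/RailsApp"
git pull -q "$work/Project" railsapp-only

split_commits=$(git rev-list --count HEAD)
split_files=$(git ls-files)
echo "commits: $split_commits, files: $split_files"
```

The commit that only touched paper/ does not appear in the new repository. Once the new repository has been published somewhere, the directory in the original project can be removed and added back with git submodule add.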
For detailed documentation (man page), please read git-subtree.txt.
Check out git filter-branch. The Examples section of the man page shows how to extract a subdirectory into its own project while keeping all of its history and discarding the history of other files and directories (just what you're looking for).
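The man page example relies on the --subdirectory-filter option. Below is a small sketch of it on a throwaway repository shaped like the one in the question; the temporary paths, placeholder identity, and file contents are assumptions made for the demo.

```shell
#!/bin/sh
set -eu

# Throwaway repository that mimics the layout from the question.
work=$(mktemp -d)
git init -q "$work/Project"
cd "$work/Project"
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"

mkdir -p paper sourcecode/RailsApp
echo "draft" > paper/paper.tex
git add . && git commit -q -m "Add paper draft"
echo "puts 'hello'" > sourcecode/RailsApp/app.rb
git add . && git commit -q -m "Add Rails app skeleton"
echo "puts 'world'" >> sourcecode/RailsApp/app.rb
git commit -q -am "Extend Rails app"

# Work on a copy, because filter-branch rewrites history in place.
cp -R "$work/Project" "$work/RailsApp"
cd "$work/RailsApp"

# Recent git versions print a warning and pause before filter-branch runs;
# this variable suppresses that.
export FILTER_BRANCH_SQUELCH_WARNING=1

# Keep only the history of the subdirectory and make it the new root.
git filter-branch --subdirectory-filter sourcecode/RailsApp -- --all >/dev/null

filtered_commits=$(git rev-list --count HEAD)
filtered_files=$(git ls-files)
echo "commits: $filtered_commits, files: $filtered_files"
```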
One way of doing this is the inverse: remove everything but the files you want to keep.
Basically, make a copy of the repository, then use git filter-branch to remove everything but the files and folders you want to keep. For example, I have a project from which I wish to extract the file tvnamer.py to a new repository.
This uses git filter-branch --tree-filter to go through each commit, run a command, and recommit the resulting directory contents. This is extremely destructive (so you should only do this on a copy of your repository!) and can take a while (about 1 minute on a repository with 300 commits and about 20 files). The command simply runs a small shell script on each revision, which you'd have to modify, of course, so that it spares your subdirectory instead of tvnamer.py.
The biggest obvious problem is that it leaves every commit and its message in place, even those unrelated to the remaining file. The script git-remove-empty-commits fixes this.
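The per-revision script described above might look roughly like the function below. It is shown on a plain scratch directory rather than a repository, and the function name keep_only and the sample files are made up for illustration.

```shell
#!/bin/sh
set -eu

# Delete every top-level entry in the current directory except "$1".
# Note that the * glob does not match dotfiles.
keep_only() {
    for f in *; do
        if [ "$f" != "$1" ]; then
            rm -rf -- "$f"
        fi
    done
}

# Try it on a scratch directory (not a repository).
scratch=$(mktemp -d)
cd "$scratch"
mkdir docs
touch tvnamer.py README docs/notes.txt
keep_only tvnamer.py
remaining=$(ls)
echo "left over: $remaining"
```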
You need to pass the -f (force) argument to run filter-branch again while anything remains in refs/original/ (which is basically a backup).
Of course this will never be perfect, for example if your commit messages mention other files, but it's about as close as git currently allows (as far as I'm aware, anyway).
Again, only ever run this on a copy of your repository! In summary, to remove all files but "thisismyfilename.txt": copy the repository, run the tree filter, then drop the commits that were left empty.
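Put together, the whole procedure could look something like the sketch below. It runs on a throwaway repository, and it uses filter-branch's own --prune-empty option for the second pass rather than the git-remove-empty-commits script mentioned above. The paths, placeholder identity, and file contents are assumptions made for the demo.

```shell
#!/bin/sh
set -eu

# Throwaway repository with one file to keep and one to discard.
work=$(mktemp -d)
git init -q "$work/original"
cd "$work/original"
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"
echo "v1" > thisismyfilename.txt
echo "readme" > README
git add . && git commit -q -m "Add file and README"
echo "more" >> README
git commit -q -am "Update README only"
echo "v2" >> thisismyfilename.txt
git commit -q -am "Update the file"

# Only ever rewrite a copy.
cp -R "$work/original" "$work/copy"
cd "$work/copy"
export FILTER_BRANCH_SQUELCH_WARNING=1     # skip the warning pause in recent git

# Pass 1: in every commit, delete everything except the file to keep.
git filter-branch --tree-filter \
    'for f in *; do [ "$f" = "thisismyfilename.txt" ] || rm -rf -- "$f"; done' \
    HEAD >/dev/null
after_tree_filter=$(git rev-list --count HEAD)

# Pass 2: drop commits that no longer change anything. -f is required
# because pass 1 left a backup in refs/original/.
git filter-branch -f --prune-empty HEAD >/dev/null
after_prune=$(git rev-list --count HEAD)

remaining_files=$(git ls-files)
echo "commits: $after_tree_filter -> $after_prune, files: $remaining_files"
```

The README-only commit survives the first pass as an empty commit and disappears in the second. The two passes are kept separate here to mirror the steps above; --prune-empty can also be given together with --tree-filter in a single run.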
Both CoolAJ86's and apenwarr's answers are very similar. I went back and forth between the two, trying to understand the bits that were missing from either one. Below is a combination of them.
First, navigate Git Bash to the root of the git repo to be split. In my example here, that is ~/Documents/OriginalRepo (master).
Below is a copy of the above with the customizable names replaced and using https instead. The root folder is now ~/Documents/_Shawn/UnityProjects/SoProject (master).
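A combined flow of this kind (split the folder out, publish it, then replace the folder with a submodule) can be sketched as follows. This is not the answerer's original listing: the repository layout is borrowed from the question, a local bare repository stands in for the hosted remote, and the paths, placeholder identity, and the branch name to-be-new-repo are assumptions. The protocol.file.allow override is only there because recent git versions refuse to clone submodules from local paths by default; with a real https or ssh URL it would not be needed.

```shell
#!/bin/sh
set -eu

# Throwaway repository that mimics the layout from the question.
work=$(mktemp -d)
git init -q "$work/Project"
cd "$work/Project"
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"
mkdir -p paper sourcecode/RailsApp
echo "draft" > paper/paper.tex
git add . && git commit -q -m "Add paper draft"
echo "puts 'hello'" > sourcecode/RailsApp/app.rb
git add . && git commit -q -m "Add Rails app skeleton"
echo "puts 'world'" >> sourcecode/RailsApp/app.rb
git commit -q -am "Extend Rails app"

# Move the folder's history onto a branch of its own.
git subtree split --prefix=sourcecode/RailsApp --branch=to-be-new-repo >/dev/null 2>&1

# A local bare repository stands in for the hosted remote.
git init -q --bare "$work/RailsApp.git"
default_branch=$(git --git-dir="$work/RailsApp.git" symbolic-ref --short HEAD)
git push -q "$work/RailsApp.git" "to-be-new-repo:refs/heads/$default_branch"

# Replace the folder with a submodule that points at the new repository.
git rm -r -q sourcecode/RailsApp
git commit -q -m "Remove RailsApp before adding it back as a submodule"
git -c protocol.file.allow=always submodule --quiet add "$work/RailsApp.git" sourcecode/RailsApp
git commit -q -m "Add RailsApp as a submodule"
git branch -q -D to-be-new-repo            # the temporary branch is no longer needed

entry_mode=$(git ls-files -s sourcecode/RailsApp | cut -c1-6)
submodule_commits=$(git -C sourcecode/RailsApp rev-list --count HEAD)
echo "index mode: $entry_mode, submodule commits: $submodule_commits"
```

Mode 160000 in the index marks sourcecode/RailsApp as a submodule link rather than an ordinary directory.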
If you want to transfer some subset of files to a new repository but keep the history, you're basically going to end up with a completely new history. The way this would work is basically as follows:
It should be somewhat straightforward to automate this if you don't mind writing a small but hairy script. Straightforward, yes, but also painful. People have rewritten history in Git before, so you can search for how they did it.
Alternatively: clone the repository, and delete the paper in the clone, delete the app in the original. This would take one minute, it's guaranteed to work, and you can get back to more important things than trying to purify your git history. And don't worry about the hard drive space taken up by redundant copies of history.
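The clone-and-delete alternative is short enough to sketch in full. The throwaway repository, paths, and placeholder identity below are assumptions made for the demo.

```shell
#!/bin/sh
set -eu

# Throwaway repository that mimics the layout from the question.
work=$(mktemp -d)
git init -q "$work/Project"
cd "$work/Project"
git config user.name "Example User"        # placeholder identity for the demo
git config user.email "user@example.com"
mkdir -p paper sourcecode/RailsApp
echo "draft" > paper/paper.tex
git add . && git commit -q -m "Add paper draft"
echo "puts 'hello'" > sourcecode/RailsApp/app.rb
git add . && git commit -q -m "Add Rails app skeleton"

# The clone carries the full history of both directories.
git clone -q "$work/Project" "$work/RailsApp"
cd "$work/RailsApp"
git config user.name "Example User"
git config user.email "user@example.com"

# Delete the paper in the clone ...
git rm -r -q paper
git commit -q -m "Remove paper from the app repository"

# ... and delete the app in the original.
cd "$work/Project"
git rm -r -q sourcecode/RailsApp
git commit -q -m "Remove RailsApp from the paper repository"

app_repo_files=$(git -C "$work/RailsApp" ls-files)
app_repo_commits=$(git -C "$work/RailsApp" rev-list --count HEAD)
paper_repo_files=$(git ls-files)
echo "app repo: $app_repo_files ($app_repo_commits commits), paper repo: $paper_repo_files"
```

Unlike the filtering approaches, the app keeps its sourcecode/RailsApp path inside the clone, and both repositories still contain the old commits for the directory they no longer use.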