How do I full-text index a Mercurial repository?
What to do when `hg log -k` is not sufficient, and `hg grep` is just way too slow (roughly 100k changesets)? We have had very bad experiences with Fisheye (way too slow), and Kiln seems to tie us into the FogCreek empire just a little bit too much.

What other options are there to provide full-text search capabilities over a repository?
What are you looking for in a full-text search? If you want to know the revision in which text was added, that's easier; if you want to know all revisions in which the text exists, that's a bigger job.

Generally, `hg grep` is as fast as you're going to get without pre-building an index, or at least pre-building versioned files you can use traditional grep on.

If you're willing to pre-build a greppable file structure, you could export each changeset out to a text file suitable for grepping with normal command-line grep or for indexing with Lucene or similar. You could easily keep that current with a hook such as `incoming` or `commit`.

Having only the changeset diffs lets you look for revisions where text was added or removed, but not the list of all revisions where that text existed. For that you could pre-create a copy of every file at every revision, but that takes a lot of space even if it's easy to automate.
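The changeset export described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not necessarily the command the answer had in mind: the function name `export_changesets` and the output directory are made up for the example, and it is meant to be run from inside the repository.

```shell
# Sketch: write every changeset to its own text file, named after its
# local revision number. export_changesets is an illustrative helper,
# not something Mercurial provides.
export_changesets() {
    out_dir=$1
    mkdir -p "$out_dir" || return 1
    # Print one local revision number per line, then export each in turn.
    hg log --template '{rev}\n' | while read -r rev; do
        # Skip changesets already exported on an earlier run, so the
        # function can be re-run cheaply (for example from a hook).
        [ -f "$out_dir/$rev.diff" ] && continue
        hg export --rev "$rev" > "$out_dir/$rev.diff" </dev/null
    done
}
```

After something like `export_changesets /tmp/hg-export` (path hypothetical), `grep -rl CHEESE /tmp/hg-export` would list the exported changesets whose description or diff mentions the string, and the same directory could be fed to an indexer. Because already exported revisions are skipped, re-running the function after each pull or commit only handles the new changesets.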
Another option, if you're looking for a specific revision where something happened, is to make sure you're conversant with `hg bisect`. It automates a binary search for you, so you could use it to find the first revision that has the string CHEESE, though that updates your working dir, which `hg grep` doesn't.
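A bisect run of that kind might look like the sketch below. It is an illustration rather than the answer's exact commands: the helper name `first_rev_with_string` and the file argument are assumptions, and it presumes revision 0 lacks the string, tip has it, and the string is never removed again once added.

```shell
# Sketch: bisect for the first revision whose copy of FILE contains PATTERN.
# first_rev_with_string is an illustrative helper, not a Mercurial command.
# PATTERN and FILE must not contain single quotes (they are pasted into a
# shell command string below).
first_rev_with_string() {
    pattern=$1
    file=$2
    hg bisect --reset || return 1
    hg bisect --good 0 || return 1     # string not present yet
    hg bisect --bad tip || return 1    # string present
    # bisect treats exit status 0 as "good", so grep is inverted:
    # a match must mark the revision as bad.
    hg bisect --command "! grep -q -- '$pattern' '$file'"
}
```

Called as `first_rev_with_string CHEESE menu.txt` (file name hypothetical), the run should end with bisect reporting the first bad revision, which here means the first one where the string appears. The working directory is left at whichever revision was tested last, so an `hg update` afterward is probably wanted.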
Have you looked at RhodeCode? -- http://demo.rhodecode.org/