Text search tool for large source code sets, with up-to-date pre-indexing?
I'm maintaining a few branches of a mid-size C++ project (~15k files per branch). Very often I have to search all project files for a given string or regex. Currently I'm using Total Commander, which has all the features I want (case-sensitive search, regexes, filename masks), but it scans all files every time, so it takes too long.
Do you know of any text search tool that can pre-index a whole source tree and allow quick pattern lookups? Returning all matching files is a must; a preview of the text surrounding each match would be nice. Of course, the index must be updated instantly when something changes.
Visual Studio search is not enough; it only scans source files (not metadata or custom resources).
Does such a tool exist? I'm using Windows XP.
EDIT: I've found a very usable tool; see my own answer.
I found a very usable tool here: http://code.google.com/p/ndexer/
I recommend it to everyone!
cscope can index C files and, to some extent, C++ as well.
I personally use the KDE front end KScope, which is more user-friendly than cscope's UI.
Apart from that, you might want to have a look at OpenGrok.
I don't know for sure (I have no experience), but I would give Eclipse CDT a try. It indexes all your sources for fast lookup of symbols, similar to Eclipse JDT (Java tools).
One straightforward way to sidestep the problem is to put all the source code on a RAM disk. By speeding up the file IO in this way, you'll see a big jump in performance without otherwise changing your tool chain.
You could try Google Desktop Search, or how about Lucene or CLucene (Lucene ported to C++) as a general-purpose indexing tool?
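To illustrate what a pre-built index buys you, here is a minimal sketch of a trigram inverted index in Python. It is not how Lucene, CLucene, or any other tool mentioned in this thread works internally; the class name `TrigramIndex` and its methods are made up for this example. It assumes files can be read as text, handles only case-sensitive literal strings (no regexes), and detects changes by comparing modification times on each `refresh` rather than updating instantly.

```python
import os
from collections import defaultdict


class TrigramIndex:
    """Toy inverted index: maps each 3-character substring to the files containing it."""

    def __init__(self):
        self.postings = defaultdict(set)  # trigram -> set of file paths
        self.mtimes = {}                  # path -> mtime recorded when it was indexed

    @staticmethod
    def _trigrams(text):
        return {text[i:i + 3] for i in range(len(text) - 2)}

    @staticmethod
    def _read(path):
        # Undecodable bytes are replaced, so binary-ish files do not abort indexing.
        with open(path, "r", encoding="utf-8", errors="replace") as fh:
            return fh.read()

    def _drop(self, path):
        # Linear in the number of trigrams; acceptable for a sketch only.
        for paths in self.postings.values():
            paths.discard(path)
        self.mtimes.pop(path, None)

    def refresh(self, root):
        """Index new or modified files under root and forget deleted ones."""
        seen = set()
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                seen.add(path)
                mtime = os.stat(path).st_mtime_ns
                if self.mtimes.get(path) == mtime:
                    continue  # unchanged since the last refresh
                self._drop(path)
                for gram in self._trigrams(self._read(path)):
                    self.postings[gram].add(path)
                self.mtimes[path] = mtime
        for path in set(self.mtimes) - seen:
            self._drop(path)

    def search(self, literal):
        """Return (path, line_number, line) hits for a case-sensitive literal."""
        grams = self._trigrams(literal)
        if grams:
            # Only files containing every trigram of the query can match.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        else:
            candidates = set(self.mtimes)  # query too short to filter: scan all
        hits = []
        for path in sorted(candidates):
            for number, line in enumerate(self._read(path).splitlines(), start=1):
                if literal in line:
                    hits.append((path, number, line.strip()))
        return hits
```

Typical use would be `index = TrigramIndex()`, then `index.refresh(root)` before a search and `index.search("some_symbol")` to list the matching files with line numbers and the matching line. The index only narrows the set of files to open; each candidate is still scanned to confirm the match and produce the preview.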
Please see
http://aaron.oirt.rutgers.edu/myapp/docs/W1300.testAndDemo
and follow the "search demo" link to the code tree search demo.
This will do exactly what you are asking
for and it is a standard demo component of WHIFF. You will have
to add Pygments plug-ins to pull out strings of interest from
your binary files (or just read the whole file as 'text') -- out-of-the-box
the indexer will ignore files Pygments doesn't recognize.
There is an easy hack to make it "eat everything" -- let me know if you
want more info.
The installation and command line interfaces for searching are described
at
http://aaron.oirt.rutgers.edu/myapp/docs/W1300_1000.search.
Windows Indexing Service appears to match all your criteria.
I know you don't necessarily want a web app, but try OpenGrok.