GREP - Find all occurrences of a string

Posted on 2024-08-12 01:35:34

I am tasked with white labeling an application so that it contains no references to our company, website, etc. The problem I am running into is that I have many different patterns to look for, and I would like to guarantee that every one of them is removed. Since the application was not developed entirely in-house, we cannot simply look for occurrences in messages.properties and be done. We must go through the JSPs, the Java code, and the XML.

I am using grep to filter results like this:

grep -ir 'SOME_PATTERN' . | grep -v 'import' | grep -v '//' | grep -v '/\*' ...

The patterns are escaped when I'm using them on the command line; however, I don't feel this pattern matching is very robust. A line that genuinely references the company could also contain the word import (unlikely) or even /* (the beginning of a Javadoc comment), and these filters would hide it.
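One way to sidestep the escaping and the `-v` filters is to keep the patterns in a file and match them as fixed strings. The following is only a sketch under assumptions: `find_refs` and `patterns.txt` are hypothetical names, the file extensions are guesses at what the project contains, and `--include` is available in GNU and BSD grep but is not part of POSIX.

```shell
#!/bin/sh
# Hypothetical helper: print every line that contains any of the
# patterns listed (one literal string per line) in a patterns file.
#   $1  patterns file, e.g. patterns.txt (assumed name)
#   $2  directory to search
find_refs() {
    patterns_file=$1
    root=$2
    # -r  recurse into subdirectories
    # -i  ignore case
    # -n  print line numbers
    # -F  treat patterns as fixed strings, so nothing needs escaping
    # -f  read the patterns from a file
    grep -rinF -f "$patterns_file" \
        --include='*.java' --include='*.jsp' \
        --include='*.xml' --include='*.properties' \
        "$root"
}
```

Calling `find_refs patterns.txt .` lists each hit as `file:line:text`. Nothing is filtered out with `-v`, so hits inside comments and import lines are shown too and can be reviewed by hand. Because grep works line by line, a pattern that sits on the continuation line of a declaration is still reported on the line where it appears.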

All of the text output to the screen must come from a string declaration somewhere or a constants file. So, I can assume I will find something like:

public static final String SOME_CONSTANT = "SOME_PATTERN is currently unavailable";

I would like to find that occurrence as well as:

public static final String SOME_CONSTANT = "
SOME_PATTERN blah blah blah";

Alternatively, if we had an internal crawler or automated tests, I could simply pull back the XHTML from each page and check the source to ensure it is clean.
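The page-level check could look roughly like the sketch below. It assumes the rendered XHTML has already been saved to a file by some crawler or test run; `check_page` and the patterns file are hypothetical names, not part of the application.

```shell
#!/bin/sh
# Hypothetical helper: fail if a saved page still contains any pattern.
#   $1  patterns file (one literal string per line)
#   $2  saved XHTML file to inspect
check_page() {
    # grep prints the offending lines with numbers and returns 0 on a hit.
    if grep -inF -f "$1" "$2"; then
        echo "FAIL: $2 still contains a reference" >&2
        return 1
    fi
    echo "OK: $2"
}
```

A page fetched with something like `curl -s "$url" > page.xhtml` (where `$url` is a placeholder for a real page address) can then be verified with `check_page patterns.txt page.xhtml`. The non-zero return status on a hit makes the helper usable as a step in an automated test.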
