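Find the number of occurrences of a substring in a string
Why is the following algorithm not halting for me?
In the code below, str is the string I am searching in, and findStr is the string whose occurrences I'm trying to count.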
String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;
while (lastIndex != -1) {
    lastIndex = str.indexOf(findStr, lastIndex);
    if (lastIndex != -1)
        count++;
    lastIndex += findStr.length();
}
System.out.println(count);
Your lastIndex += findStr.length(); was placed outside the brackets, causing an infinite loop: when no occurrence was found, lastIndex became -1 + findStr.length() instead of staying at -1, so the loop condition never failed. Here is the fixed version:
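The answer's own code did not survive on this page, so the following is only a minimal sketch of the fix it describes: the increment moves inside the if block. The class name FixedLoopCount and the empty-pattern guard are assumptions added for illustration.

```java
final class FixedLoopCount {
    // Counts non-overlapping occurrences of findStr in str.
    static int count(String str, String findStr) {
        if (findStr.isEmpty()) {
            return 0; // assumption: an empty pattern counts as zero matches
        }
        int lastIndex = 0;
        int count = 0;
        while (lastIndex != -1) {
            lastIndex = str.indexOf(findStr, lastIndex);
            if (lastIndex != -1) {
                count++;
                // Advance only after a hit, so a -1 result survives to end the loop.
                lastIndex += findStr.length();
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("helloslkhellodjladfjhello", "hello")); // prints 3
    }
}
```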
A shorter version. ;)
The last line was creating a problem. lastIndex would never be -1 when the loop condition was checked, so there would be an infinite loop. This can be fixed by moving the last line of code into the if block.
Do you really have to handle the matching yourself? Especially if all you need is the number of occurrences, regular expressions are tidier:
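The regular-expression code is missing here. A sketch of what such a count could look like with java.util.regex follows; the class name RegexCount is hypothetical, and Pattern.quote is an added assumption so that findStr is matched as literal text rather than as a regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexCount {
    // Counts non-overlapping matches of findStr in str.
    static int count(String str, String findStr) {
        // Pattern.quote escapes regex metacharacters in findStr.
        Matcher matcher = Pattern.compile(Pattern.quote(findStr)).matcher(str);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("helloslkhellodjladfjhello", "hello")); // prints 3
    }
}
```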
I'm very surprised no one has mentioned this one-liner. It's simple, concise and performs slightly better than str.split(target, -1).length-1
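The one-liner itself is not preserved on this page. One expression that fits the description, offered here as an assumption rather than as the author's exact code, removes every occurrence and divides the lost length by the length of the target. The class name LengthDiffCount and the empty-target guard are illustrative.

```java
final class LengthDiffCount {
    // Counts non-overlapping occurrences by measuring how much shorter
    // the string gets once every occurrence of target is removed.
    static int count(String str, String target) {
        if (target.isEmpty()) {
            return 0; // assumption: avoids division by zero for an empty target
        }
        // String.replace(CharSequence, CharSequence) treats target literally.
        return (str.length() - str.replace(target, "").length()) / target.length();
    }

    public static void main(String[] args) {
        System.out.println(count("helloslkhellodjladfjhello", "hello")); // prints 3
    }
}
```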
Here it is, wrapped up in a nice and reusable method:
At the end of the loop, count is 3. Hope it helps.
A lot of the given answers fail on one or more of:
Here's what I wrote:
Example call:
If you want a non-regular-expression search, just compile your pattern appropriately with the LITERAL flag:
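The author's method is not shown on this page. As a rough illustration of the LITERAL flag only, the sketch below compiles the search string with Pattern.LITERAL so that characters such as "." lose their regex meaning. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LiteralFlagCount {
    // Counts matches of findStr in str; flags is passed straight to Pattern.compile.
    static int count(String str, String findStr, int flags) {
        Matcher matcher = Pattern.compile(findStr, flags).matcher(str);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "a.b axb a.b";
        // As a regex, "a.b" also matches "axb".
        System.out.println(count(text, "a.b", 0));               // prints 3
        // With Pattern.LITERAL the dot only matches a dot.
        System.out.println(count(text, "a.b", Pattern.LITERAL)); // prints 2
    }
}
```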
You can count the number of occurrences using a built-in library function:
Increment lastIndex whenever you look for the next occurrence. Otherwise it's always finding the first substring (at position 0).
Matcher.results()
You can find the number of occurrences of a substring in a string using the Java 9 method Matcher.results() with a single line of code.
It produces a Stream of MatchResult objects which correspond to the matched substrings, and the only thing needed is to apply Stream.count() to obtain the number of elements in the stream.
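The original main() and its output are not preserved here. A minimal sketch of the approach, assuming Java 9 or later, could look like the following; the class name ResultsCount and the use of Pattern.quote are additions for illustration.

```java
import java.util.regex.Pattern;

final class ResultsCount {
    // Matcher.results() (Java 9+) streams one MatchResult per match;
    // Stream.count() then returns how many there were, as a long.
    static long count(String str, String findStr) {
        return Pattern.compile(Pattern.quote(findStr))
                .matcher(str)
                .results()
                .count();
    }

    public static void main(String[] args) {
        System.out.println(count("helloslkhellodjladfjhello", "hello")); // prints 3
    }
}
```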
The answer given as correct is no good for counting things like line returns and is far too verbose. Later answers are better, but all can be achieved simply with
It does not drop trailing matches using the example in the question.
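The expression this answer refers to is missing from the page. Since another answer quotes str.split(target, -1).length-1, a sketch along those lines is given below as an assumption. The limit of -1 is what keeps trailing empty pieces, so a match at the very end of the string is still counted. The class name SplitCount and the Pattern.quote call are illustrative.

```java
import java.util.regex.Pattern;

final class SplitCount {
    // Splitting on the target yields one more piece than there are matches.
    static int count(String str, String findStr) {
        if (findStr.isEmpty()) {
            return 0; // assumption: an empty pattern counts as zero matches
        }
        // split takes a regex, so the target is quoted; -1 keeps trailing empty strings.
        return str.split(Pattern.quote(findStr), -1).length - 1;
    }

    public static void main(String[] args) {
        String str = "helloslkhellodjladfjhello";
        System.out.println(count(str, "hello"));             // prints 3
        // Without the -1 limit the trailing empty piece is dropped.
        System.out.println(str.split("hello").length - 1);   // prints 2
        System.out.println(count("a\nb\n", "\n"));           // prints 2
    }
}
```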
Returns the index within this string of the first occurrence of the specified character, starting the search at the specified index.
So your lastIndex value is always 0 and it always finds hello in the string.
Try adding lastIndex+=findStr.length() to the end of your loop, otherwise you will end up in an endless loop because once you have found the substring, you keep trying to find it again and again from the same position.
Try this one. It replaces all the matches with a -.
And if you don't want to destroy your str, you can create a new string with the same content:
After executing this block, these will be your values:
As @Mr_and_Mrs_D suggested:
Based on the existing answer(s) I'd like to add a "shorter" version without the if:
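The code for this answer is missing. One way to drop the if, sketched here under that assumption, is to fold the indexOf call into the loop condition so the body only runs after a hit. The class name NoIfCount and the empty-pattern guard are hypothetical.

```java
final class NoIfCount {
    // Counts non-overlapping occurrences; the assignment inside the
    // condition replaces the separate "if (lastIndex != -1)" check.
    static int count(String str, String findStr) {
        if (findStr.isEmpty()) {
            return 0; // assumption: an empty pattern counts as zero matches
        }
        int count = 0;
        int lastIndex = 0;
        while ((lastIndex = str.indexOf(findStr, lastIndex)) != -1) {
            count++;
            lastIndex += findStr.length(); // skip past the match just found
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("helloslkhellodjladfjhello", "hello")); // prints 3
    }
}
```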
Here is the advanced version for counting how many times the token occurred in a user-entered string:
The method below shows how many times a substring repeats in the whole string. Hope it is useful to you:
This solution prints the total number of occurrences of a given substring throughout the string, including cases where overlapping matches exist.
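The solution's code is not on this page. A sketch of how overlapping matches can be counted, assuming the usual indexOf loop, is to restart the search one character after each hit instead of skipping the whole match. The class name OverlapCount is hypothetical.

```java
final class OverlapCount {
    // Counts occurrences of findStr in str, overlapping ones included.
    static int count(String str, String findStr) {
        if (findStr.isEmpty()) {
            return 0; // assumption: an empty pattern counts as zero matches
        }
        int count = 0;
        int index = 0;
        while ((index = str.indexOf(findStr, index)) != -1) {
            count++;
            index++; // move one character only, so the next match may overlap this one
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("abababa", "aba")); // prints 3
        System.out.println(count("helloslkhellodjladfjhello", "hello")); // prints 3
    }
}
```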
Here is another solution that uses neither regexp/patterns/matchers nor StringUtils.
If you need the index of each substring within the original string, you can do something with indexOf like this:
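Only the closing braces of the original code remain, so the following is a sketch of the idea rather than the author's code: each start index returned by indexOf is collected in a list, and the size of the list is the count. The class name OccurrenceIndexes is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class OccurrenceIndexes {
    // Returns the start index of every non-overlapping occurrence of findStr in str.
    static List<Integer> find(String str, String findStr) {
        List<Integer> indexes = new ArrayList<>();
        if (findStr.isEmpty()) {
            return indexes; // assumption: an empty pattern yields no matches
        }
        int index = 0;
        while ((index = str.indexOf(findStr, index)) != -1) {
            indexes.add(index);
            index += findStr.length();
        }
        return indexes;
    }

    public static void main(String[] args) {
        List<Integer> indexes = find("helloslkhellodjladfjhello", "hello");
        System.out.println(indexes);        // prints [0, 8, 20]
        System.out.println(indexes.size()); // prints 3
    }
}
```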
Just a little more peachy answer
I was asked this question in an interview just now and I went completely blank. (Like always, I told myself that the moment the interview ended I'd get the solution.) Which I did, 5 minutes after the call ended :(
The best solution to this problem can be found in org.springframework.util.StringUtils.countOccurrencesOf(string, substring):
There is a performance comparison based on JMH (full report: https://medium.com/p/d924cf933fc3):
Sorry, almost all of these are unoptimized answers. This is a proper algorithm question.
Given a string of size N, containing characters from some set C. Let's call this set the alphabet.
Given any pattern of size M, containing characters from the same set C.
Then there are 2 approaches:
Please look into
Both cost around Θ(N + M), but KMP performs better in some scenarios.
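The two approaches and the link are missing from this page, but KMP (Knuth-Morris-Pratt) is named. As an illustration of how it reaches the stated linear bound, here is a minimal sketch that builds the failure table in O(M) and scans the text in O(N). The class and method names are hypothetical, and this variant counts overlapping matches.

```java
final class KmpCount {
    // failure[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of pattern[0..i].
    static int[] buildFailure(String pattern) {
        int[] failure = new int[pattern.length()];
        int k = 0;
        for (int i = 1; i < pattern.length(); i++) {
            while (k > 0 && pattern.charAt(i) != pattern.charAt(k)) {
                k = failure[k - 1];
            }
            if (pattern.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            failure[i] = k;
        }
        return failure;
    }

    // Counts occurrences of pattern in text without ever moving backwards in text.
    static int count(String text, String pattern) {
        if (pattern.isEmpty()) {
            return 0; // assumption: an empty pattern counts as zero matches
        }
        int[] failure = buildFailure(pattern);
        int count = 0;
        int k = 0; // number of pattern characters currently matched
        for (int i = 0; i < text.length(); i++) {
            while (k > 0 && text.charAt(i) != pattern.charAt(k)) {
                k = failure[k - 1];
            }
            if (text.charAt(i) == pattern.charAt(k)) {
                k++;
            }
            if (k == pattern.length()) {
                count++;
                k = failure[k - 1]; // fall back so overlapping matches are also found
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(count("helloslkhellodjladfjhello", "hello")); // prints 3
        System.out.println(count("aaaa", "aa"));                         // prints 3
    }
}
```

To count only non-overlapping matches, resetting k to 0 after a full match instead of falling back through the failure table should be enough.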
How about using StringUtils.countMatches from Apache Commons Lang?
That outputs: