Is indexOf case sensitive?
Is the indexOf(String) method case sensitive? If so, is there a case insensitive version of it?
Just to sum it up, 3 solutions:
Now, what I was wondering is: which one is the fastest?
I'm guessing the first one, on average.
I would like to lay claim to the one and only solution posted so far that actually works. :-)
There are three classes of problems that have to be dealt with.
Non-transitive matching rules for lower and upper case. The Turkish I problem has been mentioned frequently in other replies. According to comments in the Android source for String.regionMatches, the Georgian comparison rules require an additional conversion to lower case when comparing for case-insensitive equality.
Cases where upper- and lower-case forms have a different number of letters. Pretty much all of the solutions posted so far fail in these cases. Example: German STRASSE and Straße are equal when case is ignored, but have different lengths.
Binding strengths of accented characters. Locale and context affect whether accents match or not. In French, the uppercase form of 'é' is 'E', although there is a movement toward using uppercase accents. In Canadian French, the uppercase form of 'é' is 'É', without exception. Users in both countries would expect "e" to match "é" when searching. Whether accented and unaccented characters match is locale-specific. Now consider: does "E" equal "É"? Yes, it does. In French locales, anyway.
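The first two problem classes can be reproduced with the standard library alone. The following is a minimal sketch, not taken from the answer; the class and method names are hypothetical, and the naive search shown is the lower-casing approach that other replies suggest.

```java
import java.util.Locale;

class CaseFoldingPitfalls {
    // Naive search (hypothetical helper): lower-case both strings in the
    // given locale, then fall back to the case-sensitive indexOf.
    static int naiveIndexOf(String text, String pattern, Locale locale) {
        return text.toLowerCase(locale).indexOf(pattern.toLowerCase(locale));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // Turkish lower-cases 'I' to dotless 'ı' (U+0131), so "TITLE" no
        // longer contains "title" after conversion.
        System.out.println(naiveIndexOf("TITLE", "title", turkish));     // -1
        System.out.println(naiveIndexOf("TITLE", "title", Locale.ROOT)); // 0

        // "Straße" upper-cases to "STRASSE" (7 chars instead of 6), and
        // equalsIgnoreCase compares char by char, so it reports a mismatch.
        String strasse = "Stra\u00dfe";
        System.out.println(strasse.toUpperCase(Locale.ROOT).length());   // 7
        System.out.println("STRASSE".equalsIgnoreCase(strasse));         // false
    }
}
```

Neither failure is a bug in the JDK; both follow from treating case-insensitive matching as a per-character operation.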
I am currently using android.icu.text.StringSearch to replace my previous implementations of case-insensitive indexOf with a correct one. Non-Android users can access the same functionality through the ICU4J package, using the com.ibm.icu.text.StringSearch class.
Be careful to reference classes in the correct icu package (android.icu.text or com.ibm.icu.text), as Android and the JRE both have classes with the same name in other namespaces (e.g. Collator).
Test Cases (Locale, pattern, target text, expectedResult):
PS: As best as I can determine, the PRIMARY binding strength should do the right thing when locale-specific rules differentiate between accented and non-accented characters according to dictionary rules; but I don't know which locale to use to test this premise. Donated test cases would be gratefully appreciated.
--
Copyright notice: because StackOverflow's CC-BY-SA copyrights as applied to code fragments are unworkable for professional developers, these fragments are dual licensed under more appropriate licenses here: https://pastebin.com/1YhFWmnU
But it's not hard to write one:
Converting both strings to lower case is usually not a big deal, but it is slow if some of the strings are long, and doing it in a loop makes it much worse. For this reason, I would recommend indexOfIgnoreCase.
Here's a version closely resembling Apache's StringUtils version:
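A minimal sketch of such a method, assuming it is built on String.regionMatches with the ignoreCase flag set. This is an illustration only, not the Apache Commons source, and the class name is hypothetical.

```java
class IgnoreCaseSearch {
    // Returns the index of the first case-insensitive occurrence of pattern
    // in text, or -1 if there is none (or if either argument is null).
    static int indexOfIgnoreCase(String text, String pattern) {
        if (text == null || pattern == null) {
            return -1;
        }
        int last = text.length() - pattern.length();
        for (int i = 0; i <= last; i++) {
            // regionMatches(true, ...) compares char by char, ignoring case,
            // without creating lower-cased copies of either string.
            if (text.regionMatches(true, i, pattern, 0, pattern.length())) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfIgnoreCase("Hello World", "WORLD")); // 6
        System.out.println(indexOfIgnoreCase("Hello World", "xyz"));   // -1
    }
}
```

Because the comparison is per character, this sketch shares the limitations of equalsIgnoreCase for strings whose case conversion changes their length.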
indexOf is case sensitive. This is because it uses the equals method to compare the elements in the list. The same thing goes for contains and remove.
The indexOf() methods are all case-sensitive. You can make them (roughly, in a broken way, but working for plenty of cases) case-insensitive by converting your strings to upper/lower case beforehand:
Yes, it is case sensitive:
No, there isn't. You can convert both strings to lower case before calling indexOf:
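A minimal sketch of both points, with a hypothetical helper name. Locale.ROOT is an assumption added here so that the result does not depend on the default locale of the machine.

```java
import java.util.Locale;

class LowerCaseIndexOf {
    // Lower-case both strings, then use the ordinary case-sensitive indexOf.
    static int indexOfLowerCased(String text, String pattern) {
        return text.toLowerCase(Locale.ROOT).indexOf(pattern.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Plain indexOf is case sensitive, so this search fails.
        System.out.println("Hello World".indexOf("world"));            // -1
        // After lower-casing both sides the match is found.
        System.out.println(indexOfLowerCased("Hello World", "WORLD")); // 6
    }
}
```

Each call allocates two temporary strings, which is the cost other replies warn about for long inputs or tight loops.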
There is an ignore-case method in the StringUtils class of the Apache Commons Lang library:
indexOfIgnoreCase(CharSequence str, CharSequence searchStr)
Yes, indexOf is case sensitive. The best way I have found to do a case-insensitive search is:
That will do a case-insensitive indexOf().
Here is my solution, which does not allocate any heap memory and should therefore be significantly faster than most of the other implementations mentioned here.
And here are the unit tests that verify correct behavior.
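A minimal sketch of an allocation-free search along these lines. It is not the answer author's code; the class and method names are hypothetical, and the per-character rule (compare upper-case forms, then lower-case forms) is an assumption modeled on how String.equalsIgnoreCase is documented to behave.

```java
class AllocationFreeSearch {
    // True if the two chars are equal, or become equal after case conversion.
    static boolean sameIgnoringCase(char a, char b) {
        if (a == b) {
            return true;
        }
        char ua = Character.toUpperCase(a);
        char ub = Character.toUpperCase(b);
        return ua == ub || Character.toLowerCase(ua) == Character.toLowerCase(ub);
    }

    // Scans text in place; no temporary strings or arrays are created.
    static int indexOfIgnoreCase(CharSequence text, CharSequence pattern) {
        int n = text.length();
        int m = pattern.length();
        for (int i = 0; i + m <= n; i++) {
            int j = 0;
            while (j < m && sameIgnoringCase(text.charAt(i + j), pattern.charAt(j))) {
                j++;
            }
            if (j == m) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfIgnoreCase("Hello World", "WORLD")); // 6
    }
}
```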
Yes, it is case-sensitive. You can do a case-insensitive indexOf by converting your String and the String parameter both to upper-case before searching.
Note that toUpperCase may not work in some circumstances. For instance this:
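A hedged illustration of the effect, using a hypothetical address string chosen so that the indices match the numbers quoted in this answer. The variable names idxU and idxL follow the answer's text; everything else is an assumption.

```java
import java.util.Locale;

class UpperCaseIndexShift {
    // Search after upper-casing both strings.
    static int indexViaUpperCase(String text, String find) {
        return text.toUpperCase(Locale.ROOT).indexOf(find.toUpperCase(Locale.ROOT));
    }

    // Search after lower-casing both strings.
    static int indexViaLowerCase(String text, String find) {
        return text.toLowerCase(Locale.ROOT).indexOf(find.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Hypothetical sample: "Feldbergstraße 23, Mainz" ("\u00df" is 'ß').
        String text = "Feldbergstra\u00dfe 23, Mainz";
        String find = "mainz";
        int idxU = indexViaUpperCase(text, find);
        int idxL = indexViaLowerCase(text, find);
        System.out.println(idxU); // 20: 'ß' became "SS", shifting later chars by one
        System.out.println(idxL); // 19: the real position in the original string
    }
}
```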
idxU will be 20, which is wrong! idxL will be 19, which is correct. The cause of the problem is that toUpperCase() converts the "ß" character into two characters, "SS", and this throws the index off.
Consequently, always stick with toLowerCase().
What are you doing with the index value once it is returned?
If you are using it to manipulate your string, could you not use a regular expression instead?
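A minimal sketch of the regular-expression route, using java.util.regex with the CASE_INSENSITIVE and UNICODE_CASE flags. The class and method names are hypothetical; Pattern.quote is used on the assumption that the search text should be treated literally rather than as a pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexIndexOf {
    private static final int FLAGS = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

    // Index of the first case-insensitive occurrence of literal, or -1.
    static int indexOfIgnoreCase(String text, String literal) {
        Matcher m = Pattern.compile(Pattern.quote(literal), FLAGS).matcher(text);
        return m.find() ? m.start() : -1;
    }

    // Replaces every case-insensitive occurrence directly, with no index needed.
    static String replaceIgnoreCase(String text, String literal, String replacement) {
        return Pattern.compile(Pattern.quote(literal), FLAGS)
                .matcher(text)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(indexOfIgnoreCase("Hello World", "WORLD"));
        System.out.println(replaceIgnoreCase("Hello World, world", "world", "there"));
    }
}
```

If the same search text is used repeatedly, compiling the Pattern once and reusing it avoids paying the compilation cost on every call.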
I've just looked at the source. It compares chars, so it is case sensitive.
Yes, I am fairly sure it is. One way of working around that using the standard library would be:
Had the same problem.
I tried regular expressions and the Apache StringUtils.indexOfIgnoreCase method, but both were pretty slow...
So I wrote a short method myself...:
According to my tests it's much faster... (at least if your searchString is rather short).
If you have any suggestions for improvement or find bugs, it would be nice to let me know... (since I use this code in an application ;-)
The first question has already been answered many times. Yes, the String.indexOf() methods are all case-sensitive.
If you need a locale-sensitive indexOf() you could use the Collator. Depending on the strength value you set, you can get case-insensitive comparison, and also treat accented letters the same as the non-accented ones, etc.
Here is an example of how to do this:
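A minimal sketch using java.text.Collator at PRIMARY strength. This is an assumption about the general shape of such a search, not the answer's original code; the class and method names are hypothetical.

```java
import java.text.Collator;
import java.util.Locale;

class CollatorIndexOf {
    // Slides a window of pattern.length() chars over text and asks the
    // collator whether the window and the pattern compare as equal.
    static int indexOf(String text, String pattern, Collator collator) {
        int m = pattern.length();
        for (int i = 0; i + m <= text.length(); i++) {
            if (collator.compare(text.substring(i, i + m), pattern) == 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        // PRIMARY strength ignores case differences between letters.
        collator.setStrength(Collator.PRIMARY);
        System.out.println(indexOf("Hello World", "WORLD", collator)); // 6
    }
}
```

This sketch only compares windows of the same length as the pattern, so it cannot match STRASSE against Straße, and it allocates a substring per position. A dedicated searcher such as ICU's StringSearch avoids both limitations.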