Java's toLowerCase() method does not produce the expected result when used with a Locale

Posted on 2024-12-20 17:53:15

Look at the following code snippet in Java.

import java.util.Locale;

public final class Main
{
    public static void main(String[] args)
    {
        Locale.setDefault(new Locale("lt"));   // set Lithuanian as the default locale
        String str = "\u00cc";                 // Ì (LATIN CAPITAL LETTER I WITH GRAVE)

        System.out.println("Before case conversion is " + str + " and length is " + str.length());   // Ì
        String lowerCaseStr = str.toLowerCase();   // no argument, so the default locale (Lithuanian) is used
        System.out.println("Lower case is " + lowerCaseStr + " and length is " + lowerCaseStr.length());   // i̇̀
    }
}

It displays the following output.

Before case conversion is Ì and length is 1

Lower case is i̇̀ and length is 3


The first System.out.println() call prints what I expect. The second, however, reports a length of 3, where I expected 1. Why?

Comments (3)

如何视而不见 2024-12-27 17:53:15

Different languages have different rules for converting text to upper or lower case.

For example, in German, the lowercase ß becomes two uppercase S characters (SS), so the word "straße" (street), which is 6 characters long, becomes "STRASSE", which is 7 characters long.

This is why your upper-cased and lower-cased strings have different lengths.

I wrote about this in one of my Java quizzes:
http://thecodersbreakfast.net/index.php?post/2010/09/24/Java-Quiz-42-%3A-A-string-too-far
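The German example above can be checked with a few lines of Java. This is only a minimal sketch: the class name GermanUpperCaseDemo and the helper upperGerman are hypothetical names chosen for illustration, and the locale is passed explicitly so the result does not depend on the machine's default locale.

```java
import java.util.Locale;

// Hypothetical demo class; the names here are illustrative only.
final class GermanUpperCaseDemo {

    // Upper-cases with an explicit locale instead of relying on Locale.getDefault().
    static String upperGerman(String s) {
        return s.toUpperCase(Locale.GERMAN);
    }

    public static void main(String[] args) {
        String word = "stra\u00dfe";        // "straße", written with an escape to avoid source-encoding issues
        String upper = upperGerman(word);   // the single char ß is replaced by the two chars "SS"
        System.out.println(word.length() + " -> " + upper.length());   // prints "6 -> 7"
    }
}
```

The same mechanism is at work in the question: a case mapping may replace one char with several, so length() is not guaranteed to be preserved.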

深海少女心 2024-12-27 17:53:15

I get a different result:

Before case conversion is Ì and length is 1
Lower case is i?? and length is 3

仅此而已 2024-12-27 17:53:15

This is essentially a duplicate of "Does Java's toLowerCase() preserve original string length?", which is very helpful and has a very detailed answer.
The lengths of str and str.toLowerCase() are not always the same, because the conversion depends on the code of each char.

In this case the second output is "Lower case is i?? and length is 3". The i is followed by two ? marks, so the length is 3.
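To see what the two trailing characters actually are, it helps to print the UTF-16 code units instead of the glyphs, since a console that cannot render them will typically show ? instead. The sketch below is illustrative only: the class name LithuanianLowerCaseDemo and the helper codeUnits are hypothetical, and Locale.forLanguageTag("lt") is used in place of the question's new Locale("lt").

```java
import java.util.Locale;

// Hypothetical demo class; the names here are illustrative only.
final class LithuanianLowerCaseDemo {

    // Lists each UTF-16 code unit of s as U+XXXX, separated by spaces.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String str = "\u00cc";   // Ì
        Locale lithuanian = Locale.forLanguageTag("lt");

        // Lithuanian rules: i, COMBINING DOT ABOVE, COMBINING GRAVE ACCENT
        System.out.println(codeUnits(str.toLowerCase(lithuanian)));    // U+0069 U+0307 U+0300

        // Locale-neutral rules: the single precomposed char ì
        System.out.println(codeUnits(str.toLowerCase(Locale.ROOT)));   // U+00EC
    }
}
```

So the length of 3 is one base letter plus two combining marks. If the locale-independent result of length 1 is what you want, passing Locale.ROOT explicitly avoids picking up the Lithuanian rules from the default locale.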
