What role do the minus sign and single quotation mark play in an if statement?

Posted on 2024-11-07 16:00:50

In the following code, adding the same letter to both operands of a comparison changes the result: although - is not greater than j, -k is greater than jk.

This only happens if one of the operands is the minus sign (-) or single quotation mark (').

Why does this happen? What are the rules?

if - gtr j (echo - greater than j) else echo - less than j
if "-" gtr "j" (echo "-" greater than "j") else echo "-" less than "j"
echo.
if -k gtr jk (echo -k greater than jk) else echo -k less than jk
if "-k" gtr "jk" (echo "-k" greater than "jk") else echo "-k" less than "jk"
echo.
if ' gtr u (echo ' greater than u) else echo ' less than u
if "'" gtr "u" (echo "'" greater than "u") else echo "'" less than "u"
echo.
if 'v gtr uv (echo 'v greater than uv) else echo 'v less than uv
if "'v" gtr "uv" (echo "'v" greater than "uv") else echo "'v" less than "uv"

The result is:

- less than j
"-" less than "j"

-k greater than jk
"-k" greater than "jk"

' less than u
"'" less than "u"

'v greater than uv
"'v" greater than "uv"

Comments (1)

姜生凉生 2024-11-14 16:00:50

You may be assuming that strings are simply compared character by character, using their ordinal values.

That's not true. Collation is much more complex than that.

In fact, you can see the same in other environments, such as Windows PowerShell:

PS Home:\> '-' -gt 'j'
False
PS Home:\> '-k' -gt 'jk'
True
PS Home:\> '''' -gt 'u'
False
PS Home:\> '''v' -gt 'uv'
True

The order of strings may well vary with your locale, too.

As for your particular problem here, quoting from the Unicode Collation Algorithm (UTS #10):

Collation order is not preserved under concatenation or substring operations, in general.

For example, the fact that x is less than y does not mean that x + z is less than y + z, because characters may form contractions across the substring or concatenation boundaries. In summary:

x < y does not imply that xz < yz
x < y does not imply that zx < zy
xz < yz does not imply that x < y
zx < zy does not imply that x < y
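
The four results in the question are consistent with a "word sort", in which the hyphen and apostrophe carry so little weight that they only break ties. The Python sketch below is a toy model under that assumption, not the actual comparison routine used by cmd.exe or PowerShell; the names LOW_WEIGHT, word_sort_key and greater are hypothetical.

```python
# Toy model of a "word sort": hyphen and apostrophe are ignored on the
# first pass and only used to break ties. This is an assumption made for
# illustration, not the real Windows collation algorithm.
LOW_WEIGHT = {"-", "'"}


def word_sort_key(s: str) -> tuple:
    """Return (text without low-weight characters, original text)."""
    primary = "".join(ch for ch in s.lower() if ch not in LOW_WEIGHT)
    return (primary, s)


def greater(a: str, b: str) -> bool:
    """True if a sorts after b under the toy word sort."""
    return word_sort_key(a) > word_sort_key(b)


if __name__ == "__main__":
    # The same operand pairs as in the question.
    for a, b in [("-", "j"), ("-k", "jk"), ("'", "u"), ("'v", "uv")]:
        relation = "greater than" if greater(a, b) else "less than"
        print(f"{a} {relation} {b}")
```

In this model - sorts before j because its first-pass key is empty, while -k is compared as k against jk and therefore sorts after it. So x < y holds while xz < yz does not.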

And to address the misconception you're likely under:

Collation is not code point (binary) order.

A simple example of this is the fact that capital Z comes before lowercase a in the code charts. As noted earlier, beginners may complain that a particular Unicode character is “not in the right place in the code chart.” That is a misunderstanding of the role of the character encoding in collation. While the Unicode Standard does not gratuitously place characters such that the binary ordering is odd, the only way to get the linguistically-correct order is to use a language-sensitive collation, not a binary ordering.
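
The capital Z versus lowercase a point can be checked with a minimal Python sketch. Sorting by case-folded text is used here only as a crude stand-in for a language-sensitive collation, which is an assumption of this example; real collation involves far more than case.

```python
# Code point (binary) order: every uppercase ASCII letter precedes every
# lowercase one, so "Z" sorts before "a".
words = ["a", "Z", "b", "A"]

by_code_point = sorted(words)

# A crude stand-in for a language-sensitive order: compare case-folded
# text first, then fall back to the original string to break ties.
by_folded = sorted(words, key=lambda w: (w.casefold(), w))

print(by_code_point)       # ['A', 'Z', 'a', 'b']
print(by_folded)           # ['A', 'a', 'b', 'Z']
print(ord("Z"), ord("a"))  # 90 97
```

The first list follows the code chart, the second is closer to what a reader would expect from an alphabetical list.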
