Why are strings in .NET case sensitive by default?
Most times I want to do string comparisons, I want them to be case insensitive.
So why are strings in .NET case sensitive by default?
EDIT 1: To be clear I think the below should return true by default. Or at least allow me to have a compile time flag that makes it so.
"John Smith" == "JOHN SMITH"
EDIT 2: I can think of many more examples of things that should be case insensitive
Examples of things that should be case insensitive
- Usernames
- URLs
- File extensions / File names / Directory names / Paths
- Machine / server names
- State / Country / Location etc
- FirstName / LastName / Initials
- GUIDs
- Month / Day names
Examples of things that should be case sensitive
- Passwords
Comments (8)
Sorry for the trivial answer, but that's just the way it is :)
At a basic level, strings are represented as a list of characters, where 'a' is different from 'A', so it's probably the easiest representation/convention overall. In your case, it's probably fair to say that the majority of comparisons are case-insensitive, but I think the other side of the argument holds true at least as much, and a convention has been adopted.
I'd imagine some helper methods/classes would ease your pain somewhat.
Because there are different kinds of insensitive matching and it is unclear which one you want. Here are the three most common modes:
They have vastly different use cases. You probably did not notice that much because you are dealing with ASCII day-to-day. Users in other regions see more differences.
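The list of modes did not survive in this copy of the answer. As an assumption about what was meant, .NET offers ordinal, invariant-culture, and culture-specific case-insensitive comparison. A minimal C# sketch of the three follows; the class and method names are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of the original answer.
public static class InsensitiveModes
{
    // Ordinal: culture-independent comparison of the characters, ignoring case.
    public static bool OrdinalEquals(string a, string b) =>
        string.Equals(a, b, StringComparison.OrdinalIgnoreCase);

    // Invariant culture: linguistic comparison with fixed, culture-neutral rules.
    public static bool InvariantEquals(string a, string b) =>
        string.Equals(a, b, StringComparison.InvariantCultureIgnoreCase);

    // Specific culture: linguistic comparison using that culture's rules.
    // StringComparison.CurrentCultureIgnoreCase does the same with the
    // thread's current culture; an explicit culture is passed here so the
    // result does not depend on machine settings.
    public static bool CultureEquals(string a, string b, CultureInfo culture) =>
        string.Compare(a, b, culture, CompareOptions.IgnoreCase) == 0;
}
```

For plain ASCII input such as "John Smith" and "JOHN SMITH" all three agree. They can diverge for other text: Turkish is the commonly cited case, since dotted and dotless "i" are distinct letters there, so a comparison made with a Turkish culture may not match the ordinal result, depending on the globalization data available at run time.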
Because case insensitivity is not performant and because it works even when you intend it not to.
Vendors need to compete based on performance and for that reason the default option tends to be the one that performs best. At best, case insensitivity requires folding both strings to a common case prior to comparing. At worst, depending on locale, it requires a code path that can be twice as long. If the vendor defaulted to the less performant version, competitors would pick the worst-case scenarios to benchmark against.
Since case sensitivity fails on certain searches you are forced to address this in your code. It forces a conscious decision. In contrast, case insensitivity works, even in cases where you don't want it to. Rather than forcing you to make a decision it creates a scenario where you can overlook it to your detriment. As a matter of choice architecture, vendors tend to pick the option that leads to fewer defects - in this case that's case sensitivity.
String comparison in .NET is case-sensitive because strings (and individual characters) are inherently case-sensitive.
The character 'a' is stored internally with a different ASCII or Unicode value than 'A'. Saying that 'a' is the same as 'A' is not "correct".
This distinction becomes critical when comparing values in languages other than English, when using algorithms like hash tables, or when using many encryption/decryption algorithms.
My two cents: case sensitive compare is the default because it is correct.
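To illustrate the hash table point: a hash-based collection needs equality and hashing to agree, so the comparison rule is chosen when the collection is built. The sketch below counts distinct keys under a given comparer; `LookupDemo` and `CountDistinct` are hypothetical names, not something from the answer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class LookupDemo
{
    // Counts how many keys the set treats as distinct under the given comparer.
    public static int CountDistinct(IEnumerable<string> keys, IEqualityComparer<string> comparer)
    {
        var set = new HashSet<string>(comparer);
        foreach (var key in keys)
        {
            set.Add(key); // Add is a no-op when an equal key already exists
        }
        return set.Count;
    }
}
```

With the keys "guid", "GUID", and "Guid", passing `StringComparer.Ordinal` yields three entries, while `StringComparer.OrdinalIgnoreCase` collapses them into one.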
In VB.NET it's possible to set "Option Compare" to Text so that comparisons work case-insensitively, but I strongly discourage it. My preference is to call the string.ToLower() method when I need to compare insensitively, and to work with the lowercase version of the text.
Why? Because how else would you compare when case sensitivity matters, as it does in some applications?
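A minimal sketch of the lowercase-then-compare approach described above. One substitution to note: it uses `ToLowerInvariant()` rather than `ToLower()`, so that the result does not depend on the current culture. The class and method names are hypothetical.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class LowercaseCompare
{
    // Lowercases both operands with culture-independent rules,
    // then compares the results with the default (ordinal) ==.
    public static bool EqualsLowered(string a, string b) =>
        a.ToLowerInvariant() == b.ToLowerInvariant();
}
```

This allocates two temporary strings per comparison and throws if either argument is null, which is part of why the overloads that take a `StringComparison` are often preferred.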
You cannot change the behaviour of existing classes. The System.String class, which is defined in mscorlib, overloads == and defines case-sensitive equality.
All you can do is add an extension method to string that implements a case-insensitive comparison.
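The answer's own code did not survive in this copy. As a stand-in, here is a minimal sketch of what such an extension method could look like; `EqualsIgnoreCase` is a hypothetical name, and ordinal comparison is an assumption about the intended mode.

```csharp
using System;

// Hypothetical extension class for illustration only.
public static class StringExtensions
{
    // Case-insensitive equality using ordinal (culture-independent) rules.
    // The static string.Equals handles null arguments without throwing.
    public static bool EqualsIgnoreCase(this string self, string other) =>
        string.Equals(self, other, StringComparison.OrdinalIgnoreCase);
}
```

With this in scope, `"John Smith".EqualsIgnoreCase("JOHN SMITH")` reads almost like the `==` the question asks for, while the default operator stays untouched.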
Your case is not necessarily the most common one. A very common case is matching words in a document against grammar conditions, and there case sensitivity is an absolute must.
Note that matching in a case-insensitive fashion is trivially easy. In fact, the Equals method of a string has an overload specifically for specifying how to compare.
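The overload mentioned here is presumably `String.Equals(String, StringComparison)`. A short sketch contrasting it with the default operator; the wrapper class and method names are hypothetical.

```csharp
using System;

// Hypothetical wrapper for illustration only.
public static class EqualsOverloadDemo
{
    // Default: == on strings is an ordinal, case-sensitive comparison.
    public static bool DefaultEquals(string a, string b) => a == b;

    // The overload taking a StringComparison makes the rule explicit at the call site.
    public static bool ExplicitEquals(string a, string b, StringComparison mode) =>
        a.Equals(b, mode);
}
```

Called with "John Smith" and "JOHN SMITH", `DefaultEquals` returns false, and `ExplicitEquals` returns true only when given one of the IgnoreCase modes.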
I know this is necroposting, but I came here searching for a solution to the same problem. Now it is almost 5 years later... but I don't mind, as this is one of the first search results and I think it would be better to include correct information.
According to this MSDN page, you simply need to add one line of code to your file:
Option Compare Text
If you add the above line to the beginning of your code file, you are telling the compiler to switch from the default (Option Compare Binary) to case-insensitive comparison. I don't know if this can work in C#.