What is the purpose of case sensitivity in languages?
Possible Duplicates:
Is there any advantage of being a case-sensitive programming language?
Why are many languages case sensitive?
Something I have always wondered is why languages are designed to be case sensitive.
My pea brain can't fathom any possible reason why it is helpful.
But I'm sure there is one out there. And before anyone says it, having variables called `dog` and `Dog` differentiated only by case is really, really bad practice, right?
Any comments appreciated, along with perhaps any history on the matter! I'm insensitive about case sensitivity generally, but sensitive about sensitivity around case sensitivity so let's keep all answers and comments civil!
It's not necessarily bad practice to have two members which are only differentiated by case, in languages which support it. For example, here's a fairly common bit of C#:
Personally I'm quite happy with case sensitivity - particularly as it allows code like the above, where the member variable and property follow conventions anyway, avoiding confusion.
Note that case-sensitivity has a culture aspect too... not all cultures will deem the same characters to be equivalent...
One of the biggest reasons for case-sensitivity in programming languages is readability. Things that mean the same should also look the same.
I found the following interesting example by M. Sandin in a related discussion:
Can you see the problem immediately? I couldn't...
I like case sensitivity in order to differentiate between class and instance.
Form form = new Form();
If you can't do that, you end up with variables called `myForm` or `form1` or `f`, which are not as clean and descriptive as plain old `form`.
Case sensitivity also means that you don't have references to `form`, `FORM` and `Form` which all mean the same thing. I find it difficult to read such code. I find it much easier to scan code where all references to the same variable look exactly the same.
Ultimately, it's because it is easier to implement a case-sensitive comparison correctly; you just compare bytes/characters without any conversions. You can also do other things like hashing really easily.
Why is this an issue? Well, case-insensitivity is rather hard to add unless you're in a tiny domain of supported characters (notably, US-ASCII). Case conversion rules vary by locale (the Turkish rules are not the same as those in the rest of the world) and there's no guarantee that flipping a single bit will do the right thing, or that it is always the same bit and under the same preconditions. (IIRC, there are some really complex rules in some languages for throwing away diacritics when converting vowels to upper case, and reintroducing them when converting to lower case. I forget exactly what the details are.)
If you're case sensitive, you just ignore all that; it's just simpler. (Mind you, you still ought to pay attention to Unicode normalization forms, but that's another story and it applies whatever case rules you're using.)
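To make the point about case conversion concrete, here is a small Python sketch (Python is my choice for illustration, not something the answer above uses). It relies only on the built-in `str.upper()` and `str.lower()`, which apply locale-independent Unicode mappings; the variable names are made up for the example.

```python
# In US-ASCII, upper and lower case letters differ by a single bit (0x20).
ascii_bit_difference = ord("a") ^ ord("A")

# German sharp s: uppercasing changes the length of the string.
sharp_s_upper = "ß".upper()

# Turkish dotless i (U+0131) uppercases to plain "I", which lowercases
# to plain "i", so a round trip does not give back the original letter.
dotless_round_trip = "\u0131".upper().lower()

# Capital I with dot above (U+0130) lowercases to two code points:
# "i" followed by a combining dot above (U+0307).
dotted_lower = "\u0130".lower()

# Python's str methods ignore the locale, so a Turkish reader who expects
# "i".upper() to be U+0130 gets plain "I" instead.
plain_i_upper = "i".upper()
```

None of this matters to a case-sensitive comparison, which would simply treat all of these strings as different.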
Imagine you have an object called `dog`, which has a method called `Bark()`. Also you have defined a class called `Dog`, which has a static method called `Bark()`. You write `dog.Bark()`. So what's it going to do? Call the object's method or the static method from the class? (in a language where `::` doesn't exist)
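For comparison, here is roughly how that scenario plays out in a case-sensitive language. This is only a sketch in Python: the classes `Dog` and `Puppy` and the lowercase method name `bark` are hypothetical stand-ins for the names in the example above.

```python
class Dog:
    """Hypothetical class with a static method."""

    @staticmethod
    def bark() -> str:
        return "static Dog.bark"


class Puppy:
    """Hypothetical class whose instances have their own bark method."""

    def bark(self) -> str:
        return "instance bark"


# `dog` and `Dog` are two different names, so both can coexist.
dog = Puppy()

instance_result = dog.bark()  # resolves to the object's method
static_result = Dog.bark()    # resolves to the class's static method
```

With case-insensitive identifiers, `dog` and `Dog` would be the same name, and the language would need some other rule (or a different naming scheme) to tell the two calls apart.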
The reason you can't understand why case-sensitivity is a good idea, is because it is not. It is just one of the weird quirks of C (like 0-based arrays) that now seem "normal" because so many languages copied what C did.
C uses case-sensitivity in identifiers, but from a language design perspective that was a weird choice. Most languages that were designed from scratch (with no consideration given to being "like C" in any way) were made case-insensitive. This includes Fortran, Cobol, Lisp, and almost the entire Algol family of languages (Pascal, Modula-2, Oberon, Ada, etc.).
Scripting languages are a mixed bag. Many were made case-sensitive because the Unix filesystem was case-sensitive and they had to interact sensibly with it. C kind of grew up organically in the Unix environment, and probably picked up the case-sensitive philosophy from there.
I'm sure originally it was a performance consideration. Converting a string to upper or lower case for caseless comparison isn't an expensive operation exactly, but it's not free either, and on old systems it may have added complexity that the systems of the day weren't ready to handle.
And now, of course, languages like to be compatible with each other (VB for example can't distinguish between C# classes or functions that differ only in case), people are used to giving things names with the same text but different case (see Jon Skeet's answer - I do that a lot), and the value of caseless languages wasn't really enough to outweigh these two.
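As a rough illustration of where that extra work would sit, here is a toy identifier table in Python. `SymbolTable` and its `fold` parameter are invented for this sketch and are not taken from any real compiler; the caseless variant uses the built-in `str.casefold`.

```python
from typing import Callable, Dict


class SymbolTable:
    """Toy identifier table; `fold` normalises every name before it is stored or looked up."""

    def __init__(self, fold: Callable[[str], str] = lambda name: name) -> None:
        self._fold = fold
        self._symbols: Dict[str, object] = {}

    def define(self, name: str, value: object) -> None:
        self._symbols[self._fold(name)] = value

    def lookup(self, name: str) -> object:
        return self._symbols[self._fold(name)]

    def __len__(self) -> int:
        return len(self._symbols)


sensitive = SymbolTable()                # names are hashed exactly as written
insensitive = SymbolTable(str.casefold)  # one extra pass over every name

for table in (sensitive, insensitive):
    table.define("dog", 1)
    table.define("Dog", 2)  # a second symbol, or a redefinition of the first
```

In the case-sensitive table the two definitions stay separate; in the caseless one the second overwrites the first, and every definition and lookup pays for the folding step.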
Case-sensitive comparison is (from a naive point of view that ignores canonical equivalence) trivial (simply compare code points), but case-insensitive comparison is not well defined and extremely complex in all cases, and the rules are impossible to remember. Implementing it is possible, but will inadvertently lead to unexpected and surprising behavior. BTW, some languages like Fortran and Basic have always been case-insensitive.
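A short Python sketch of the difference, using only the standard `unicodedata` module and `str.casefold()`. The helper names are mine, and `caseless_equal` is just one common approximation of caseless matching (normalise, fold, normalise again), not a complete or locale-aware solution.

```python
import unicodedata

composed = "\u00e9"     # "é" as a single code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent

# Naive comparison: just compare code points.
naive_equal = composed == decomposed


def canonical_equal(a: str, b: str) -> bool:
    """Compare after normalising both strings to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def caseless_equal(a: str, b: str) -> bool:
    """Approximate caseless match: normalise, case-fold, normalise again."""
    def key(s: str) -> str:
        folded = unicodedata.normalize("NFD", s).casefold()
        return unicodedata.normalize("NFD", folded)
    return key(a) == key(b)


canonical_result = canonical_equal(composed, decomposed)
strasse_result = caseless_equal("Straße", "STRASSE")
accent_result = caseless_equal("\u00c9", decomposed)
```

Even this version ignores locale-specific rules such as the Turkish dotted and dotless i, which is part of why the behavior can surprise people.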