Should primitive datatypes be capitalized?
If you were to invent a new language, do you think primitive datatypes should be capitalized, like `Int`, `Float`, `Double`, `String`, to be consistent with standard class naming conventions? Why or why not?
By "primitive" I don't mean that they can't be (or behave like) objects. I guess I should have said "basic" datatypes.
Comments (3)
If I were to invent a new language, it wouldn't have primitive data types, just wrapper objects. I've done enough wrapper-to-primitive-to-wrapper conversions in Java to last me the rest of my life.
As for capitalization? I'd go with case-sensitive first letter capitalized, partly because it's a convention that's ingrained in my brain, and partly to convey the fact that hey, these are objects too.
Case insensitivity leads to some crazy internationalization stuff; think umlauts, tildes, etc. It makes the compiler harder and allows the programmer freedoms that don't result in better code. Seriously, you think there are enough arguments over where to put braces in C... just watch.
As far as primitives looking like classes... only if you can subclass primitives. Don't assume everyone capitalizes class names; the C++ standard libraries do not.
Personally, I'd like a language that has, for example, two integer types:
`int`: Whatever integer type is fastest on the platform, and
`int(bits)`: An integer with the given number of bits.
You can `typedef` whatever you need from that. Then maybe I could get a `fixed(w,f)` type (number of bits to the left and right of the decimal point, respectively) and a `float(m,e)`. And `uint` and `ufixed` for unsigned. (Anyone who wants an unsigned float can beg.) And standardize how bit fields are packed into structures. If the compiler can't handle a particular number of bits, it should say so and abort.
Why, yes, I program embedded systems and got sick of `int` and `long` changing size every couple years, how could you tell? ^_-
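To make the `int(bits)` idea a bit more concrete, here is a rough Python model of it. Everything in it is hypothetical: the `SizedInt` class, the `int_type` helper, the 1 to 64 bit limit, and the wrap-on-overflow behavior are assumptions for illustration, since the answer doesn't say what overflow should do.

```python
class SizedInt:
    """Hypothetical model of the answer's int(bits) / uint(bits) types."""

    SUPPORTED_BITS = range(1, 65)  # assumption: this "compiler" handles 1..64 bits

    def __init__(self, bits, value=0, signed=True):
        if bits not in self.SUPPORTED_BITS:
            # "If the compiler can't handle a particular number of bits,
            # it should say so and abort."
            raise ValueError(f"unsupported integer width: {bits} bits")
        self.bits = bits
        self.signed = signed
        self.value = self._wrap(value)

    def _wrap(self, value):
        # Keep only the low `bits` bits, then reinterpret the result as
        # two's complement if the type is signed (wrapping is an assumption).
        value &= (1 << self.bits) - 1
        if self.signed and value >= 1 << (self.bits - 1):
            value -= 1 << self.bits
        return value

    def __add__(self, other):
        other_value = other.value if isinstance(other, SizedInt) else other
        return SizedInt(self.bits, self.value + other_value, self.signed)

    def __repr__(self):
        prefix = "int" if self.signed else "uint"
        return f"{prefix}({self.bits})={self.value}"


def int_type(bits, signed=True):
    """Rough analogue of `typedef int(bits) name`: returns a constructor."""
    SizedInt(bits, 0, signed)  # validate the width once, up front
    return lambda value=0: SizedInt(bits, value, signed)


int12 = int_type(12)
uint8 = int_type(8, signed=False)
```

With these definitions, `int12(2047) + 1` wraps to -2048, `uint8(255) + 1` wraps to 0, and `int_type(128)` fails immediately instead of quietly picking some other width.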
(Warning: MASSIVE post. If you want my final answer to this question, skip to the bottom section, where I answer it. If you do, and you think I'm spouting a load of bull, please read the rest before trying to argue with my "bull.")
If I were to make a programming language, here are a few caveats:
If any of those (rather important) design decisions don't apply to your ideal language (and they may very well not), then my following (apparently controversial) decision won't work for you. If you're not me, it may not work for you either. I think it fits my language, because it's my language. You should think about your language and how you want your language to be so that you, like Dennis Ritchie or Guido van Rossum or Larry Wall, can grow up to make bad design decisions and defend them in retrospect with good arguments.
Now then, I would still maintain that, in my language, identifiers would be case insensitive, and this would include variables, functions (which would be variables), types (which would also be variables, both built-in/primitive (which would be subclass-able) and user-defined), you name it.
To address issues as they come:
Naming consistency is the best argument I've seen, but I disagree. First off, allowing two different types called `int` and `Int` is ridiculous. The fact that Java has `int` and `Integer` is almost as ridiculous as the fact that neither of them allows arbitrary precision. (Disclaimer: I've become a big fan of the word "ridiculous" lately.)
Normally I would be a fan of allowing people to shoot themselves in the foot with things like two different objects called `int` and `Int` if they want to, but here it's an issue of laziness, and of the old multiple-word-variable-name argument.
My personal take on the issue of `underscore_case` vs. `MixedCase` vs. `camelCase` is that they're all ugly and less readable, and if at all possible you should only use a single word. In an ideal world, all code would be stored in your source control in an agreed-upon format (the style that most of the team uses), and the team's dissenters would have hooks in their VCS to convert all checked-out code from that style to their style and vice versa for checking back in, but we don't live in that world.
It bothers me for some reason when I have to continually write `MixedCaseVariableOrClassNames` a lot more than it bothers me to write `underscore_separated_variable_or_class_names`. Even `TimeOfDay` and `time_of_day` might be the same identifier because they're conceptually the same thing, but I'm a bit hesitant to make that leap, if only because it's an unusual rule (internal underscores are removed in variable names). On one hand, it could end the debate between the two styles, but on the other hand it could just annoy people.
So my final decision is based on two parts, which are both highly subjective:
`sizedint` doesn't strike me as much better or worse than `sized_int` or `SizedInt` (which, as far as examples of camelCase go, looks particularly bad because of the `dI`, IMHO), so I'd go with that. If you like camelCase (and many people do), you can use it. If you like underscores, you're out of luck, but if you really need to you can write `sized_int = sizedint` and go on with life.
`sized_int`, I can live with that. If they wrote it and used `SizedInt`, I don't have to stick with their annoying-to-type camelCase and, in my code, can freely write it as `sizedint`.
Saying that consistency helps us remember what things mean is silly. Do you speak english or English? Both, because they're the same word, and you recognize them as the same word. I think e.e. cummings was on to something, and we probably shouldn't have different cases at all, but I can't exactly rewrite most human and computer languages out there on a whim. All I can do is say, "Why are you making such a fuss about case when it says the same thing either way?" and implement this attitude in my own language.
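For what it's worth, the rule described above (case is ignored, underscores still count) is easy to model. This is only a sketch under my own assumptions: the `SymbolTable` class and its method names are made up for illustration and are not taken from any real language implementation.

```python
class SymbolTable:
    """Hypothetical symbol table for a language with case-insensitive identifiers."""

    def __init__(self):
        self._symbols = {}

    @staticmethod
    def _key(name):
        # Fold case only; underscores stay significant, matching the
        # decision described above not to strip them.
        return name.casefold()

    def define(self, name, value):
        self._symbols[self._key(name)] = value

    def lookup(self, name):
        return self._symbols[self._key(name)]


table = SymbolTable()
table.define("SizedInt", "<type sizedint>")

# Every capitalization resolves to the same entry.
same = [table.lookup(n) for n in ("sizedint", "SizedInt", "SIZEDINT")]

# Underscore fans need an explicit alias, like `sized_int = sizedint`.
table.define("sized_int", table.lookup("sizedint"))
```

Whoever wrote the library can call the type `SizedInt` and the caller can still write `sizedint`; only the underscore spelling needs the one-line alias.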
Throwaway variables in functions (i.e. `Person person = /* something */`) are a pretty good argument, but I disagree that people would do `Person thePerson` (or `Person aPerson`). I personally tend to just do `Person p` anyway.
I'm not much fond of capitalizing type names (or much of anything) in the first place, and if it's enough of a throwaway variable to declare it undescriptively as `Person person`, then you won't lose much information with `Person p`. And anyone who says "non-descriptive one-letter variable names are bad" shouldn't be using non-descriptive many-letter variable names either, like `Person person`.
Variables should follow sane scoping rules (like C and Perl, unlike Python - flame war starts here guys!), so conflicts in simple names used locally (like `p`) should never arise.
As to making the implementation barf if you use two variables with the same names differing only in case, that's a good idea, but no. If someone makes library X that defines the type `XMLparser` and someone else makes library Y that defines the type `XMLParser`, and I want to write an abstraction layer that provides the same interface for many XML parsers including the two types, I'm pretty boned. Even with namespaces, this still becomes prohibitively annoying to pull off.
Internationalization issues have been brought up. Distinguishing between capital and lowercase umlauted U's will be no easier in my interpreter/compiler (probably the former) than in my source code.
If a language has a string type (i.e. the language isn't C) and the string type supports Unicode (i.e. the language isn't Ruby - it's only a joke, don't crucify me), then the language already provides a way to convert Unicode strings to and from lowercase, like Perl's `lc()` function (sometimes) and Python's `unicode.lower()` method. This function must be built into the language somewhere and can handle Unicode.
Calling this function during an interpreter's compile-time rather than its runtime is simple. For a compiler it's only marginally harder, because you'll still have to implement this kind of functionality anyway, so including it in the compiler is no harder than including it in the runtime library. If you're writing the compiler in the language itself (and you should be), and the functionality is built into the language, you'll have no problems.
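As a rough illustration of folding identifiers at compile time with the language's own Unicode routines, here is a Python 3 sketch (where the old `unicode` type is simply `str`). The names `fold_identifier` and `fold_source` are hypothetical, and the regex is a deliberately naive stand-in for a lexer: a real one would skip string literals and comments. Using `casefold()` plus NFC normalization instead of plain `lower()` is my choice here, not something the answer specifies.

```python
import re
import unicodedata

# A letter or underscore, followed by any word characters (Unicode-aware in Python 3).
IDENTIFIER = re.compile(r"[^\W\d]\w*")


def fold_identifier(name):
    # Normalize first so precomposed and combining forms of "ü" compare equal,
    # then fold case with the built-in Unicode-aware string method.
    return unicodedata.normalize("NFC", name).casefold()


def fold_source(source):
    """Rewrite every identifier in `source` to its folded form."""
    source = unicodedata.normalize("NFC", source)
    return IDENTIFIER.sub(lambda m: fold_identifier(m.group()), source)


folded = fold_source("Über = ÜBER + über")
```

After this pass all three spellings of the identifier are the same string, so the rest of the interpreter or compiler never has to think about case again.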
To answer your question, no. I don't think we should be capitalizing anything, period. It's annoying to type (to me) and allowing case differences creates (or allows) unnecessary confusion between capitalized and lowercased things, or camelCased and under_scored things, or other sets of semantically-distinct-but-conceptually-identical things. If the distinction is entirely semantic, let's not bother with it at all.