What is the purpose of null?
I am in a compilers class and we are tasked with creating our own language, from scratch. Currently our dilemma is whether to include a 'null' type or not. What purpose does null provide? Some of our team is arguing that it is not strictly necessary, while others are pro-null just for the extra flexibility it can provide.
Do you have any thoughts, especially for or against null?
Have you ever created functionality that required null?
Consider the examples of C and of Java. In C, the convention is that a null pointer is the numeric value zero. Of course, that's really just a convention: nothing about the language treats that value as anything special. In Java, however, null is a distinct concept that you can detect and know that, yes, this is in fact a bad reference and I shouldn't try to open that door to see what's on the other side.
Even so, I hate nulls almost worse than anything else.
CLARIFICATION based on comments: I hate the de facto null pointer value of zero worse than I hate null.
Any time I see an assignment to null, I think, "oh good, someone has just put a landmine in the code. Someday, we're going to be walking down a related execution path and BOOM! NullPointerException!"
What I would prefer is for someone to specify a useful default or NullObject that lets me know that "this parameter has not been set to anything useful." A bald null by itself is just trouble waiting to happen.
That said, it's still better than a raw zero wandering around loose.
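A minimal sketch of the "useful default or NullObject" idea, written in Python for brevity. The Logger, NullLogger, NO_LOGGER, and Importer names are invented for this illustration and come from no particular library; the point is only that the default is a real object that can be called safely and that announces itself as "not set" when printed.

```python
from typing import List


class Logger:
    """A real collaborator: remembers every message it is given."""

    def __init__(self) -> None:
        self.lines: List[str] = []

    def log(self, message: str) -> None:
        self.lines.append(message)


class NullLogger(Logger):
    """Null Object: same interface, does nothing, and says so when printed."""

    def log(self, message: str) -> None:
        pass  # deliberately discard the message

    def __repr__(self) -> str:
        return "<NullLogger: no logger has been set>"


NO_LOGGER = NullLogger()  # hypothetical shared instance used as the default


class Importer:
    def __init__(self, logger: Logger = NO_LOGGER) -> None:
        # The default is a usable object, never a bare null.
        self.logger = logger

    def run(self, rows: List[str]) -> int:
        for row in rows:
            self.logger.log(f"importing {row}")  # safe with either logger
        return len(rows)


quiet = Importer()
quiet.run(["a", "b"])   # no null check needed, nothing blows up
print(quiet.logger)     # the repr states that no logger was set
```

The trade-off is that a forgotten configuration now fails silently instead of loudly, so the descriptive repr matters when debugging.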
That decision depends on the objective of the programming language.
Who are you designing the programming language for? Are you designing it for people who are familiar with C-derived languages? If so, then you should probably add support for null.
In general, I would say that you should avoid violating people's expectations unless it serves a particular purpose.
Take switch blocks in C# as an example. Every non-empty case section in C# must end with an explicit control-flow statement: typically a "break" or an explicit goto (a "return" or "throw" also qualifies). That means a case that ends in "break" is legal, while a case that runs some statements and then falls off its end into the next case is not.
In order to create a "fall through" from case 1 to case 2, it's necessary to insert an explicit goto (in C#, "goto case 2;") at the end of case 1.
This is arguably something that would violate the expectations of C++ programmers who are learning C#. However, adding that restriction serves a purpose. It eliminates the possibility of an entire class of common C++ bugs. It adds to the learning curve of the language slightly, but the result is a net benefit to the programmer.
If your goal is to design a language targeted at C++ programmers, then removing null would probably violate their expectations. That will cause confusion, and make your language more difficult to learn. The key question is then, "what benefit do they get?" Or, alternatively, "what detriment does this cause?"
If you are simply trying to design a "super small language" that can be implemented in the course of a single semester, then the story is different. In that case your objective isn't to build a useful language targeted at a particular segment of the population. Instead, it's just to learn how to create a compiler. In that scenario, having a smaller language is a big benefit, and so it's worth eliminating null.
So, to recap, I would say that you should:
Usually this will make the desired result pretty clear.
Of course, if you don't explicitly articulate your design goals, or you can't agree on what they are, then you are still going to argue. In that case, however, you are pretty much doomed anyways.
One other way to look at null is that it's a performance issue. If you have a complex object containing other complex objects and so on, then it is more efficient to let all properties start out as null than to create empty placeholder objects that aren't good for anything and will soon be replaced.
That's just one perspective that I haven't seen mentioned yet.
I believe there are two concepts of null at work here.
The first (null the logical indicator) is a conventional program language mechanism that provides runtime indication of a non-initialized memory reference in program logic.
The second (null the value) is a base data value that can be used in logical expressions to detect the logical null indicator (the previous definition) and make logical decisions in program code.
While null has been the bane of many programmers and the source of many application faults over the years, the null concept has validity. If you and your team create a language that uses memory references that can be potentially misused because the reference was not initialized, you will likely need a mechanism to detect that eventuality. It is always an option to create an alternative, but null is a widely known alternative.
Bottom line, it all depends upon the goals of your language:
If robustness and program correctness are high on your priority list AND you allow programmatic memory references, you will want to consider null.
BB
If you are creating a statically typed language, I imagine that null could add a good deal of complexity to your compiler.
If you are creating a dynamically typed language, NULL can come in quite handy, as it is just another "type" without any variations.
Null is a placeholder that means that no value (append "of the correct type" for a static-typed language) can be assigned to that variable.
There is cognitive dissonance here. I heard somewhere else that humans cannot comprehend negation, because they must suppose a value and then imagine its unfitness.
My suggestion to your team is: come up with some example programs that need to be written in your language, and see how they would look if you left out null, versus if you included it.
Use a null object pattern!
If your language is object oriented, let it have an UndefinedValue class of which only one singleton instance exists. Then use this instance wherever null is used. This has the advantage that your null will respond to messages such as #toString and #equals. You will never run into a null pointer exception as in Java. (Of course, this requires that your language is dynamically typed.)
Null provides an easy way out for programmers who haven't completely thought through the logic and domains needed by their program, or the future maintenance implications of using a value with essentially no clear and agreed upon definition.
It may seem obvious at first that it must mean "no value", but what that ACTUALLY means depends on context. If, for instance, LastName === null, does that mean that person doesn't have a last name, or that we don't know what their last name is, or that it hasn't been entered into the system yet? Does null equal itself, or doesn't it? In SQL it does not. In many languages it does. But if we don't know the value of personA.lastName, or personB.lastName, how can we know that personA.lastName === personB.lastName, eh? Should the result be false, or ... null?
It depends on what you're doing, which is why it's dangerous and silly to have some kind of system-wide value that can be used for any situation that kind of looks like "nothing": other parts of your program and external libraries or modules can't really be depended upon to correctly interpret what you meant by "null".
You're much better off clearly defining the DOMAIN of possible values of lastName, and exactly what every possible value actually means, rather than depending on some vague system-wide notion of null, which may or may not have any relevance to what you're doing, depending on which language you're using and what you're trying to do. Such a value may, in fact, behave in exactly the wrong way when you begin to operate on your data.
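To make the "does null equal itself?" question concrete, here is a small sketch using Python's standard sqlite3 module and an in-memory database. The person table and its columns are invented for this example. It shows SQL's answer (comparing NULL with NULL yields NULL, so two unknown last names never match) next to Python's answer (None equals None).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER, last_name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(1, None), (2, None), (3, "Hoare")],  # None is stored as SQL NULL
)

# In SQL, NULL = NULL is neither true nor false: it evaluates to NULL.
sql_result = conn.execute("SELECT NULL = NULL").fetchone()[0]

# So a self-join on last_name never pairs the two people whose name is NULL.
matches = conn.execute(
    "SELECT a.id, b.id FROM person a JOIN person b "
    "ON a.last_name = b.last_name AND a.id < b.id"
).fetchall()

# Finding the rows with no last name needs the dedicated IS NULL test.
null_ids = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM person WHERE last_name IS NULL ORDER BY id"
    )
]

# Python's None, by contrast, compares equal to itself.
left, right = None, None
python_result = left == right

print(sql_result, matches, null_ids, python_result)
```

Neither behavior is "the" right one; they encode different meanings ("unknown" versus "a specific nothing"), which is exactly the ambiguity described above.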
Null is to objects what 0 is to numbers.
Null: The Billion Dollar Mistake. Tony Hoare:
null is a sentinel value that is not an integer, not a string, not a boolean - not anything really, except something to hold and be a "not there" value. Don't treat it as or expect it to be a 0, or an empty string or an empty list. Those are all valid values and can be genuinely valid values in many circumstances - the idea of a null instead means there is no value there.
Perhaps it's a little bit like a function throwing an exception instead of returning a value. Except instead of manufacturing and returning an ordinary value with a special meaning, it returns a special value that already has a special meaning. If a language expects you to work with null, then you can't really ignore it.
Oh no, I feel the philosophy major coming out of me....
The notion of NULL comes from the notion of the empty set in set theory. Nearly everyone agrees that the empty set is not equal to zero. Mathematicians and philosophers have been battling about the value of set theory for decades.
In programming languages, I think it is very helpful to understand object references that do not refer to anything in memory. Google about set theory and you will see similarities between the formal symbolic systems (notation) that set theorists use and symbols we use in many computer languages.
Regards,
Sam
What's null for, you ask?
Well,
Nothing.
I usually think of 'null' in the C/C++ aspect of 'memory address 0'. It's not strictly needed, but if it didn't exist, then people would just use something else (if myNumber == -1, or if myString == "").
All I know is, I can't think of a day I've spent coding that I haven't typed the word "null", so I think that makes it pretty important.
In the .NET world, MS recently added nullable types for int, long, etc. that never used to be nullable, so I guess they think it's pretty important too.
If I were designing a language, I would keep it. However, I wouldn't avoid using a language that didn't have null either. It would just take a little getting used to.
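The "people would just use something else" point can be illustrated with a short Python sketch. The two helper functions are hypothetical wrappers written for this example around the real str.find, which does return -1 when nothing is found. With an in-band sentinel such as -1, a forgotten check quietly produces a wrong answer; with a distinct "nothing" value, the same mistake fails loudly.

```python
from typing import Optional


def find_in_band(text: str, ch: str) -> int:
    """Return the index of ch, or -1 when it is absent (in-band sentinel)."""
    return text.find(ch)


def find_out_of_band(text: str, ch: str) -> Optional[int]:
    """Return the index of ch, or None when it is absent."""
    index = text.find(ch)
    return None if index == -1 else index


word = "abc"

# Forgetting the check with -1: Python happily indexes from the end.
silently_wrong = word[find_in_band(word, "z")]

# Forgetting the check with None: the mistake surfaces immediately.
try:
    word[find_out_of_band(word, "z")]
    failed_loudly = False
except TypeError:
    failed_loudly = True

print(repr(silently_wrong), failed_loudly)
```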
The concept of null is not strictly necessary in exactly the same sense that the concept of zero is not strictly necessary.
I don't think it's helpful to talk about null outside the context of the whole language design. First point of confusion: is the null type empty, or does it include a single, distinguished value (often called "nil")? A completely empty type is not very useful. Although C uses the empty return type void to mark a procedure that is executed only for side effect, many other languages use a singleton type (usually the empty tuple) for this purpose.
I find that a nil value is used most effectively in dynamically typed languages. In Smalltalk it is the value used when you need a value but you don't have any information. In Lua it is used even more effectively: the nil value is the only value that cannot be a key or a value in a Lua table. In Lua, nil is also used as the value of missing parameters or results.
Overall I would say that a nil value can be useful in a dynamically typed setting, but in a statically typed setting, a null type is useful only for talking about functions (or procedures or methods) that are executed for side effect.
At all costs, avoid the NULL pointer used in C and Java. These are artifacts inherent in the implementations of pointers and objects, and in a well-designed language they should not be allowed. By all means give your users a way to extend an existing type with a null value, but make them do it explicitly, on purpose; don't force every type to have one by accident. (As an example of explicit use, I recently implemented Bentley and Sedgewick's ternary search trees in Haskell, and I needed to extend the character type with one additional value meaning 'not a character'. For this purpose Haskell provides the Maybe type.)
Finally, if you are writing a compiler, it is good to remember that the easiest parts of the language to compile, and the parts that cause the fewest bugs, are the parts that aren't there :-)
It seems useful to have a way to indicate a reference or pointer that isn't currently pointing at anything, whether you call it null, nil, None, etc., if for no other reason than to let people know when they're about to fall off the end of a linked list.
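The linked-list case looks roughly like this in Python, where None plays the role of null. The Node class and to_list function are illustrative names for this sketch, not part of any library.

```python
from typing import Any, List, Optional


class Node:
    """One cell of a singly linked list; next is None on the last cell."""

    def __init__(self, value: Any, next: "Optional[Node]" = None) -> None:
        self.value = value
        self.next = next


def to_list(head: Optional[Node]) -> List[Any]:
    """Walk the chain until the 'points at nothing' marker is reached."""
    values = []
    node = head
    while node is not None:  # None is what tells us the list has ended
        values.append(node.value)
        node = node.next
    return values


chain = Node(1, Node(2, Node(3)))
print(to_list(chain))
print(to_list(None))  # an empty list is just a head that points at nothing
```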
In C, NULL was ((void*)0), so it was a value with a pointer type. But that didn't work well in C++, which does not implicitly convert void* to other pointer types, so C++ made NULL 0: it dropped the type and became a pure value.
However, it was found that having a specific null type would be better, so the C++ committee decided to add nullptr, a null pointer constant with its own type, std::nullptr_t (in C++0x).
Also, almost every language besides C++ has NULL as a type, or an equivalent unique value not the same as 0 (it might be equal to it or not, but it's not the same value).
So now even C++ will have a null with its own type, basically closing the discussion on the matter, since now (almost) everyone will have a NULL type.
Edit: Thinking about it, Haskell's Maybe is another solution to NULL types, but it's not as easy to grasp or implement.
A practical example of null is when you ask a yes/no question and don't get a response. You don't want to default to no because it might be important to know that the question wasn't answered in situations where the answer is very important.
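A sketch of that yes/no case in Python, using Optional[bool] so that None stands for "not answered". The function names are made up for this illustration. It also shows how easily a plain truthiness test collapses "unanswered" into "no".

```python
from typing import Optional


def describe(answer: Optional[bool]) -> str:
    """Keep the three states apart: yes, no, and not answered."""
    if answer is None:
        return "unanswered"
    return "yes" if answer else "no"


def naive_is_no(answer: Optional[bool]) -> bool:
    """Truthiness test: lumps a missing answer together with 'no'."""
    return not answer


def is_explicit_no(answer: Optional[bool]) -> bool:
    """Only an actual 'no' counts."""
    return answer is False


for answer in (True, False, None):
    print(describe(answer), naive_is_no(answer), is_explicit_no(answer))
```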
Null is not a mistake.
Null means "I don't know yet"
For primitives you don't really need a null (I have to say that strings (in .NET) shouldn't get it IMHO)
But for composite entities it definitely serves a purpose.
You can think of any type as a set along with a collection of operations. There are many cases where it's convenient to have a value which isn't a "normal" value; for example, consider an "EOF" value for C's getline(). You can handle that in one of several ways: you can have a NULL value outside the set, you can distinguish a particular value as null (in C, ((void *)0) can serve that purpose), or you can have a way of creating a new type, so that for type T, you create a type T' =def { T ∪ NULL }, which is the way Haskell does it (a "Maybe" type).
Which one is better is good for lots of enjoyable argument.
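The third option, building T' from T explicitly, can be approximated in Python as a rough analogue of a Maybe type. This is only a sketch under that assumption: Just, Nothing, first_char, and with_default are names chosen for this example, and Python will not enforce the check the way a statically typed language with such a type would.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")


@dataclass(frozen=True)
class Just(Generic[T]):
    """Wraps a value that is really there."""
    value: T


@dataclass(frozen=True)
class Nothing:
    """The one extra element added to T to form T'."""


Maybe = Union[Just[T], Nothing]


def first_char(text: str) -> "Maybe[str]":
    """Only functions that can fail to produce a value return a Maybe."""
    return Just(text[0]) if text else Nothing()


def with_default(result: "Maybe[T]", default: T) -> T:
    """The caller has to unwrap explicitly to get at the value."""
    return result.value if isinstance(result, Just) else default


print(first_char("null"))
print(first_char(""))
print(with_default(first_char(""), "?"))
```

Because the extra value is added per type rather than being present everywhere, a plain str in this sketch never needs a null check; only a Maybe result does.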
Null is only useful in situations where there are variables with unassigned values. If every variable has a value, then there is no need for null values.
Null is a sentinel value. It's a value that cannot possibly be real data and instead provides meta-data about the variable in use.
Null assigned to a pointer indicates that the pointer is uninitialized. This gives you the ability to detect misuse of uninitialized pointers by detecting dereferences of null valued pointers. If you instead leave the value of a pointer equal to whatever happened to be in memory then you would have crazily irregular program behavior that would be much more difficult to debug.
Also, the null character in a C-style variable length string is used to mark the end of the string.
The use of null in these ways, especially for pointer values, has become so popular that the metaphor has been imported into other systems, even when the "null" sentinel value is implemented entirely differently and has no relation to the number 0.
Null is not the problem - everyone treating, and interpreting null differently is the problem.
I like null. If there was no null, null would only be replaced with some other way for the code to say "I have no clue, dude!" (which some would write "I have no clue, man!", or "I have not a clue, old bean!" etc. and so, we'd have the exact same problems again).
I generalize, I know.