Best explanation for languages without null

Posted on 2024-09-28 15:15:52

Every so often when programmers are complaining about null errors/exceptions someone asks what we do without null.

I have some basic idea of the coolness of option types, but I don't have the knowledge or language skills to best express it. What is a great explanation of the following, written in a way approachable to the average programmer, that we could point that person towards?

  • The undesirability of having references/pointers be nullable by default
  • How option types work including strategies to ease checking null cases such as
    • pattern matching and
    • monadic comprehensions
  • Alternative solution such as message eating nil
  • (other aspects I missed)

财迷小姐 2024-10-05 15:15:53

All of the answers so far focus on why null is a bad thing, and how it's kinda handy if a language can guarantee that certain values will never be null.

They then go on to suggest that it would be a pretty neat idea if you enforce non-nullability for all values, which can be done if you add a concept like Option or Maybe to represent types that may not always have a defined value. This is the approach taken by Haskell.

It's all good stuff! But it doesn't preclude the use of explicitly nullable / non-null types to achieve the same effect. Why, then, is Option still a good thing? After all, Scala supports nullable values (it has to, so it can work with Java libraries) but supports Options as well.

Q. So what are the benefits beyond being able to remove nulls from a language entirely?

A. Composition

If you make a naive translation from null-aware code

def fullNameLength(p:Person) = {
  val middleLen =
    if (null != p.middleName)
      p.middleName.length
    else
      0
  p.firstName.length + middleLen + p.lastName.length
}

to option-aware code

def fullNameLength(p:Person) = {
  val middleLen = p.middleName match {
    case Some(x) => x.length
    case _ => 0
  }
  p.firstName.length + middleLen + p.lastName.length
}

there's not much difference! But it's also a terrible way to use Options... This approach is much cleaner:

def fullNameLength(p:Person) = {
  val middleLen = p.middleName map {_.length} getOrElse 0
  p.firstName.length + middleLen + p.lastName.length
}

Or even:

def fullNameLength(p:Person) =       
  p.firstName.length +
  p.middleName.map(_.length).getOrElse(0) +
  p.lastName.length

When you start dealing with List of Options, it gets even better. Imagine that the List people is itself optional:

people flatMap(_ find (_.firstName == "joe")) map (fullNameLength)

How does this work?

//convert an Option[List[Person]] to an Option[S]
//where the function f takes a List[Person] and returns an S
people map f

//find a person named "Joe" in a List[Person].
//returns Some[Person], or None if "Joe" isn't in the list
validPeopleList find (_.firstName == "joe")

//returns None if people is None
//Some(None) if people is valid but doesn't contain Joe
//Some[Some[Person]] if Joe is found
people map (_ find (_.firstName == "joe")) 

//flatten it to return None if people is None or Joe isn't found
//Some[Person] if Joe is found
people flatMap (_ find (_.firstName == "joe")) 

//return Some(length) if the list isn't None and Joe is found
//otherwise return None
people flatMap (_ find (_.firstName == "joe")) map (fullNameLength)

The corresponding code with null checks (or even elvis ?: operators) would be painfully long. The real trick here is the flatMap operation, which allows for the nested comprehension of Options and collections in a way that nullable values can never achieve.
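For readers more at home in Java, the same composition can be sketched with `java.util.Optional`. This is only an illustration, not the answer's own code: the `Person` and `JoeLookup` classes and the `joeNameLength` method are hypothetical names, and `Optional.flatMap` plays the role of Scala's `flatMap` here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical Person type for illustration; only middleName is optional.
final class Person {
    final String firstName;
    final Optional<String> middleName;
    final String lastName;

    Person(String firstName, Optional<String> middleName, String lastName) {
        this.firstName = firstName;
        this.middleName = middleName;
        this.lastName = lastName;
    }
}

final class JoeLookup {
    // Mirrors fullNameLength: map over the optional middle name, default to 0.
    static int fullNameLength(Person p) {
        return p.firstName.length()
                + p.middleName.map(String::length).orElse(0)
                + p.lastName.length();
    }

    // Mirrors: people flatMap (_ find (_.firstName == "joe")) map (fullNameLength)
    static Optional<Integer> joeNameLength(Optional<List<Person>> people) {
        return people
                .flatMap(list -> list.stream()
                        .filter(p -> p.firstName.equals("joe"))
                        .findFirst())              // Optional<Person>
                .map(JoeLookup::fullNameLength);   // Optional<Integer>
    }

    public static void main(String[] args) {
        List<Person> list = Arrays.asList(
                new Person("ann", Optional.of("b"), "clark"),
                new Person("joe", Optional.<String>empty(), "smith"));
        System.out.println(joeNameLength(Optional.of(list)));
        System.out.println(joeNameLength(Optional.<List<Person>>empty()));
    }
}
```

Neither an absent list nor a missing "joe" needs its own branch: both collapse into an empty `Optional` at the end of the chain.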

最冷一天 2024-10-05 15:15:53

Since people seem to be missing it: null is ambiguous.

Alice's date-of-birth is null. What does it mean?

Bob's date-of-death is null. What does that mean?

A "reasonable" interpretation might be that Alice's date-of-birth exists but is unknown, whereas Bob's date-of-death does not exist (Bob is still alive). But why did we get to different answers?


Another problem: null is an edge case.

  • Is null = null?
  • Is nan = nan?
  • Is inf = inf?
  • Is +0 = -0?
  • Is +0/0 = -0/0?

The answers are usually "yes", "no", "yes", "yes", "no" respectively. Crazy "mathematicians" call NaN "nullity" and say it compares equal to itself. SQL treats nulls as not equal to anything (so they behave like NaNs). One wonders what happens when you try to store ±∞, ±0, and NaNs into the same database column (there are about 2^53 NaNs, half of which are "negative").
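The comparisons above can be checked in a few lines. The sketch below uses Java, which is an assumption since the answer names no language, and the `EdgeCases` class is a hypothetical name. It also shows that the primitive `==` and the boxed `Double.equals` disagree on two of the cases, which is another instance of the inconsistency being described.

```java
final class EdgeCases {
    static boolean nullEqualsNull() {
        Object a = null, b = null;
        return a == b;                       // reference comparison
    }

    static boolean nanEqualsNan() {
        double nan = Double.NaN;
        return nan == nan;                   // IEEE 754: NaN is unequal to itself
    }

    static boolean infEqualsInf() {
        return Double.POSITIVE_INFINITY == Double.POSITIVE_INFINITY;
    }

    static boolean zeroEqualsNegativeZero() {
        return 0.0 == -0.0;
    }

    static boolean zeroOverZeroEquals() {
        double zero = 0.0;
        return (zero / zero) == (-zero / zero);   // both sides are NaN
    }

    // Double.equals compares bit patterns, so it disagrees with == on NaN and on signed zero.
    static boolean boxedNanEquals() {
        return Double.valueOf(Double.NaN).equals(Double.valueOf(Double.NaN));
    }

    static boolean boxedZeroEquals() {
        return Double.valueOf(0.0).equals(Double.valueOf(-0.0));
    }

    public static void main(String[] args) {
        System.out.println("null == null : " + nullEqualsNull());          // true
        System.out.println("nan == nan   : " + nanEqualsNan());            // false
        System.out.println("inf == inf   : " + infEqualsInf());            // true
        System.out.println("+0 == -0     : " + zeroEqualsNegativeZero());  // true
        System.out.println("+0/0 == -0/0 : " + zeroOverZeroEquals());      // false
        System.out.println("boxed NaN equals NaN : " + boxedNanEquals());  // true
        System.out.println("boxed +0 equals -0   : " + boxedZeroEquals()); // false
    }
}
```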

To make matters worse, databases differ in how they treat NULL, and most of them aren't consistent (see NULL Handling in SQLite for an overview). It's pretty horrible.


And now for the obligatory story:

I recently designed a (sqlite3) database table with five columns a NOT NULL, b, id_a, id_b NOT NULL, timestamp. Because it's a generic schema designed to solve a generic problem for fairly arbitrary apps, there are two uniqueness constraints:

UNIQUE(a, b, id_a)
UNIQUE(a, b, id_b)

id_a only exists for compatibility with an existing app design (partly because I haven't come up with a better solution), and is not used in the new app. Because of the way NULL works in SQL, I can insert (1, 2, NULL, 3, t) and (1, 2, NULL, 4, t) and not violate the first uniqueness constraint (because (1, 2, NULL) != (1, 2, NULL)).

This works specifically because of how NULL works in a uniqueness constraint on most databases (presumably so it's easier to model "real-world" situations, e.g. no two people can have the same Social Security Number, but not all people have one).


FWIW, without first invoking undefined behaviour, C++ references cannot "point to" null, and it's not possible to construct a class with uninitialized reference member variables (if an exception is thrown, construction fails).

Sidenote: Occasionally you might want mutually-exclusive pointers (i.e. only one of them can be non-NULL), e.g. in a hypothetical iOS type DialogState = NotShown | ShowingActionSheet UIActionSheet | ShowingAlertView UIAlertView | Dismissed. Instead, I'm forced to do stuff like assert((bool)actionSheet + (bool)alertView == 1).

水染的天色ゝ 2024-10-05 15:15:53

The undesirability of having references/pointers be nullable by default.

I don't think this is the main issue with nulls. The main issue with nulls is that they can mean two things:

  1. The reference/pointer is uninitialized: the problem here is the same as mutability in general. For one, it makes it more difficult to analyze your code.
  2. The variable being null actually means something: this is the case which Option types actually formalize.

Languages which support Option types typically also forbid or discourage the use of uninitialized variables.

How option types work including strategies to ease checking null cases such as pattern matching.

In order to be effective, Option types need to be supported directly in the language. Otherwise it takes a lot of boilerplate code to simulate them. Pattern matching and type inference are two key language features that make Option types easy to work with. For example:

In F#:

//first we create the option list, and then filter out all None Option types and 
//map all Some Option types to their values.  See how type-inference shines.
let optionList = [Some(1); Some(2); None; Some(3); None]
optionList |> List.choose id //evaluates to [1;2;3]

//here is a simple pattern-matching example
//which prints "1;2;None;3;None;".
//notice how value is extracted from op during the match
optionList 
|> List.iter (function Some(value) -> printf "%i;" value | None -> printf "None;")

However, in a language like Java without direct support for Option types, we'd have something like:

//here we perform the same filter/map operation as in the F# example.
List<Option<Integer>> optionList = Arrays.asList(new Some<Integer>(1),new Some<Integer>(2),new None<Integer>(),new Some<Integer>(3),new None<Integer>());
List<Integer> filteredList = new ArrayList<Integer>();
for(Option<Integer> op : optionList)
    if(op instanceof Some)
        filteredList.add(((Some<Integer>)op).getValue());
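The `Option`, `Some` and `None` classes in the Java snippet above are hand-rolled types that would have to be written first. Since Java 8 the standard library ships `java.util.Optional`, so a rough equivalent of both F# examples might look like the sketch below. The `OptionFilter` class and its method names are hypothetical, and Java still has no pattern matching over `Optional`, so `map`/`orElse` stand in for the `match`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

final class OptionFilter {
    // Keep the values of the present options and drop the empty ones,
    // like `List.choose id` in the F# example.
    static List<Integer> choose(List<Optional<Integer>> options) {
        return options.stream()
                .filter(Optional::isPresent)
                .map(Optional::get)
                .collect(Collectors.toList());
    }

    // Render every element, like the F# pattern-matching example.
    static String render(List<Optional<Integer>> options) {
        return options.stream()
                .map(op -> op.map(v -> v + ";").orElse("None;"))
                .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        List<Optional<Integer>> optionList = Arrays.asList(
                Optional.of(1), Optional.of(2), Optional.<Integer>empty(),
                Optional.of(3), Optional.<Integer>empty());
        System.out.println(choose(optionList)); // [1, 2, 3]
        System.out.println(render(optionList)); // 1;2;None;3;None;
    }
}
```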

Alternative solution such as message eating nil

Objective-C's "message eating nil" is not so much a solution as an attempt to lighten the headache of null checking. Basically, instead of throwing a runtime exception when trying to invoke a method on a null object, the expression evaluates to null itself. Suspending disbelief, it's as if each instance method begins with if (this == null) return null;. But then there is information loss: you don't know whether the method returned null because that is a valid return value, or because the object is actually null. It's a lot like exception swallowing, and doesn't make any progress in addressing the issues with null outlined before.
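The information loss can be made concrete with a small emulation. Java does not eat messages sent to null, so the `send` helper below is a hypothetical stand-in for that behaviour, and `Account` is an invented type whose nickname may legitimately be null.

```java
import java.util.function.Function;

final class NilEating {
    // Hypothetical helper emulating nil-eating dispatch:
    // calling a method on a null receiver yields null instead of throwing.
    static <T, R> R send(T receiver, Function<T, R> method) {
        return receiver == null ? null : method.apply(receiver);
    }

    // Hypothetical type for illustration; a null nickname is a valid value here.
    static final class Account {
        final String nickname;

        Account(String nickname) {
            this.nickname = nickname;
        }

        String getNickname() {
            return nickname;
        }
    }

    public static void main(String[] args) {
        Account missing = null;                  // there is no account at all
        Account noNickname = new Account(null);  // account exists, nickname unset

        String a = send(missing, Account::getNickname);
        String b = send(noNickname, Account::getNickname);

        // Both results are null, so the caller cannot tell the two situations apart.
        System.out.println(a == null && b == null);
    }
}
```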

世界如花海般美丽 2024-10-05 15:15:53

Assembly brought us addresses also known as untyped pointers. C mapped them directly as typed pointers but introduced Algol's null as a unique pointer value, compatible with all typed pointers. The big issue with null in C is that since every pointer can be null, one never can use a pointer safely without a manual check.

In higher-level languages, having null is awkward since it really conveys two distinct notions:

  • Telling that something is undefined.
  • Telling that something is optional.

Having undefined variables is pretty much useless, and leads to undefined behavior whenever they occur. I suppose everybody will agree that having things undefined should be avoided at all costs.

The second case is optionality and is best provided explicitly, for instance with an option type.


Let's say we're in a transport company and we need to create an application to help create a schedule for our drivers. For each driver, we store a few pieces of information, such as the driving licences they have and the phone number to call in case of emergency.

In C we could have:

struct PhoneNumber { ... };
struct MotorbikeLicence { ... };
struct CarLicence { ... };
struct TruckLicence { ... };

struct Driver {
  char name[32]; /* Null terminated */
  struct PhoneNumber * emergency_phone_number;
  struct MotorbikeLicence * motorbike_licence;
  struct CarLicence * car_licence;
  struct TruckLicence * truck_licence;
};

As you can see, in any processing over our list of drivers we'll have to check for null pointers. The compiler won't help you; the safety of the program rests on your shoulders.

In OCaml, the same code would look like this:

type phone_number = { ... }
type motorbike_licence = { ... }
type car_licence = { ... }
type truck_licence = { ... }

type driver = {
  name: string;
  emergency_phone_number: phone_number option;
  motorbike_licence: motorbike_licence option;
  car_licence: car_licence option;
  truck_licence: truck_licence option;
}

Let's now say that we want to print the names of all the drivers along with their truck licence numbers.

In C:

#include <stdio.h>

void print_driver_with_truck_licence_number(struct Driver * driver) {
  /* Check may be redundant but better be safe than sorry */
  if (driver != NULL) {
    printf("driver %s has ", driver->name);
    if (driver->truck_licence != NULL) {
      printf("truck licence %04d-%04d-%08d\n",
        driver->truck_licence->area_code,
        driver->truck_licence->year,
        driver->truck_licence->num_in_year);
    } else {
      printf("no truck licence\n");
    }
  }
}

void print_drivers_with_truck_licence_numbers(struct Driver ** drivers, int nb) {
  if (drivers != NULL && nb >= 0) {
    int i;
    for (i = 0; i < nb; ++i) {
      struct Driver * driver = drivers[i];
      if (driver) {
        print_driver_with_truck_licence_number(driver);
      } else {
        /* Huh ? We got a null inside the array, meaning it probably got
           corrupt somehow, what do we do ? Ignore ? Assert ? */
      }
    }
  } else {
    /* Caller provided us with erroneous input, what do we do ?
       Ignore ? Assert ? */
  }
}

In OCaml that would be:

open Printf

(* Here we are guaranteed to have a driver instance *)
let print_driver_with_truck_licence_number driver =
  printf "driver %s has " driver.name;
  match driver.truck_licence with
    | None ->
        printf "no truck licence\n"
    | Some licence ->
        (* Here we are guaranteed to have a licence *)
        printf "truck licence %04d-%04d-%08d\n"
          licence.area_code
          licence.year
          licence.num_in_year

(* Here we are guaranteed to have a valid list of drivers *)
let print_drivers_with_truck_licence_numbers drivers =
  List.iter print_driver_with_truck_licence_number drivers

As you can see in this trivial example, there is nothing complicated in the safe version:

  • It's terser.
  • You get much better guarantees and no null check is required at all.
  • The compiler ensures that you correctly deal with the option.

Whereas in C, you could just have forgotten a null check and boom...

Note: these code samples were not compiled, but I hope you get the idea.
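As a point of comparison, here is roughly how the licence printing might look in Java with `java.util.Optional`. It is a sketch under assumptions: the `TruckLicence`, `Driver` and `DriverReport` classes are hypothetical, the licence fields are taken to be plain integers as the `%04d` formats suggest, and unlike OCaml, Java cannot stop the `Optional` reference itself from being null.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical types for illustration; the field names follow the C structs.
final class TruckLicence {
    final int areaCode;
    final int year;
    final int numInYear;

    TruckLicence(int areaCode, int year, int numInYear) {
        this.areaCode = areaCode;
        this.year = year;
        this.numInYear = numInYear;
    }
}

final class Driver {
    final String name;
    final Optional<TruckLicence> truckLicence; // absence is explicit in the type

    Driver(String name, Optional<TruckLicence> truckLicence) {
        this.name = name;
        this.truckLicence = truckLicence;
    }
}

final class DriverReport {
    // Both cases are handled in one expression; there is no separate null check to forget.
    static String describe(Driver driver) {
        return "driver " + driver.name + " has " + driver.truckLicence
                .map(l -> String.format(Locale.ROOT, "truck licence %04d-%04d-%08d",
                        l.areaCode, l.year, l.numInYear))
                .orElse("no truck licence");
    }

    public static void main(String[] args) {
        Driver ann = new Driver("Ann", Optional.of(new TruckLicence(12, 2009, 345)));
        Driver bob = new Driver("Bob", Optional.<TruckLicence>empty());
        System.out.println(describe(ann));
        System.out.println(describe(bob));
    }
}
```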

哆兒滾 2024-10-05 15:15:53

Microsoft Research has an interesting project called

Spec#

It is a C# extension with non-null types and a mechanism for checking that your objects are not null, although, IMHO, applying the design by contract principle may be more appropriate and more helpful for many troublesome situations caused by null references.
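A very small taste of the contract style can be had in plain Java with `Objects.requireNonNull`. This is not Spec# syntax and it is checked at run time rather than by the compiler; the `Greeter` class is a hypothetical example.

```java
import java.util.Objects;

final class Greeter {
    // Invariant: name is never null once the constructor has returned.
    private final String name;

    Greeter(String name) {
        // Precondition checked at the boundary: fail fast here
        // instead of failing later, somewhere deep inside the object.
        this.name = Objects.requireNonNull(name, "name must not be null");
    }

    // Safe without a null check because the invariant holds.
    int nameLength() {
        return name.length();
    }

    public static void main(String[] args) {
        System.out.println(new Greeter("Spec").nameLength());
        try {
            new Greeter(null);
        } catch (NullPointerException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The null is rejected where it enters the object, so every method behind the constructor can rely on the invariant.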

北渚 2024-10-05 15:15:53

Robert Nystrom offers a nice article here:

http://journal.stuffwithstuff.com/2010/08/23/void-null-maybe-and-nothing/

describing his thought process when adding support for absence and failure to his Magpie programming language.

停滞 2024-10-05 15:15:53

Coming from a .NET background, I always thought null had a point; it's useful. Until I came to know of structs and how easy it was to work with them, avoiding a lot of boilerplate code. Tony Hoare, speaking at QCon London in 2009, apologized for inventing the null reference. To quote him:

I call it my billion-dollar mistake. It was the invention of the null
reference in 1965. At that time, I was designing the first
comprehensive type system for references in an object oriented
language (ALGOL W). My goal was to ensure that all use of references
should be absolutely safe, with checking performed automatically by
the compiler. But I couldn't resist the temptation to put in a null
reference, simply because it was so easy to implement. This has led to
innumerable errors, vulnerabilities, and system crashes, which have
probably caused a billion dollars of pain and damage in the last forty
years. In recent years, a number of program analysers like PREfix and
PREfast in Microsoft have been used to check references, and give
warnings if there is a risk they may be non-null. More recent
programming languages like Spec# have introduced declarations for
non-null references. This is the solution, which I rejected in 1965.

See this question too at programmers

月寒剑心 2024-10-05 15:15:53

I've always looked at Null (or nil) as being the absence of a value.

Sometimes you want this, sometimes you don't. It depends on the domain you are working with. If the absence is meaningful: no middle name, then your application can act accordingly. On the other hand if the null value should not be there: The first name is null, then the developer gets the proverbial 2 a.m. phone call.

I've also seen code overloaded and over-complicated with checks for null. To me this means one of two things:
a) a bug higher up in the application tree
b) bad/incomplete design

On the positive side - Null is probably one of the more useful notions for checking if something is absent, and languages without the concept of null will end up over-complicating things when it's time to do data validation. In this case, if a new variable is not initialized, said languages will usually set variables to an empty string, 0, or an empty collection. However, if an empty string or 0 or empty collection are valid values for your application -- then you have a problem.

Sometimes this is circumvented by inventing special/weird values for fields to represent an uninitialized state. But then what happens when the special value is entered by a well-intentioned user? And let's not get into the mess this will make of data validation routines.
If the language supported the null concept all the concerns would vanish.
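The sentinel collision described above is easy to reproduce. In the sketch below, the `Sentinel` class, the `UNSET` constant and both parsing methods are hypothetical; the second method uses `java.util.OptionalInt` as one way of giving "nothing entered" its own representation that no user input can collide with.

```java
import java.util.OptionalInt;

final class Sentinel {
    // Hypothetical sentinel meaning "nothing was entered".
    static final int UNSET = 0;

    // Sentinel style: a user who really types 0 looks the same as no input.
    // (Non-numeric input makes Integer.parseInt throw NumberFormatException.)
    static int parseQuantitySentinel(String input) {
        return input == null || input.trim().isEmpty()
                ? UNSET
                : Integer.parseInt(input.trim());
    }

    // Explicit absence: "nothing entered" has its own representation.
    static OptionalInt parseQuantity(String input) {
        return input == null || input.trim().isEmpty()
                ? OptionalInt.empty()
                : OptionalInt.of(Integer.parseInt(input.trim()));
    }

    public static void main(String[] args) {
        // The two inputs are indistinguishable with the sentinel...
        System.out.println(parseQuantitySentinel("0") == parseQuantitySentinel(""));
        // ...but remain distinct when absence is explicit.
        System.out.println(parseQuantity("0").equals(parseQuantity("")));
    }
}
```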

南薇 2024-10-05 15:15:53

Vector languages can sometimes get away with not having a null.

The empty vector serves as a typed null in this case.
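Java is not a vector language, but the idea can be approximated with collections: a possibly absent value becomes a list of zero or one elements, and the empty list keeps its element type. The `EmptyAsAbsent` class below is a hypothetical illustration of that reading, not a claim about how any particular vector language works.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class EmptyAsAbsent {
    // Works uniformly on empty and non-empty input; no special case for "absent".
    static int totalLength(List<String> names) {
        int total = 0;
        for (String n : names) {
            total += n.length();
        }
        return total;
    }

    public static void main(String[] args) {
        // Still a List<String>, so it is "typed" in the sense used above.
        List<String> noMiddleName = Collections.<String>emptyList();
        List<String> middleName = Arrays.asList("Quincy");

        System.out.println(totalLength(noMiddleName));
        System.out.println(totalLength(middleName));
    }
}
```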

沫尐诺 2024-10-05 15:15:52

I think the succinct summary of why null is undesirable is that meaningless states should not be representable.

Suppose I'm modeling a door. It can be in one of three states: open, shut but unlocked, and shut and locked. Now I could model it along the lines of

class Door
    private bool isShut
    private bool isLocked

and it is clear how to map my three states into these two boolean variables. But this leaves a fourth, undesired state available: isShut==false && isLocked==true. Because the types I have selected as my representation admit this state, I must expend mental effort to ensure that the class never gets into this state (perhaps by explicitly coding an invariant). In contrast, if I were using a language with algebraic data types or checked enumerations that lets me define

type DoorState =
    | Open | ShutAndUnlocked | ShutAndLocked

then I could define

class Door
    private DoorState state

and there are no more worries. The type system will ensure that there are only three possible states for an instance of class Door to be in. This is what type systems are good at - explicitly ruling out a whole class of errors at compile-time.
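The pseudocode above maps fairly directly onto a Java enum. The sketch below is an assumption-laden illustration: the `shut` and `lock` transitions are invented for the example, and the constant names follow Java conventions. Note that in Java the `state` field could still be null, which is exactly the extra, unwanted state this answer is about.

```java
final class Door {
    // The three meaningful states; "open and locked" cannot be expressed.
    enum DoorState { OPEN, SHUT_AND_UNLOCKED, SHUT_AND_LOCKED }

    private DoorState state = DoorState.OPEN;

    DoorState state() {
        return state;
    }

    // Hypothetical transitions for illustration.
    void shut() {
        if (state == DoorState.OPEN) {
            state = DoorState.SHUT_AND_UNLOCKED;
        }
    }

    void lock() {
        // Locking an open door is ignored, so no invalid state is ever produced.
        if (state == DoorState.SHUT_AND_UNLOCKED) {
            state = DoorState.SHUT_AND_LOCKED;
        }
    }

    public static void main(String[] args) {
        Door door = new Door();
        door.lock();
        System.out.println(door.state()); // OPEN
        door.shut();
        door.lock();
        System.out.println(door.state()); // SHUT_AND_LOCKED
    }
}
```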

The problem with null is that every reference type gets this extra state in its space that is typically undesired. A string variable could be any sequence of characters, or it could be this crazy extra null value that doesn't map into my problem domain. A Triangle object has three Points, which themselves have X and Y values, but unfortunately the Points or the Triangle itself might be this crazy null value that is meaningless to the graphing domain I'm working in. Etc.

When you do intend to model a possibly-non-existent value, then you should opt into it explicitly. If the way I intend to model people is that every Person has a FirstName and a LastName, but only some people have MiddleNames, then I would like to say something like

class Person
    private string FirstName
    private Option<string> MiddleName
    private string LastName

where string here is assumed to be a non-nullable type. Then there are no tricky invariants to establish and no unexpected NullReferenceExceptions when trying to compute the length of someone's name. The type system ensures that any code dealing with the MiddleName accounts for the possibility of it being None, whereas any code dealing with the FirstName can safely assume there is a value there.

So for example, using the type above, we could author this silly function:

let TotalNumCharsInPersonsName(p:Person) =
    let middleLen = match p.MiddleName with
                    | None -> 0
                    | Some(s) -> s.Length
    p.FirstName.Length + middleLen + p.LastName.Length

with no worries. In contrast, in a language with nullable references for types like string, then assuming

class Person
    private string FirstName
    private string MiddleName
    private string LastName

you end up authoring stuff like

let TotalNumCharsInPersonsName(p:Person) =
    p.FirstName.Length + p.MiddleName.Length + p.LastName.Length

which blows up if the incoming Person object does not have the invariant of everything being non-null, or

let TotalNumCharsInPersonsName(p:Person) =
    (if p.FirstName=null then 0 else p.FirstName.Length)
    + (if p.MiddleName=null then 0 else p.MiddleName.Length)
    + (if p.LastName=null then 0 else p.LastName.Length)

or maybe

let TotalNumCharsInPersonsName(p:Person) =
    p.FirstName.Length
    + (if p.MiddleName=null then 0 else p.MiddleName.Length)
    + p.LastName.Length

assuming that p ensures first/last are there but middle can be null, or maybe you do checks that throw different types of exceptions, or who knows what. All these crazy implementation choices and things to think about crop up because there's this stupid representable-value that you don't want or need.

Null typically adds needless complexity. Complexity is the enemy of all software, and you should strive to reduce complexity whenever reasonable.

(Note well that there is more complexity to even these simple examples. Even if a FirstName cannot be null, a string can represent "" (the empty string), which is probably also not a person name that we intend to model. As such, even with non-nullable strings, it still might be the case that we are "representing meaningless values". Again, you could choose to battle this either via invariants and conditional code at runtime, or by using the type system (e.g. to have a NonEmptyString type). The latter is perhaps ill-advised ("good" types are often "closed" over a set of common operations, and e.g. NonEmptyString is not closed over .Substring(0,0)), but it demonstrates more points in the design space. At the end of the day, in any given type system, there is some complexity it will be very good at getting rid of, and other complexity that is just intrinsically harder to get rid of. The key for this topic is that, across type systems, the change from "nullable references by default" to "non-nullable references by default" is nearly always a simple one that makes the type system a great deal better at battling complexity and ruling out certain types of errors and meaningless states. So it is pretty crazy that so many languages keep repeating this error again and again.)
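To make the NonEmptyString point concrete, here is a rough Scala sketch. The type and its `from` constructor are hypothetical, written only for illustration. It shows why the type is not closed over substring: the result may be empty, so the operation has to return an Option instead of another NonEmptyString.

```scala
// Hypothetical wrapper: the private constructor means instances can only
// be created through NonEmptyString.from, which rejects "".
final class NonEmptyString private (val value: String) {
  // Not closed over substring: the slice may be empty, so the return
  // type has to admit that no NonEmptyString results.
  def substring(begin: Int, end: Int): Option[NonEmptyString] =
    NonEmptyString.from(value.substring(begin, end))
}

object NonEmptyString {
  def from(s: String): Option[NonEmptyString] =
    if (s.isEmpty) None else Some(new NonEmptyString(s))
}
```

Here `NonEmptyString.from("")` yields None, and so does taking `substring(0, 0)` of any valid instance.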

青朷 2024-10-05 15:15:52

The nice thing about option types isn't that they're optional. It is that all other types aren't.

Sometimes, we need to be able to represent a kind of "null" state. Sometimes we have to represent a "no value" option as well as the other possible values a variable may take. So a language that flat out disallows this is going to be a bit crippled.

But often, we don't need it, and allowing such a "null" state only leads to ambiguity and confusion: every time I access a reference type variable in .NET, I have to consider that it might be null.

Often, it will never actually be null, because the programmer structures the code so that it can never happen. But the compiler can't verify that, and every single time you see it, you have to ask yourself "can this be null? Do I need to check for null here?"

Ideally, in the many cases where null doesn't make sense, it shouldn't be allowed.

That's tricky to achieve in .NET, where nearly everything can be null. You have to rely on the author of the code you're calling to be 100% disciplined and consistent and to have clearly documented what can and cannot be null, or you have to be paranoid and check everything.

However, if types aren't nullable by default, then you don't need to check whether or not they're null. You know they can never be null, because the compiler/type checker enforces that for you.

Then we just need a back door for the rare cases where we do need to handle a null state, and an "option" type can serve as that back door. We allow null only in the cases where we've made a conscious decision that we need to be able to represent the "no value" case, and in every other case, we know that the value will never be null.
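As a small Scala sketch of that opt-in (the function names are hypothetical): a function that needs a real value takes a plain String, and a caller holding an Option[String] cannot pass it through without deciding what the "no value" case means. Scala itself still allows null on String for Java interop, so treat this as an illustration of the idea, not a full guarantee.

```scala
// Hypothetical function that needs a real value.
def greet(name: String): String = s"Hello, $name"

// The caller holds a value that may be absent.
def greetNickname(nickname: Option[String]): String =
  // greet(nickname) would not compile: Option[String] is not a String.
  nickname.map(greet).getOrElse("Hello, stranger")
```

The type checker forces the absent case to be handled once, at the boundary, and `greet` never has to think about it.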

As others have mentioned, in C# or Java for example, null can mean one of two things:

  1. the variable is uninitialized. This should, ideally, never happen. A variable shouldn't exist unless it is initialized.
  2. the variable contains some "optional" data: it needs to be able to represent the case where there is no data. This is sometimes necessary. Perhaps you're trying to find an object in a list, and you don't know in advance whether or not it's there. Then we need to be able to represent that "no object was found".

The second meaning has to be preserved, but the first one should be eliminated entirely. And even the second meaning should not be the default. It's something we can opt in to if and when we need it. But when we don't need something to be optional, we want the type checker to guarantee that it will never be null.
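The "find an object in a list" case can be sketched in Scala as follows. The User type and the sample data are made up for illustration; `List.find` is the standard library method, and it returns an Option instead of null.

```scala
// Hypothetical record type for the lookup example.
final case class User(id: Int, name: String)

val users = List(User(1, "Ada"), User(2, "Grace"))

// List.find returns Option[User]: Some(user) on a hit, None otherwise.
def nameOf(id: Int): String =
  users.find(_.id == id) match {
    case Some(u) => u.name
    case None    => "(not found)"
  }
```

The "no object was found" outcome is part of the return type, so a caller cannot forget that it exists.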
