SQL Server 2008 - 默认列值 - 我应该使用 null 还是空字符串?
一段时间以来,我一直在争论是否应该留下我不知道是否会传入数据的列,并将值设置为空字符串('')或只允许为空。
我想听听这里推荐的做法是什么。
如果有什么不同,我将使用 C# 作为消费应用程序。
For some time i'm debating if i should leave columns which i don't know if data will be passed in and set the value to empty string ('') or just allow null.
i would like to hear what is the recommended practice here.
if it makes a difference, i'm using c# as the consuming application.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
恐怕……
这要看情况!
这个问题没有单一的答案。
正如其他回复中所指出的,在 SQL 级别,NULL 和空字符串的语义截然不同,前者表示该值未知,后者表示该值这是“看不见的东西”(在显示和报告中),但它仍然是“已知值”。在这种情况下,常见的一个例子是中间名。 “middle_name”列中的空值表示我们不知道底层人员是否有中间名,如果有的话,这个名字是什么,空字符串表示我们“知道”这个人< em>没有有中间名。
也就是说,对于给定的列,另外两种因素可能会帮助您在这些选项之间进行选择。
数据语义
例如,了解空字符串是否是基础数据的有效值非常重要。如果是这种情况,如果我们还使用空字符串表示“未知信息”,则可能会丢失信息。另一个考虑因素是,当我们没有该列的信息时,是否可以使用某些替代值;也许“n/a”或“未指定”或“tbd”是更好的值。
SQL 行为和实用程序
考虑到 SQL 行为,使用或不使用 NULL 的选择可能是由空间考虑、创建过滤索引的愿望或 COALESCE() 函数的便利性(可以用 CASE 语句模拟,但以更详细的方式)。另一个考虑因素是任何查询是否可能尝试查询多个列以附加它们(如 SELECT name + ', ' + middle_name AS LongName 等)。
除了 NULL 与空字符串选择的有效性之外,在给定情况下,一般考虑尝试并尽可能保持一致,即尝试并坚持一种特定方式,并且仅/有意/明确地偏离此方式出于充分的理由并且在少数情况下这样做。
I'm afraid that...
it depends!
There is no single answer to this question.
As indicated in other responses, at the level of SQL, NULL and empty string have very different semantics, the former indicating that the value is unknown, the latter indicating that the value is this "invisible thing" (in displays and report), but none the less it a "known value". A example commonly given in this context is that of the middle name. A null value in the "middle_name" column would indicate that we do not know whether the underlying person has a middle name or not, and if so what this name is, an empty string would indicate that we "know" that this person does not have a middle name.
This said, two other kinds of factors may help you choose between these options, for a given column.
Data semantics
For example it is important to know if the empty-string is a valid value for the underlying data. If that is the case, we may loose information if we also use empty string for "unknown info". Another consideration is whether some alternate value may be used in the case when we do not have info for the column; Maybe 'n/a' or 'unspecified' or 'tbd' are better values.
SQL behavior and utilities
Considering SQL behavior, the choice of using or not using NULL, may be driven by space consideration, by the desire to create a filtered index, or also by the convenience of the COALESCE() function (which can be emulated with CASE statements, but in a more verbose fashion). Another consideration is whether any query may attempt to query multiple columns to append them (as in SELECT name + ', ' + middle_name AS LongName etc.).
Beyond the validity of the choice of NULL vs. empty string, in given situation, a general consideration it to try and be as consistent as possible, i.e. to try and stick to ONE particular way, and to only/purposely/explicitly depart from this way for good reasons and in few cases.
如果没有值,请勿使用空字符串。如果您需要知道某个值是否未知,请为其设置一个标志。但十有八九,如果不提供信息,那就是未知的,但这没关系。
Don't use empty string if there is no value. If you need to know if a value is unknown, have a flag for it. But 9 times out of 10, if the information is not provided, it's unknown, and that's fine.
NULL
表示未知值。空字符串表示已知值 - 长度为零的字符串。这些是完全不同的事情。NULL
means unknown value. An empty string means a known value - a string with length zero. These are totally different things.当我想要一个可能会或可能不会更改的有效默认值(例如用户的中间名)时,为空。
如果后续代码未显式设置该值,则为 NULL。
但是,通过使用 Empty 值而不是 null 初始化字符串,可以减少发生 NullReferenceException 的机会。
empty when I want a valid default value that may or may not be changed, for example, a user's middle name.
NULL when it is an error if the ensuing code does not set the value explicitly.
However, By initializing strings with the Empty value instead of null, you can reduce the chances of a NullReferenceException occurring.
抛开理论不谈,我倾向于将
在这种情况下,我可能会使用 NULL。
一件重要的事情是要保持一致:混合 NULL 和空字符串将会以泪水告终。
在实际实现级别上,空字符串在 SQL Server 中占用 2 个字节,其中 NULL 是位图的。在某些情况下,对于宽/较大的表,它会产生不同的性能,因为需要移动更多数据。
Theory aside, I tend to view:
In this case, I'd probably use NULL.
One important thing is to be consistent: mixing NULLs and empty strings will end in tears.
On a practical implementation level, empty string takes 2 bytes in SQL Server where as NULLs are bitmapped. In some conditions and for wide/larger tables it makes a different in performance because it's more data to shift around.