在设计数据库时,存储多个真/假值的首选方式是什么?
如标题中所述,在设计数据库时,处理具有多个列的表的首选方法是什么,这些列仅将真/假值存储为单个或值(例如“Y/N:或“0/1”) )?同样,不同数据库(例如 Oracle 和 SQL Server)之间是否可能出现一些问题,这些问题可能会影响列的处理方式?
As stated in the title, when designing databases, what is the preferred way to handle tables that have multiple columns that are just storing true / false values as just a single either or value (e.g. "Y/N: or "0/1")? Likewise, are there some issues that might arise between different databases (e.g. Oracle and SQL Server) that might affect how the columns are handled?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(10)
在
SQL Server
中,有BIT
数据类型。您可以在那里存储 0 或 1,比较这些值,但不运行MIN
或MAX
。在
Oracle
中,您只需使用NUMBER
或CHAR(1)
。在
MySQL
和PostgreSQL
中,任何数据类型都可以隐式转换为BOOLEAN
。两个系统都支持
BOOLEAN
数据类型,您可以在WHERE
或ON
子句中按原样使用该数据类型,无需运算符:,这在
WHERE
或ON
子句中是不可能的code>SQL Server 和Oracle
(您需要在那里有某种类型或谓词)。在
MySQL
中,BOOLEAN
是TINYINT(1)
的同义词。在 PostgreSQL 中也是如此(就存储而言),但从逻辑上讲,它不能隐式转换为任何其他类型。
In
SQL Server
, there isBIT
datatype. You can store 0 or 1 there, compare the values but not runMIN
orMAX
.In
Oracle
, you just useNUMBER
orCHAR(1)
.In
MySQL
andPostgreSQL
any datatype is implicitly convertible toBOOLEAN
.Both systems support
BOOLEAN
datatype which you can use as is, without the operators, in theWHERE
orON
clauses:, which is impossible in
SQL Server
andOracle
(you need to have some kind or a predicate there).In
MySQL
,BOOLEAN
is a synonym forTINYINT(1)
.In
PostgreSQL
too (in terms of storage), but logically, it's not implicitly convertible to any other type.根据我自己的经验,我更喜欢用 char(1) 来表示“Y”或“N”。使用 0 和 1 可能会有点混乱,具体取决于我已经喝了多少啤酒,并且 C++ main() 函数成功时返回 0。 ENUM 和 BIT 类型带来的麻烦超过了它们的价值。
有趣的是,MySQL
information_schema
使用 VARCHAR(3) 表示“YES”或“NO”。例子:
From my own experience, I prefer char(1) for 'Y' or 'N'. Using 0 and 1 can be a bit confusing depending on how many beers I already drank and C++ main() function return 0 on success. ENUM and BIT types are more trouble than they are worth.
It is interesting to note that MySQL
information_schema
uses VARCHAR(3) for 'YES' or 'NO'.Example:
您可能需要考虑另一种数据模型来存储布尔值,而不是布尔数据类型,这在以下情况下可能特别合适:
定义用户权限可能是上述的典型示例。请考虑下表:
在
Permissions
表中,您将定义可能适用于用户的所有可能权限。您必须为每个是/否属性在Permissions
表中添加一行。您可能已经注意到,这使得将来添加新权限变得非常容易,而无需修改数据库架构。对于上述模型,您可以通过在
Users_Permissions
表中分配user_id
和permission_id
来指示 TRUE 值。否则默认为 FALSE。例如:
优点
Permissions
和Users_Permissions
表中。)您可以轻松存储有关每个事实的更多信息。缺点
Users_Permissions
表中)。否则,您可以在Users_Permissions
表,还允许您存储审计跟踪信息,例如权限何时修改以及由谁修改。如果删除该行,您将无法存储此信息。Instead of boolean datatypes, you may want to consider another data model for storing boolean values, which may be particularly appropriate in the following cases:
Defining user permissions may be a typical example of the above. Consider the following tables:
In the
Permissions
table you would define all the possible permissions that may be applicable to users. You would have to add a row to thePermissions
table for each yes/no attribute. As you may have noticed, this makes it very easy to add new permissions in the future without having to modify the database schema.With the above model, you would then indicate a TRUE value by assigning a
user_id
with apermission_id
in theUsers_Permissions
table. Otherwise it would be FALSE by default.For example:
Advantages
Permissions
andUsers_Permissions
tables.) You can easily store more information on each fact.Disadvantages
Users_Permissions
table.) Otherwise you can use a 'deleted' flag in theUsers_Permissions
table, which would also allow you to store information for audit trails, such as when a permission was modified and by whom. If you delete the row, you would not be able to store this information.使用对您正在使用的特定数据库引擎有意义的任何内容。这是需要处理它的数据库接口。如果数据库的代码端接口足够模块化,那么只需进行简单的一行更改即可处理底层数据库中的不同布尔类型。
Use whatever makes sense for the specific database engine you're using. It's the interface to the database that needs to handle it. If the code-side interface to the database is sufficiently modularized, then it will be nothing more than a simple one-line change to handle a different boolean type in the underlying database.
我认为“Y/N”值比“1/0”更有意义。对于 Oracle,我将执行以下操作,以便数据库引擎尽可能多地验证数据:
值限制为“in ('Y',
'N')
使它们不可为空 - 这可以
当你隐式地预防问题时
假设任何不是“Y”的东西
SQL 中的值为“N”
I think "Y/N" values are more meaningful than "1/0". With Oracle, I would do the following to have the data validated as much as possible by the database engine:
values are restricted to "in ('Y',
'N')
make them not nullable - this can
prevent problems when you implicitly
assume that anything that is not 'Y'
has a value of 'N' in your SQL
如果您的 DBMS 支持布尔数据类型(例如 MySQL),请使用它。如果没有,像 Oracle 一样,我通常使用值为 Y 或 N 的 char(1)。在后一种情况下,最好编写几个函数来将 Java 或 C++ 或任何布尔类型转换为并且从 Y/N 开始,这样您就可以避免使用冗余代码来执行此操作。这是一个非常简单的函数,但它必须处理空值或 Y 或 N 以外的值等情况,并且您希望始终如一地执行此操作。
我绝对不会将标志打包到具有位操作的单个变量中。是的,这会节省一些磁盘空间,但代价是复杂性和出错的机会更大。如果你的 DBMS 不支持位操作——而且我从来没有想过做这样的事情,我根本不知道哪一个(如果有的话)支持——那么你就会有基于这样的标志进行选择或排序确实很困难。当然,您可以检索满足其他条件的所有记录,然后让调用代码清除具有正确标志值的记录。但是,如果只有一小部分记录具有所需的标志值,并且您有一个连接许多其他记录的查询,该怎么办?就像“使用(employee_id)从员工加入工资中选择employee.name,sum(pay.amount),其中employee.executive = true和pay.bonus = true”。使用 where 子句,您可能会检索非常少量的记录。如果没有它,您将检索整个数据库。
如今磁盘空间很便宜,因此节省任何磁盘空间可能都不重要。如果你真的有大量的标志——比如每条记录有数百或数千个——我想可能有必要打包它们。但这将是我的设计选择列表中的一个。
编辑:让我详细说明如何编写一个类来将“SQL 布尔值”转换为“Java 布尔值”。这同样适用于任何语言,但我将使用 Java 作为示例。
如果您的 DBMS 有内置布尔类型,那么使用 Java,您可以使用 ResultSet.getBoolean() 直接将其读入布尔变量。
但是,如果您必须将其存储为字符“Y”或“N”,那么您必须将其读入字符串。因此,对我来说,声明这样的类是有意义的:
然后,您可以使用“MyBoolean flag=new MyBoolean(rs.getString("flag"));”轻松地从数据库中获取布尔值。并使用“rs.setString(“flag”, flag.toString());”写入数据库
当然,如果您需要执行其他布尔操作,则可以将所需的任何其他逻辑添加到类中。如果出于某些目的,您想要将布尔值显示为 T/F 或 Yes/No 或 On/Off 等,您可以添加备用 toString 变体 -- toTFString 或 toString(value,truetext,falsetext) 或其他 -- 而不是编写类似的代码一遍又一遍。
If your DBMS supports a boolean data type, like MySQL, use it. If it doesn't, like Oracle, I typically use a char(1) with values of Y or N. In the latter case, it's a good idea to write a couple of functions to convert your Java or C++ or whatever boolean type to and from Y/N so you avoid having redundant code to do this. It's a pretty trivial function, but it will have to deal with cases like nulls or values other than Y or N and you want to do that consistently.
I would definitely NOT pack flags into a single variable with bit operations. Yes, this will save some disk space, but the price is much greater complexity and opportunities for error. If your DBMS doesn't support bit operations -- and as I've never had a desire to do such a thing, I don't know off the top of my head which, if any, do -- then you'll have a real hard time selecting or sorting based on such a flag. Sure, you could retrieve all records meeting other criteria and then have the calling code weed out the ones with the proper flag value. But what if only a small percentage of the records have the desired flag value, and you have a query that joins many other records? Like "select employee.name, sum(pay.amount) from employee join pay using (employee_id) where employee.executive=true and pay.bonus=true". With the where clause, you probably retrieve a very modest number of records. Without it, you retrieve the entire database.
Disk space is cheap these days, so any disk savings would likely be unimportant. If you really have a huge number of flags -- like hundreds or thousands of them per record -- I suppose there might be a case for packing them. But that would be way down my list of design choices.
Edit: Let me elaborate on writing a class to convert your "SQL boolean" to a "Java boolean". The same applies to any language, but I'll use Java as the example.
If your DBMS has a built-in boolean type, then with Java you can read this directly into a boolean variable with ResultSet.getBoolean().
But if you have to store it as, say, a char "Y" or "N", then you have to read it into a string. So it makes sense to me to declare a class like this:
Then you can easily pick up booleans from the database with "MyBoolean flag=new MyBoolean(rs.getString("flag"));" and write to the database with "rs.setString("flag", flag.toString());"
And of course you can then add whatever other logic you need to the class if you have other boolean stuff you need to do. If for some purposes you want to display booleans as T/F or Yes/No or On/Off or whatever, you can just add alternate toString variants -- toTFString or toString(value,truetext,falsetext) or whatever -- rather than writing similar code over and over.
我建议您创建另一个表,而不是添加列。听我说...
假设您有一个名为
Customer
的表:现在,假设您想要指示是否允许客户显示在搜索结果中。一种选择是添加一些代表两种可能状态之一的列:
您的搜索查询将如下所示:
这很好而且简单。这个线程上的许多人都给出了关于选择该列应该采用哪种数据类型的很好的建议,以便使语法在各种数据库中很好地发挥作用。
替代方案:创建一个单独的表
我将创建一个单独的表来存储每个可搜索客户的
CustomerID
,而不是向Customer
添加另一列。在这种情况下,如果客户的
CustomerID
存在于SearchableCustomer
表中,则该客户被视为可搜索。搜索客户的查询现在变为:您将看到该策略在 RDBMS 之间非常可移植:
IN
子句或JOIN
INSERT
语句DELETE
语句使客户不可搜索 意外的好处
如果您使 < code>SearchableCustomer 视图而不是表:
您的搜索查询根本不会改变! :D 根据我的经验,这带来了巨大的灵活性。
Instead of adding a column, I suggest you create another table. Hear me out...
Suppose you have a table named
Customer
:Now, suppose you want to indicate whether or not a customer is allowed to show up in search results. One option is to add some column that represents one of two possible states:
Your search query will look something like this:
This is nice and simple. Many people on this thread are giving good advice for picking which data type this column should be in order to make the syntax play nicely across various databases.
Alternative: Creating a separate table
Instead of adding another column to
Customer
, I will create a separate table that stores theCustomerID
of every searchable customer.In this case, a customer is considered searchable if their
CustomerID
exists in theSearchableCustomer
table. The query for searching for customers now becomes:You will see that this strategy is very transferable across RDBMSs:
IN
clause or aJOIN
INSERT
statementDELETE
statementA Surprise Benefit
You are free to make the definition of a searchable customer as elaborate as you want if you make
SearchableCustomer
a view instead of a table:Your search query doesn't change at all! :D In my experience, this has led to tremendous flexibility.
位列通常用于表示 T/F 或 Y/N 类型值,至少在 SQL Server 中是这样。尽管数据库纯粹主义者可能会告诉您,位列在数据库中没有位置,因为它们“太接近硬件” - Joe Celko。
Bit columns are typically used to represent T/F or Y/N type values, at least in SQL Server. Although a database purist might tell you that Bit columns have no place in databases because they are "too close to the hardware" - Joe Celko.
“选择 *
来自我的表
WHERE col1
,这在 SQL Server 和 Oracle 中是不可能的(你需要在那里有某种类型或谓词)。”
这只会表明 Oracle 和 SQL Server 确实是多么荒唐可笑的可憎行为。
如果 col1 被声明为类型 BOOLEAN,则表达式“col1”IS
如果 WHERE 子句的语义要求其表达式仅求值为真值,并且某些列被声明为“类型”。真值”,那么“该列在哪里”应该被允许和支持。时期。任何不只是暴露其作者无能的平庸庸医的系统。
"SELECT *
FROM mytable
WHERE col1
, which is impossible in SQL Server and Oracle (you need to have some kind or a predicate there)."
Which goes only to show what a ludicrous and laughable abomination Oracle and SQL server really are.
If col1 is declared to be of type BOOLEAN, then the expression "col1" IS a predicate.
If the semantics of the WHERE clause require that its expression just evaluates to a truth value, and if some column is declared to be of the type "truth value", then "WHERE that-column" should be both allowed and supported. Period. Any system that doesn't just exposes its authors for the incompetent mediocre quacks that they are.
我通常会在没有 BIT/BOOLEAN 值的情况下执行此操作。相反,我有三张桌子。假设我们有一个包含项目的项目管理系统,并且这些项目具有一大堆属性。
然后我们得到了表格:
项目的属性是 true 还是 false 取决于 ProjectAttribute_Rel 中是否有对应的行。
通常,您会在代码中处理 Attribute_ID,因此当您读取项目的属性(您可能有 Project_ID)时,您只需执行以下操作(任意使用 PHP 作为示例):
此时,您可以检查通过检查 $arrAttributes 中是否存在该项目的属性是否为 true。在 PHP 中,这将是:
这种方法还允许您执行各种特技,以避免在更新时、选择时(因为 SELECT * 在代码中不好)、插入时等时列出过多的属性。这是因为您始终可以循环遍历表属性来查找可用的属性,因此如果您添加一个属性并以这种方式执行操作,则添加/编辑/删除属性是微不足道的。您的 SQL 很可能甚至不需要更改,因为属性本身是在数据库中定义的,而不是在代码中定义的。
希望这有帮助。
I would typically do this without BIT/BOOLEAN values at all. Instead I'd have three tables. Let's say we have a project management system which has projects, and these projects have a whole bunch of attributes.
Then we have the tables:
Whether a project's attribute is true or false depends on whether there's a line for it in ProjectAttribute_Rel.
Typically, you'd be dealing with Attribute_IDs in your code, so when you read the attributes of the project (where you presumably have the Project_ID), you just do (PHP arbitrarily used as an example):
At this point, you can check whether the project's attribute is true by checking whether it exists in $arrAttributes at all. In PHP, this would be:
This approach also allows you to do all sorts stunts to avoid listing a plethora of attributes when you update, again when you select (because SELECT * is bad in code), when you insert and so forth. This is because you can always loop through the table Attribute to find the available attributes, so if you add one and you do things this way, adding/editing/deleting attributes is trivial. Odds are your SQL won't even need to be changed, because the attributes themselves are defined in database, not in code.
Hope this helps.