在设计数据库时,存储多个真/假值的首选方式是什么?

发布于 2024-08-20 10:08:51 字数 135 浏览 8 评论 0原文

如标题中所述,在设计数据库时,处理具有多个列的表的首选方法是什么,这些列仅将真/假值存储为单个或值(例如“Y/N:或“0/1”) )?同样,不同数据库(例如 Oracle 和 SQL Server)之间是否可能出现一些问题,这些问题可能会影响列的处理方式?

As stated in the title, when designing databases, what is the preferred way to handle tables that have multiple columns that are just storing true / false values as just a single either or value (e.g. "Y/N: or "0/1")? Likewise, are there some issues that might arise between different databases (e.g. Oracle and SQL Server) that might affect how the columns are handled?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(10

倾城月光淡如水﹏ 2024-08-27 10:08:51

SQL Server中,有BIT数据类型。您可以在那里存储 0 或 1,比较这些值,但不运行 MINMAX

Oracle 中,您只需使用 NUMBERCHAR(1)

MySQLPostgreSQL 中,任何数据类型都可以隐式转换为 BOOLEAN

两个系统都支持 BOOLEAN 数据类型,您可以在 WHEREON 子句中按原样使用该数据类型,无需运算符:

SELECT  *
FROM    mytable
WHERE   col1

,这在 WHEREON 子句中是不可能的code>SQL Server 和 Oracle(您需要在那里有某种类型或谓词)。

MySQL 中,BOOLEANTINYINT(1) 的同义词。

在 PostgreSQL 中也是如此(就存储而言),但从逻辑上讲,它不能隐式转换为任何其他类型。

In SQL Server, there is BIT datatype. You can store 0 or 1 there, compare the values but not run MIN or MAX.

In Oracle, you just use NUMBER or CHAR(1).

In MySQL and PostgreSQL any datatype is implicitly convertible to BOOLEAN.

Both systems support BOOLEAN datatype which you can use as is, without the operators, in the WHERE or ON clauses:

SELECT  *
FROM    mytable
WHERE   col1

, which is impossible in SQL Server and Oracle (you need to have some kind or a predicate there).

In MySQL, BOOLEAN is a synonym for TINYINT(1).

In PostgreSQL too (in terms of storage), but logically, it's not implicitly convertible to any other type.

止于盛夏 2024-08-27 10:08:51

根据我自己的经验,我更喜欢用 char(1) 来表示“Y”或“N”。使用 0 和 1 可能会有点混乱,具体取决于我已经喝了多少啤酒,并且 C++ main() 函数成功时返回 0。 ENUM 和 BIT 类型带来的麻烦超过了它们的价值。

有趣的是,MySQL information_schema 使用 VARCHAR(3) 表示“YES”或“NO”。

例子:

information_schema.USER_PRIVILEGES (
  ...
  IS_GRANTABLE VARCHAR(3) NOT NULL DEFAULT ''
) 

From my own experience, I prefer char(1) for 'Y' or 'N'. Using 0 and 1 can be a bit confusing depending on how many beers I already drank and C++ main() function return 0 on success. ENUM and BIT types are more trouble than they are worth.

It is interesting to note that MySQL information_schema uses VARCHAR(3) for 'YES' or 'NO'.

Example:

information_schema.USER_PRIVILEGES (
  ...
  IS_GRANTABLE VARCHAR(3) NOT NULL DEFAULT ''
) 
诺曦 2024-08-27 10:08:51

您可能需要考虑另一种数据模型来存储布尔值,而不是布尔数据类型,这在以下情况下可能特别合适:

  • 当您将拥有许多是/否列时。
  • 当您将来可能需要添加更多是/否列时。
  • 当是/否值不经常改变时。

定义用户权限可能是上述的典型示例。请考虑下表:

Table "Users":             (user_id, name, surname, country)

Table "Permissions":       (permission_id, permission_text)

Table "Users_Permissions": (user_id, permission_id)

Permissions 表中,您将定义可能适用于用户的所有可能权限。您必须为每个是/否属性在Permissions 表中添加一行。您可能已经注意到,这使得将来添加新权限变得非常容易,而无需修改数据库架构。

对于上述模型,您可以通过在 Users_Permissions 表中分配 user_idpermission_id 来指示 TRUE 值。否则默认为 FALSE。

例如:

Table "Permissions"

permission_id   text
-----------------------------------
1               "Read Questions"
2               "Answer Questions"
3               "Edit Questions"
4               "Close Questions"


Table "Users_Permissions"

user_id         permission_id
-----------------------------------
1               1
1               2
1               3
2               1
2               3

优点

  • 索引:您可以轻松地使用索引来查询特定事实。
  • 空间:当您有许多错误值时,默认约定可以节省空间。
  • 标准化:事实在其自己的表中定义(在PermissionsUsers_Permissions 表中。)您可以轻松存储有关每个事实的更多信息。

缺点

  • 查询:简单查询需要 JOIN。
  • 设置为 False:要将值设置为 false,您必须删除一行(从 Users_Permissions 表中)。否则,您可以在Users_Permissions 表,还允许您存储审计跟踪信息,例如权限何时修改以及由谁修改。如果删除该行,您将无法存储此信息。

Instead of boolean datatypes, you may want to consider another data model for storing boolean values, which may be particularly appropriate in the following cases:

  • when you will be having many yes/no columns.
  • when you probably require adding more yes/no columns in the future.
  • when the yes/no values do not change very frequently.

Defining user permissions may be a typical example of the above. Consider the following tables:

Table "Users":             (user_id, name, surname, country)

Table "Permissions":       (permission_id, permission_text)

Table "Users_Permissions": (user_id, permission_id)

In the Permissions table you would define all the possible permissions that may be applicable to users. You would have to add a row to the Permissions table for each yes/no attribute. As you may have noticed, this makes it very easy to add new permissions in the future without having to modify the database schema.

With the above model, you would then indicate a TRUE value by assigning a user_id with a permission_id in the Users_Permissions table. Otherwise it would be FALSE by default.

For example:

Table "Permissions"

permission_id   text
-----------------------------------
1               "Read Questions"
2               "Answer Questions"
3               "Edit Questions"
4               "Close Questions"


Table "Users_Permissions"

user_id         permission_id
-----------------------------------
1               1
1               2
1               3
2               1
2               3

Advantages

  • Indexing: you can easily use an index on to query for specific facts.
  • Space: The default convention saves space when you have many false values.
  • Normalized: Facts are defined in their own tables (in the Permissions and Users_Permissions tables.) You can easily store more information on each fact.

Disadvantages

  • Queries: Simple queries would require JOINs.
  • Setting to False: To set a value to false, you would have to delete a row (from the Users_Permissions table.) Otherwise you can use a 'deleted' flag in the Users_Permissions table, which would also allow you to store information for audit trails, such as when a permission was modified and by whom. If you delete the row, you would not be able to store this information.
晒暮凉 2024-08-27 10:08:51

使用对您正在使用的特定数据库引擎有意义的任何内容。这是需要处理它的数据库接口。如果数据库的代码端接口足够模块化,那么只需进行简单的一行更改即可处理底层数据库中的不同布尔类型。

Use whatever makes sense for the specific database engine you're using. It's the interface to the database that needs to handle it. If the code-side interface to the database is sufficiently modularized, then it will be nothing more than a simple one-line change to handle a different boolean type in the underlying database.

幸福不弃 2024-08-27 10:08:51

我认为“Y/N”值比“1/0”更有意义。对于 Oracle,我将执行以下操作,以便数据库引擎尽可能多地验证数据:

  • 将列定义为 char(1)
  • 添加可能的检查约束
    值限制为“in ('Y',
    'N')
  • 如果符合业务规则,
    使它们不可为空 - 这可以
    当你隐式地预防问题时
    假设任何不是“Y”的东西
    SQL 中的值为“N”

I think "Y/N" values are more meaningful than "1/0". With Oracle, I would do the following to have the data validated as much as possible by the database engine:

  • define the columns as char(1)
  • add a check constraint that possible
    values are restricted to "in ('Y',
    'N')
  • if consistent with business rules,
    make them not nullable - this can
    prevent problems when you implicitly
    assume that anything that is not 'Y'
    has a value of 'N' in your SQL
情绪失控 2024-08-27 10:08:51

如果您的 DBMS 支持布尔数据类型(例如 MySQL),请使用它。如果没有,像 Oracle 一样,我通常使用值为 Y 或 N 的 char(1)。在后一种情况下,最好编写几个函数来将 Java 或 C++ 或任何布尔类型转换为并且从 Y/N 开始,这样您就可以避免使用冗余代码来执行此操作。这是一个非常简单的函数,但它必须处理空值或 Y 或 N 以外的值等情况,并且您希望始终如一地执行此操作。

我绝对不会将标志打包到具有位操作的单个变量中。是的,这会节省一些磁盘空间,但代价是复杂性和出错的机会更大。如果你的 DBMS 不支持位操作——而且我从来没有想过做这样的事情,我根本不知道哪一个(如果有的话)支持——那么你就会有基于这样的标志进行选择或排序确实很困难。当然,您可以检索满足其他条件的所有记录,然后让调用代码清除具有正确标志值的记录。但是,如果只有一小部分记录具有所需的标志值,并且您有一个连接许多其他记录的查询,该怎么办?就像“使用(employee_id)从员工加入工资中选择employee.name,sum(pay.amount),其中employee.executive = true和pay.bonus = true”。使用 where 子句,您可能会检索非常少量的记录。如果没有它,您将检索整个数据库。

如今磁盘空间很便宜,因此节省任何磁盘空间可能都不重要。如果你真的有大量的标志——比如每条记录有数百或数千个——我想可能有必要打包它们。但这将是我的设计选择列表中的一个。

编辑:让我详细说明如何编写一个类来将“SQL 布尔值”转换为“Java 布尔值”。这同样适用于任何语言,但我将使用 Java 作为示例。

如果您的 DBMS 有内置布尔类型,那么使用 Java,您可以使用 ResultSet.getBoolean() 直接将其读入布尔变量。

但是,如果您必须将其存储为字符“Y”或“N”,那么您必须将其读入字符串。因此,对我来说,声明这样的类是有意义的:

class MyBoolean
{
  boolean value;
  final static MyBoolean TRUE=new MyBoolean(true), FALSE=new MyBoolean(false);
  public MyBoolean(boolean b)
  {
    value=b;
  }
  public MyBoolean(String s)
  {
    if (s==null)
      return null;
    else if (s.equals("Y"))
      return MyBoolean.TRUE;
    else
      return MyBoolean.FALSE;
  }
  public static String toString(MyBoolean b)
  {
    if (b==null)
      return null;
    else if (b.value)
      return "Y";
    else
      reutrn "N";
  }
  public String toString()
  {
    return toString(this);
  }
}

然后,您可以使用“MyBoolean flag=new MyBoolean(rs.getString("flag"));”轻松地从数据库中获取布尔值。并使用“rs.setString(“flag”, flag.toString());”写入数据库

当然,如果您需要执行其他布尔操作,则可以将所需的任何其他逻辑添加到类中。如果出于某些目的,您想要将布尔值显示为 T/F 或 Yes/No 或 On/Off 等,您可以添加备用 toString 变体 -- toTFString 或 toString(value,truetext,falsetext) 或其他 -- 而不是编写类似的代码一遍又一遍。

If your DBMS supports a boolean data type, like MySQL, use it. If it doesn't, like Oracle, I typically use a char(1) with values of Y or N. In the latter case, it's a good idea to write a couple of functions to convert your Java or C++ or whatever boolean type to and from Y/N so you avoid having redundant code to do this. It's a pretty trivial function, but it will have to deal with cases like nulls or values other than Y or N and you want to do that consistently.

I would definitely NOT pack flags into a single variable with bit operations. Yes, this will save some disk space, but the price is much greater complexity and opportunities for error. If your DBMS doesn't support bit operations -- and as I've never had a desire to do such a thing, I don't know off the top of my head which, if any, do -- then you'll have a real hard time selecting or sorting based on such a flag. Sure, you could retrieve all records meeting other criteria and then have the calling code weed out the ones with the proper flag value. But what if only a small percentage of the records have the desired flag value, and you have a query that joins many other records? Like "select employee.name, sum(pay.amount) from employee join pay using (employee_id) where employee.executive=true and pay.bonus=true". With the where clause, you probably retrieve a very modest number of records. Without it, you retrieve the entire database.

Disk space is cheap these days, so any disk savings would likely be unimportant. If you really have a huge number of flags -- like hundreds or thousands of them per record -- I suppose there might be a case for packing them. But that would be way down my list of design choices.

Edit: Let me elaborate on writing a class to convert your "SQL boolean" to a "Java boolean". The same applies to any language, but I'll use Java as the example.

If your DBMS has a built-in boolean type, then with Java you can read this directly into a boolean variable with ResultSet.getBoolean().

But if you have to store it as, say, a char "Y" or "N", then you have to read it into a string. So it makes sense to me to declare a class like this:

class MyBoolean
{
  boolean value;
  final static MyBoolean TRUE=new MyBoolean(true), FALSE=new MyBoolean(false);
  public MyBoolean(boolean b)
  {
    value=b;
  }
  public MyBoolean(String s)
  {
    if (s==null)
      return null;
    else if (s.equals("Y"))
      return MyBoolean.TRUE;
    else
      return MyBoolean.FALSE;
  }
  public static String toString(MyBoolean b)
  {
    if (b==null)
      return null;
    else if (b.value)
      return "Y";
    else
      reutrn "N";
  }
  public String toString()
  {
    return toString(this);
  }
}

Then you can easily pick up booleans from the database with "MyBoolean flag=new MyBoolean(rs.getString("flag"));" and write to the database with "rs.setString("flag", flag.toString());"

And of course you can then add whatever other logic you need to the class if you have other boolean stuff you need to do. If for some purposes you want to display booleans as T/F or Yes/No or On/Off or whatever, you can just add alternate toString variants -- toTFString or toString(value,truetext,falsetext) or whatever -- rather than writing similar code over and over.

美男兮 2024-08-27 10:08:51

我建议您创建另一个表,而不是添加列。听我说...

假设您有一个名为 Customer 的表:

CREATE TABLE Customer
(
  CustomerID NUMBER,
  Name       VARCHAR(100)
)

现在,假设您想要指示是否允许客户显示在搜索结果中。一种选择是添加一些代表两种可能状态之一的列:

CREATE TABLE Customer
(
  CustomerID NUMBER,
  Name       VARCHAR(100),
  Searchable BOOLEAN /* or CHAR(1) or BIT... */
)

您的搜索查询将如下所示:

SELECT CustomerID, Name
  FROM Customer
 WHERE Name LIKE '%TheCustomerNameIAmLookingFor%'
   AND Searchable = TRUE /* or 'Y' or 0... */

这很好而且简单。这个线程上的许多人都给出了关于选择该列应该采用哪种数据类型的很好的建议,以便使语法在各种数据库中很好地发挥作用。

替代方案:创建一个单独的表

我将创建一个单独的表来存储每个可搜索客户的 CustomerID,而不是向 Customer 添加另一列。

CREATE TABLE Customer
(
  CustomerID NUMBER,
  Name       VARCHAR(100)
)

CREATE TABLE SearchableCustomer
(
  CustomerID NUMBER
)

在这种情况下,如果客户的 CustomerID 存在于 SearchableCustomer 表中,则该客户被视为可搜索。搜索客户的查询现在变为:

SELECT CustomerID, Name
  FROM Customer
 WHERE Name LIKE '%TheCustomerNameIAmLookingFor%'
   AND CustomerID IN (SELECT CustomerID FROM SearchableCustomer)

您将看到该策略在 RDBMS 之间非常可移植:

  • 查找可搜索客户使用 IN 子句或 JOIN
  • 使客户可搜索使用INSERT 语句
  • 使用 DELETE 语句使客户不​​可搜索 意外的

好处

如果您使 < code>SearchableCustomer 视图而不是表

CREATE VIEW SearchableCustomer AS
SELECT CustomerID
  FROM Customer
 WHERE Name LIKE 'S%' /* For some reason, management only cares about customers whose name starts with 'S' */

您的搜索查询根本不会改变! :D 根据我的经验,这带来了巨大的灵活性。

Instead of adding a column, I suggest you create another table. Hear me out...

Suppose you have a table named Customer:

CREATE TABLE Customer
(
  CustomerID NUMBER,
  Name       VARCHAR(100)
)

Now, suppose you want to indicate whether or not a customer is allowed to show up in search results. One option is to add some column that represents one of two possible states:

CREATE TABLE Customer
(
  CustomerID NUMBER,
  Name       VARCHAR(100),
  Searchable BOOLEAN /* or CHAR(1) or BIT... */
)

Your search query will look something like this:

SELECT CustomerID, Name
  FROM Customer
 WHERE Name LIKE '%TheCustomerNameIAmLookingFor%'
   AND Searchable = TRUE /* or 'Y' or 0... */

This is nice and simple. Many people on this thread are giving good advice for picking which data type this column should be in order to make the syntax play nicely across various databases.

Alternative: Creating a separate table

Instead of adding another column to Customer, I will create a separate table that stores the CustomerID of every searchable customer.

CREATE TABLE Customer
(
  CustomerID NUMBER,
  Name       VARCHAR(100)
)

CREATE TABLE SearchableCustomer
(
  CustomerID NUMBER
)

In this case, a customer is considered searchable if their CustomerID exists in the SearchableCustomer table. The query for searching for customers now becomes:

SELECT CustomerID, Name
  FROM Customer
 WHERE Name LIKE '%TheCustomerNameIAmLookingFor%'
   AND CustomerID IN (SELECT CustomerID FROM SearchableCustomer)

You will see that this strategy is very transferable across RDBMSs:

  • Finding searchable customers uses the IN clause or a JOIN
  • Making a customer searchable uses the INSERT statement
  • Making a customer non-searchable uses the DELETE statement

A Surprise Benefit

You are free to make the definition of a searchable customer as elaborate as you want if you make SearchableCustomer a view instead of a table:

CREATE VIEW SearchableCustomer AS
SELECT CustomerID
  FROM Customer
 WHERE Name LIKE 'S%' /* For some reason, management only cares about customers whose name starts with 'S' */

Your search query doesn't change at all! :D In my experience, this has led to tremendous flexibility.

清风挽心 2024-08-27 10:08:51

位列通常用于表示 T/F 或 Y/N 类型值,至少在 SQL Server 中是这样。尽管数据库纯粹主义者可能会告诉您,位列在数据库中没有位置,因为它们“太接近硬件” - Joe Celko。

Bit columns are typically used to represent T/F or Y/N type values, at least in SQL Server. Although a database purist might tell you that Bit columns have no place in databases because they are "too close to the hardware" - Joe Celko.

不念旧人 2024-08-27 10:08:51

“选择 *
来自我的表
WHERE col1

,这在 SQL Server 和 Oracle 中是不可能的(你需要在那里有某种类型或谓词)。”

这只会表明 Oracle 和 SQL Server 确实是多么荒唐可笑的可憎行为。

如果 col1 被声明为类型 BOOLEAN,则表达式“col1”IS

如果 WHERE 子句的语义要求其表达式仅求值为真值,并且某些列被声明为“类型”。真值”,那么“该列在哪里”应该被允许和支持。时期。任何不只是暴露其作者无能的平庸庸医的系统。

"SELECT *
FROM mytable
WHERE col1

, which is impossible in SQL Server and Oracle (you need to have some kind or a predicate there)."

Which goes only to show what a ludicrous and laughable abomination Oracle and SQL server really are.

If col1 is declared to be of type BOOLEAN, then the expression "col1" IS a predicate.

If the semantics of the WHERE clause require that its expression just evaluates to a truth value, and if some column is declared to be of the type "truth value", then "WHERE that-column" should be both allowed and supported. Period. Any system that doesn't just exposes its authors for the incompetent mediocre quacks that they are.

多孤肩上扛 2024-08-27 10:08:51

我通常会在没有 BIT/BOOLEAN 值的情况下执行此操作。相反,我有三张桌子。假设我们有一个包含项目的项目管理系统,并且这些项目具有一大堆属性。

然后我们得到了表格:

Project
 - Project_ID (INT),
 - Name (VARCHAR)

Attribute
 - Attribute_ID (INT),
 - Name (VARCHAR)

ProjectAttribute_Rel
 - Project_ID (INT),
 - Attribute_ID (INT)

项目的属性是 true 还是 false 取决于 ProjectAttribute_Rel 中是否有对应的行。

通常,您会在代码中处理 Attribute_ID,因此当您读取项目的属性(您可能有 Project_ID)时,您只需执行以下操作(任意使用 PHP 作为示例):

$arrAttributes = array();
$oQuery = mysql_query('
    SELECT Attribute_ID
    FROM ProjectAttribute_Rel
    WHERE Project_ID = '.addslashes($iProjectId).'
');
while ($rowAttribute = mysql_fetch_assoc($oQuery)) {
    $arrAttributes[] = $rowAttribute['Attribute_ID'];
}

此时,您可以检查通过检查 $arrAttributes 中是否存在该项目的属性是否为 true。在 PHP 中,这将是:

if (in_array($arrAttributes, $iAttributeId)) {
    // Project attribute is true!
}

这种方法还允许您执行各种特技,以避免在更新时、选择时(因为 SELECT * 在代码中不好)、插入时等时列出过多的属性。这是因为您始终可以循环遍历表属性来查找可用的属性,因此如果您添加一个属性并以这种方式执行操作,则添加/编辑/删除属性是微不足道的。您的 SQL 很可能甚至不需要更改,因为属性本​​身是在数据库中定义的,而不是在代码中定义的。

希望这有帮助。

I would typically do this without BIT/BOOLEAN values at all. Instead I'd have three tables. Let's say we have a project management system which has projects, and these projects have a whole bunch of attributes.

Then we have the tables:

Project
 - Project_ID (INT),
 - Name (VARCHAR)

Attribute
 - Attribute_ID (INT),
 - Name (VARCHAR)

ProjectAttribute_Rel
 - Project_ID (INT),
 - Attribute_ID (INT)

Whether a project's attribute is true or false depends on whether there's a line for it in ProjectAttribute_Rel.

Typically, you'd be dealing with Attribute_IDs in your code, so when you read the attributes of the project (where you presumably have the Project_ID), you just do (PHP arbitrarily used as an example):

$arrAttributes = array();
$oQuery = mysql_query('
    SELECT Attribute_ID
    FROM ProjectAttribute_Rel
    WHERE Project_ID = '.addslashes($iProjectId).'
');
while ($rowAttribute = mysql_fetch_assoc($oQuery)) {
    $arrAttributes[] = $rowAttribute['Attribute_ID'];
}

At this point, you can check whether the project's attribute is true by checking whether it exists in $arrAttributes at all. In PHP, this would be:

if (in_array($arrAttributes, $iAttributeId)) {
    // Project attribute is true!
}

This approach also allows you to do all sorts stunts to avoid listing a plethora of attributes when you update, again when you select (because SELECT * is bad in code), when you insert and so forth. This is because you can always loop through the table Attribute to find the available attributes, so if you add one and you do things this way, adding/editing/deleting attributes is trivial. Odds are your SQL won't even need to be changed, because the attributes themselves are defined in database, not in code.

Hope this helps.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文