Arrays in database tables and normalization

Posted on 2024-09-02 22:13:58

Is it smart to keep arrays in table columns? More precisely, I am thinking of the following schema, which to my understanding violates normalization:


create table Permissions(
    GroupID int not null default(-1),
    CategoryID int not null default(-1),
    Permissions varchar(max) not null default(''),
    constraint PK_GroupCategory primary key clustered(GroupID,CategoryID)
);

and this:


create table Permissions(
    GroupID int not null default(-1),
    CategoryID int not null default(-1),
    PermissionID int not null default(-1),
    constraint PK_GroupCategory primary key clustered(GroupID,CategoryID)
);

UPD3: I envision Permissions as a comma-delimited string since MSSQL is our primary deployment target.


UPD: I forgot to mention: within the scope of this question, assume that "fetch rows that have permission X" queries will never be performed; all lookups will be made by GroupID and CategoryID only.


UPD2: I envision the typical usage scenario as follows:


int category_id=42;
int[] array_of_groups=new int[]{40,2,42};
if(!Permissions.Check(category_id, array_of_groups, Permission.EatAndDrink)) {
    throw new StarveToDeathException();
}

Thoughts?

Thanks in advance!

Comments (5)

梦里°也失望 2024-09-09 22:13:58

I'd suggest taking the normalized route, for the following reasons:

  • By having a table containing all possible permissions, you get self-documenting data. You can add a description to each permission. This definitely beats concatenated ID values that carry no meaning on their own.
  • You get all the advantages of referential integrity and can be sure that there are no bogus permission IDs in your data.
  • Inserting and deleting permissions will be easier: you simply add or delete records. With the concatenated string, you would be updating a column, and deleting the record only when you remove the last permission.
  • Your design is future-proof. You say you only want to query by CategoryID and GroupID, which you can already do with normalized tables. On top of that, you will also be able to, for example, add other properties to your permissions, query by permission, etc.
  • Performance-wise, I think it will actually be faster to fetch a result set of IDs than to parse a string into integers. This remains to be measured with actual data and a real implementation.
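
The points above about self-documenting data, referential integrity, and simple inserts and deletes can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MSSQL, and the PermissionTypes lookup table and its columns are hypothetical names that do not appear in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Hypothetical lookup table: one self-describing row per permission.
    CREATE TABLE PermissionTypes (
        PermissionID INTEGER PRIMARY KEY,
        Name         TEXT NOT NULL UNIQUE,
        Description  TEXT NOT NULL
    );
    -- One row per (group, category, permission) grant.
    CREATE TABLE Permissions (
        GroupID      INTEGER NOT NULL,
        CategoryID   INTEGER NOT NULL,
        PermissionID INTEGER NOT NULL REFERENCES PermissionTypes(PermissionID),
        PRIMARY KEY (GroupID, CategoryID, PermissionID)
    );
""")

conn.execute(
    "INSERT INTO PermissionTypes VALUES (2, 'EatAndDrink', 'May eat and drink')"
)

# Granting a permission is a plain INSERT.
conn.execute("INSERT INTO Permissions VALUES (40, 42, 2)")

# A permission ID that is not in the lookup table is rejected by the database.
try:
    conn.execute("INSERT INTO Permissions VALUES (40, 42, 999)")
    bogus_rejected = False
except sqlite3.IntegrityError:
    bogus_rejected = True

# Revoking a permission is a plain DELETE; no string has to be rewritten.
conn.execute(
    "DELETE FROM Permissions WHERE GroupID = 40 AND CategoryID = 42 AND PermissionID = 2"
)
remaining = conn.execute("SELECT count(*) FROM Permissions").fetchone()[0]

print(bogus_rejected, remaining)
```

On MSSQL the same idea would be written with a foreign key constraint in T-SQL; the sketch makes no claim about relative performance.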

放我走吧 2024-09-09 22:13:58

The primary key in your second example should probably be:

constraint PK_GroupCategory primary key clustered(GroupID,CategoryID,PermissionID)

Your first example would violate first normal form (and string parsing might not be a good use of your processing time), but that doesn't mean it's necessarily wrong for your application. It really depends on how you use the data.
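
To see why the primary key matters, here is a small sketch, using Python's sqlite3 module as a stand-in for MSSQL, that tries to grant two permissions to the same group/category pair under each key definition. The helper name can_grant_two_permissions is made up for this illustration.

```python
import sqlite3


def can_grant_two_permissions(primary_key_columns: str) -> bool:
    """Return True if two permissions fit on one (GroupID, CategoryID) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"""
        CREATE TABLE Permissions (
            GroupID      INTEGER NOT NULL,
            CategoryID   INTEGER NOT NULL,
            PermissionID INTEGER NOT NULL,
            PRIMARY KEY ({primary_key_columns})
        )
    """)
    try:
        conn.execute("INSERT INTO Permissions VALUES (40, 42, 1)")
        conn.execute("INSERT INTO Permissions VALUES (40, 42, 2)")
        return True
    except sqlite3.IntegrityError:
        # The second row collides with the first on the primary key.
        return False
    finally:
        conn.close()


print(can_grant_two_permissions("GroupID, CategoryID"))                # False
print(can_grant_two_permissions("GroupID, CategoryID, PermissionID"))  # True
```

With the two-column key from the question, each group/category pair can hold only one permission, which defeats the purpose of the normalized table.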

吐个泡泡 2024-09-09 22:13:58

"Is it smart?"

Occasionally; it depends. I'd say it depends on how narrowly you define the things being normalised.

If you can see no way in which a table with one row for each item would ever be useful, then the encapsulate-in-a-string approach might be worth considering.

In the example given, I'd want to be sure that a query to find all group/category combinations for a specified permission would not cause me a problem, since its WHERE clause would have to use string pattern matching. Of course, if I never have to perform such a query, it's a moot point.
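
As a rough illustration of the pattern-matching concern, the sketch below stores comma-delimited permission strings and searches for permission 2 in two ways. It assumes Python's sqlite3 module in place of MSSQL, and the sample rows are invented; in T-SQL the string concatenation would use + rather than ||.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Permissions (
        GroupID     INTEGER NOT NULL,
        CategoryID  INTEGER NOT NULL,
        Permissions TEXT NOT NULL DEFAULT '',
        PRIMARY KEY (GroupID, CategoryID)
    )
""")
conn.executemany(
    "INSERT INTO Permissions VALUES (?, ?, ?)",
    [(40, 42, "1,2,3"), (2, 42, "12,20"), (42, 42, "")],
)

# Naive pattern: also matches 12 and 20, because they contain the digit 2.
naive = conn.execute(
    "SELECT GroupID FROM Permissions "
    "WHERE Permissions LIKE '%2%' ORDER BY GroupID"
).fetchall()

# Wrapping both the column and the pattern in delimiters avoids false hits,
# but the query still has to scan and pattern-match every row.
delimited = conn.execute(
    "SELECT GroupID FROM Permissions "
    "WHERE ',' || Permissions || ',' LIKE '%,2,%' ORDER BY GroupID"
).fetchall()

print(naive)      # [(2,), (40,)]
print(delimited)  # [(40,)]
```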

In general, I'm happiest with this approach when the individual items being assembled have no significance in isolation: the data only makes sense when considered as a complete set. If there's a little more structure, say a list of data/value pairs, then formatting with XML or JSON can be useful.

行雁书 2024-09-09 22:13:58

If you're only querying by GroupID and/or CategoryID then there's nothing wrong with it. Normalizing would mean more tables, rows, and joins, so for large databases it can have a negative performance impact.

If you're absolutely certain you'll never need a query which processes Permissions, and it's only parsed by your application, there's nothing improper about this solution. It could also be preferable if you always want the complete set of permissions (i.e. you're not querying just to get part of the string, but always want all of its values).
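
If the string really is parsed only by the application, the check from the question could look roughly like the sketch below. It is written in Python rather than the question's own language, the in-memory dictionary stands in for rows fetched by GroupID and CategoryID, and the rule that any one of the groups granting the permission is enough is an assumption about what Permissions.Check is meant to do.

```python
def parse_permissions(raw: str) -> set[int]:
    """Turn a comma-delimited string such as '1,2,3' into a set of IDs."""
    return {int(part) for part in raw.split(",") if part.strip()}


def check(stored: dict[tuple[int, int], str],
          category_id: int,
          group_ids: list[int],
          permission_id: int) -> bool:
    """Return True if any of the given groups holds the permission."""
    return any(
        permission_id in parse_permissions(stored.get((group_id, category_id), ""))
        for group_id in group_ids
    )


# Hypothetical rows keyed by (GroupID, CategoryID), as the first schema stores them.
rows = {(40, 42): "1,3", (2, 42): "2", (42, 42): ""}

print(check(rows, 42, [40, 2, 42], 2))  # True: group 2 holds permission 2
print(check(rows, 42, [40, 2, 42], 5))  # False: no group holds permission 5
```

Note that every lookup fetches and parses the complete permission set, which suits the "always want all of its values" case described above.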

那伤。 2024-09-09 22:13:58

The problem with the first implementation is that it doesn't actually use an array but a concatenated string.

This means that you won't easily be able to use the values stored in that string to perform set-based queries, such as finding all people with a specific permission or a specific set of permissions.

If you were using a database that natively supported arrays as an atomic value, such as PostgreSQL, then the argument would be different.

Based on the usage scenario in your second update, I'd have to suggest the second schema is best, as you can simply query:

SELECT count(*) FROM Permissions WHERE CategoryID = 42 AND GroupID IN (40, 2, 42) AND PermissionID = 2

(assuming EatAndDrink has an ID of 2). The first version, however, would require retrieving all the permissions for each group and parsing the string before you can test whether it includes the requested permission.
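
A parameterized version of that count query might look like the following sketch. It again uses Python's sqlite3 module in place of MSSQL, assumes the three-column primary key so that a group can hold several permissions per category, and has_permission is a hypothetical helper name.

```python
import sqlite3


def has_permission(conn: sqlite3.Connection, category_id: int,
                   group_ids: list[int], permission_id: int) -> bool:
    """Return True if any of the groups holds the permission in the category."""
    if not group_ids:
        return False
    # One placeholder per group ID keeps the IN list fully parameterized.
    placeholders = ", ".join("?" for _ in group_ids)
    sql = (
        "SELECT count(*) FROM Permissions "
        f"WHERE CategoryID = ? AND GroupID IN ({placeholders}) "
        "AND PermissionID = ?"
    )
    (count,) = conn.execute(sql, [category_id, *group_ids, permission_id]).fetchone()
    return count > 0


conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Permissions (
        GroupID      INTEGER NOT NULL,
        CategoryID   INTEGER NOT NULL,
        PermissionID INTEGER NOT NULL,
        PRIMARY KEY (GroupID, CategoryID, PermissionID)
    )
""")
# Invented sample data: group 2 may EatAndDrink (ID 2) in category 42.
conn.executemany("INSERT INTO Permissions VALUES (?, ?, ?)",
                 [(2, 42, 2), (40, 42, 1)])

print(has_permission(conn, 42, [40, 2, 42], 2))  # True
print(has_permission(conn, 42, [40], 2))         # False
```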
