我应该制作另一个表还是只使用数组? (标准化或不标准化)

发布于 2024-08-09 02:09:47 字数 464 浏览 7 评论 0原文

目前的情况是主题按 3 个主要类别进行排序。有可能添加不止 3 个类别,但高层希望实现向主题添加不止 1 个类别的能力。

我最初的数据库设计将类别ID 作为主题信息表中的外键。从一开始这可能是一个坏主意,但我认为他们设置为只有 3 个类别,这样做可以减少查询。

因此,据我所知,我现在有两个选择: 1) 输入我在 php 端解析的逗号分隔字符串的类别ID。 2)重构DB,将categoryID拉出到自己的categoryID和topicID表中。

我想知道大家对此有何看法。我的第一直觉是重组数据库。但当我想到它时,第一个选项是最容易实现的,并且最不可能通过更改数据库来破坏现有的东西。然而,这也可能导致非规范化并导致数据不一致的可能性。

我读过,只要您接受数据不一致以换取性能的风险,反规范化就可以。您认为我会因为这种风险而在绩效方面获得很大收益吗?任何有关我在这种情况下应该做什么的意见将不胜感激。

感谢您的帮助,
列维

The current situation is that topics are sorted by 3 main categories. There is the potential to add more than just the 3 categories but the higher ups wish to implement the ability to add more than just 1 category to a topic.

My original db design has the the categoryID as a Foreign Key in the topic info table. Which was probably a bad idea from the start but I figured they were set on only having 3 categories and doing it this way would allow for fewer queries.

So from what I can see I have two options now:
1) Input the categoryID as a comma separated string that I parse on the php end.
2) Restructure the DB and pull out the categoryID into its own table of categoryID and topicID.

I was wondering what everyone thought of this. My first instinct was to restructure the database. But the first option when I think about it is the easiest to implement and least likely to break something existing by changing the db around. This can also lead to de-normalization however and open up the possibility of inconsistent data.

I have read the de-normalizing is fine so long as you accept the risk of having inconsistent data in exchange for performance. In your opinion will I gain much in performance for this risk? Any input on what I should do in this situation would be appreciated.

Thanks for the help,
Levi

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

水溶 2024-08-16 02:09:47

正如您所说,您最好的选择是拥有一个类别 ID-主题 ID 对的数据库,以查找主题所属的类别。

您可以通过分解categoryID中的字符串来以另一种方式做到这一点,但是当您搜索某个类别中的任何主题时,您必须遍历每个字段并对其运行LIKE...更加耗费资源。

花时间重组数据库,您最终会得到更好的结果。

Your best option would be to have a database, like you said, of categoryID-topicID pairs to find which categories the topics belong to.

You COULD do it the other way by exploding the strings in categoryID, but when you search for any topics that are in a certain category, you'll have to run through each field and run a LIKE on it... Much more resource intensive.

Take the time to restructure the DB and you'll end up with a much better result.

十雾 2024-08-16 02:09:47

如果您需要在 DBMS 中对单个项目执行某些操作,请不要以列表形式存储它们。当你的表变大时,这会让你的查询像狗一样运行。当然,如果您只想将列表视为一个单元,那么以这种方式存储它们是可以的。

但您最好确保始终将列表视为一个单元,而不是作弊,说它们是一个单元,然后将它们拆分到其他地方 - 最好让 DBMS 为您做这件事。

您应该始终首先执行 3NF,然后当且仅当您遇到性能问题时,才进行非规范化以提高速度。

您在问题中讨论的那些字段不是您将视为一个单元的类型。您需要对列表中的各个元素进行操作,因此应将它们分解到另一个表中。

If you need to do something in the DBMS with the individual items, do not store them in list form. This will make your queries run like a dog as your tables get bigger. Of course, if you're only ever going to treat the list as a unit, it's okay to store them that way.

But you'd better be sure you're going to always treat the list as a unit, and no cheating, saying they're a unit and then spiltting them apart somewhere else - better to let the DBMS do that for you.

You should always do 3NF first then if, and only if, you have performance problems, denormalize for speed.

Those fields you're talking about in the question are not the sort that you will be treating as a unit. You will need to do things to the individual elements in the lists, so they should be broken out into another table.

浸婚纱 2024-08-16 02:09:47

不要将非规范化(一个很好的例子是将 SO 问题的投票数与问题放在一起,而不是每次从“投票”表中计算)与令人厌恶的以逗号分隔的 ID 列表相混淆。

建立适当的多对多关系模型;使用逗号分隔的方法可能(并且将会)出现很多问题。仅举几例:

  1. 没有参照完整性。
  2. 几乎不可能在连接中使用。
  3. 无法充分索引;不可扩展。

Do not confuse denormalization (a good example of which is keeping the number of votes on SO question together with question as opposed to calculating it each time from 'votes' table) with the abomination that is comma-separated list of ids.

Model a proper many-to-many relationship; there are just so many things that can (and will) go wrong with comma-separated approach. To name a few:

  1. No referential integrity.
  2. Next to impossible to use in joins.
  3. Impossible to adequately index; non-scalable.
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文