Is it efficient to have a field set to NULL in 90% of rows?

Posted on 2024-11-16 21:34:39

I have a table in my MySQL (InnoDB) database full of user items. Each row has a user_id field and other item properties, such as color. There is one more field, called link, which holds the id of some other user's item. In most cases (90%) no item is linked, so the field is set to NULL.

I was wondering whether it would be more efficient to create a new table holding the link information, rather than having the link field set to NULL in 90% of 6 million rows.

I'm using Hibernate.

Comments (4)

小ぇ时光︴ 2024-11-23 21:34:39

Yes, it would be more efficient, but the difference would be very small.

The best approach is to do whatever is easiest for you, and change it only if it becomes a real problem.

一场春暖 2024-11-23 21:34:39

Yes, it would be more efficient and more normalized. Whenever I see a table with lots of NULLs like this, I consider it a candidate for normalization. In this example, you could remove that column from the table entirely, which would be much cleaner and easier to maintain. You would just create a junction table with two user_ids that are foreign keys to the user items table.
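As a rough illustration of the junction-table idea, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database purely as a runnable stand-in for MySQL/InnoDB. The table and column names (`user_item`, `item_link`, `linked_item_id`) are assumptions, not taken from the question, and the link is keyed on item ids rather than user_ids because the question describes the link field as holding an item id.

```python
import sqlite3

# In-memory SQLite stands in for MySQL/InnoDB; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE user_item (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    color   TEXT
);
-- Only linked items get a row here; unlinked items have no row at all.
CREATE TABLE item_link (
    item_id        INTEGER PRIMARY KEY REFERENCES user_item(id),
    linked_item_id INTEGER NOT NULL REFERENCES user_item(id)
);
""")

# Ten items, of which only one is linked (roughly the 90% unlinked case).
conn.executemany(
    "INSERT INTO user_item (id, user_id, color) VALUES (?, ?, ?)",
    [(i, 100 + i, "red") for i in range(1, 11)],
)
conn.execute(
    "INSERT INTO item_link (item_id, linked_item_id) VALUES (?, ?)", (1, 2)
)

item_rows = conn.execute("SELECT COUNT(*) FROM user_item").fetchone()[0]
link_rows = conn.execute("SELECT COUNT(*) FROM item_link").fetchone()[0]
```

With this layout the link table holds one row per linked item, so its size tracks the 10% of items that have a link instead of all items.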

浪漫之都 2024-11-23 21:34:39

As far as your logic is concerned, just consider storing only the entries that have link data; the code that calls it can do an "if not null" check or equivalent to always know what to get. Don't store all those NULLs if you can make better assumptions.
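The "if not null" check mentioned above could look roughly like the following sketch. It again uses Python's sqlite3 module as a stand-in for MySQL, and the `item_link` table, its columns, and the `linked_item` helper are hypothetical names introduced only for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL/InnoDB; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item_link (
    item_id        INTEGER PRIMARY KEY,
    linked_item_id INTEGER NOT NULL
);
INSERT INTO item_link (item_id, linked_item_id) VALUES (1, 2);
""")


def linked_item(conn, item_id):
    """Return the linked item id, or None when the item has no link row."""
    row = conn.execute(
        "SELECT linked_item_id FROM item_link WHERE item_id = ?", (item_id,)
    ).fetchone()
    # A missing row plays the role that NULL played in the single-table design.
    return row[0] if row is not None else None


for item_id in (1, 3):
    target = linked_item(conn, item_id)
    if target is not None:
        print(f"item {item_id} links to item {target}")
    else:
        print(f"item {item_id} has no link")
```

The caller's logic is the same as with a nullable column; only the place where "no link" is represented changes.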

恋你朝朝暮暮 2024-11-23 21:34:39

It's going to occupy less space. But if you do a (LEFT) JOIN for every query, performance is going to be worse, especially if you have many rows and the table doesn't fit in memory. Then you need two disk seeks to fetch one record.

Update:

  • A JOIN takes some additional processing. It's going to be fast if you have indexes, but you still have to look up another record. And if you use InnoDB to support transactions, the database has to maintain a version for the joined record.
  • A JOIN is bad for memory locality: you now have to look up a record that sits at an entirely different memory location.
  • As I mentioned, if the data is not in memory you need an additional disk seek. This is really bad.
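To make the extra lookup concrete, the sketch below compares the two designs side by side. It uses Python's sqlite3 module as a stand-in for MySQL, so it only shows the shape of the queries and says nothing about real InnoDB timings. All table and column names (`user_item_flat`, `user_item`, `item_link`) are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL/InnoDB; all names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Design A: one table with a nullable link column.
CREATE TABLE user_item_flat (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    link    INTEGER
);
INSERT INTO user_item_flat (id, user_id, link)
VALUES (1, 101, 2), (2, 102, NULL), (3, 103, NULL);

-- Design B: items plus a separate table holding only the existing links.
CREATE TABLE user_item (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL
);
CREATE TABLE item_link (
    item_id        INTEGER PRIMARY KEY,
    linked_item_id INTEGER NOT NULL
);
INSERT INTO user_item (id, user_id) VALUES (1, 101), (2, 102), (3, 103);
INSERT INTO item_link (item_id, linked_item_id) VALUES (1, 2);
""")

# Design A: the link comes back with the row itself.
flat_rows = conn.execute(
    "SELECT id, link FROM user_item_flat ORDER BY id"
).fetchall()

# Design B: every item needs a lookup in the second table to find its link.
joined_rows = conn.execute("""
    SELECT i.id, l.linked_item_id
    FROM user_item AS i
    LEFT JOIN item_link AS l ON l.item_id = i.id
    ORDER BY i.id
""").fetchall()
```

Both queries return the same rows, with the missing link reported as NULL for unlinked items. The difference is that the second one touches two tables per item, which is the additional lookup described in the list above.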