Is it efficient to have a field set to NULL in 90% of rows?
I have a table in my MySQL (InnoDB) database full of user items. Basically, each row has a user_id field and other item properties such as color. There is one more field, link, which holds the id of some other user's item, but in most cases (90%) no item is linked and the field is set to NULL.
I was wondering whether it would be more efficient to create a new table to hold the link information, rather than having the link field set to NULL in 90% of 6 million rows.
I'm using Hibernate.
Comments (4)
Yes, it would be more efficient, but the difference would be very small.
The best approach is to do whatever is easiest for you and change it if it becomes a real problem.
Yes, it would be more efficient and more normalized. Whenever I see a table with lots of NULLs like this, I consider it a candidate for normalization. In this example, you could remove that column from the table entirely, which would be much cleaner and easier to maintain. You would just create a junction table with two id columns that are foreign keys to the user items table.
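As a rough illustration of the junction-table idea, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL/InnoDB, and the table and column names (user_item, item_link, linked_item_id) are assumptions made for the example, not names from the question.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL/InnoDB here; the table and column
# names (user_item, item_link, linked_item_id) are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE user_item (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        color   TEXT
    );
    -- One row per linked item; unlinked items have no row here at all.
    CREATE TABLE item_link (
        item_id        INTEGER PRIMARY KEY REFERENCES user_item(id),
        linked_item_id INTEGER NOT NULL REFERENCES user_item(id)
    );
""")
conn.executemany(
    "INSERT INTO user_item (id, user_id, color) VALUES (?, ?, ?)",
    [(1, 10, "red"), (2, 11, "blue"), (3, 12, "green")],
)
# Only item 1 links to another user's item (item 2).
conn.execute("INSERT INTO item_link (item_id, linked_item_id) VALUES (1, 2)")
conn.commit()

link_rows = conn.execute(
    "SELECT item_id, linked_item_id FROM item_link"
).fetchall()
item_count = conn.execute("SELECT COUNT(*) FROM user_item").fetchone()[0]
print(item_count, link_rows)
```

With this layout the items table carries no nullable column, and the link table holds only the roughly 10% of items that actually have a link.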
As far as your logic is concerned, just consider storing only the rows that have link data; the code that calls it can do an "if not null" check or equivalent to always know what to get. Don't store all those NULLs if you can make better assumptions.
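The "if not null" check on the calling side might look like the sketch below. It again uses sqlite3 in place of MySQL, and the helper name linked_item_id and the item_link table are hypothetical.

```python
import sqlite3
from typing import Optional


def linked_item_id(conn: sqlite3.Connection, item_id: int) -> Optional[int]:
    """Return the linked item's id, or None when the item has no link row."""
    row = conn.execute(
        "SELECT linked_item_id FROM item_link WHERE item_id = ?", (item_id,)
    ).fetchone()
    return row[0] if row is not None else None


# Hypothetical link table holding only the items that actually have a link.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_link ("
    "item_id INTEGER PRIMARY KEY, linked_item_id INTEGER NOT NULL)"
)
conn.execute("INSERT INTO item_link VALUES (1, 2)")

for item_id in (1, 3):
    target = linked_item_id(conn, item_id)
    if target is not None:
        print(f"item {item_id} links to item {target}")
    else:
        print(f"item {item_id} has no link")
```

A missing row plays the role that NULL played in the single-table design, so the caller's branch stays the same.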
It's going to occupy less space. But if you do a (LEFT) JOIN for each query, performance is going to be worse, especially if you have many rows and the table doesn't fit in memory. In that case you need two disk seeks to fetch one record.
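To make the trade-off concrete, this is roughly what the per-query LEFT JOIN looks like once the link lives in its own table. The sketch uses sqlite3 rather than MySQL, and the schema names are assumptions.

```python
import sqlite3

# SQLite stands in for MySQL/InnoDB; user_item and item_link are
# hypothetical names for the items table and the separate link table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_item (
        id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, color TEXT
    );
    CREATE TABLE item_link (
        item_id INTEGER PRIMARY KEY, linked_item_id INTEGER NOT NULL
    );
    INSERT INTO user_item VALUES (1, 10, 'red'), (2, 11, 'blue');
    INSERT INTO item_link VALUES (1, 2);
""")

# Every read that needs the link now touches both tables.
rows = conn.execute("""
    SELECT i.id, i.color, l.linked_item_id
    FROM user_item AS i
    LEFT JOIN item_link AS l ON l.item_id = i.id
    ORDER BY i.id
""").fetchall()
print(rows)
```

Items without a link still come back from the query, with a NULL (None in Python) in the link column, so the join reintroduces the NULLs at read time while adding a second lookup per record.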
Update: