How do I design a table that needs only one column?
I am creating a database table that'll have a list of all Tags available in my application (just like SO's tags).
Currently, I don't have anything associated with each tag (and I'll probably never have), so my idea was to have something of the form
Tags (Tag(pk) : string)
Should this be the way to do it? Or should I instead do something like
Tags (tag_id(pk) : int, tag : string)
I guess lookups on the table would be faster in the second case than in the first, but it would also take up more space?
Thanks
Comments (4)
I'd go for the second option with the surrogate key.
It will mean the Tags table itself takes up more space, but it will likely reduce space overall, assuming that you reference the tag as a foreign key in other tables (e.g. a posts/tags table).
Using an int rather than a string will make the lookups required to enforce the foreign key more efficient, and it means that updates to tag titles don't need to affect multiple tables.
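To make the foreign key argument concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (tags, posts_tags), the post IDs, and the misspelled tag are all made up for illustration; the question doesn't say which database or schema is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: surrogate integer key, referenced from a junction table.
conn.executescript("""
CREATE TABLE tags (
    tag_id INTEGER PRIMARY KEY,
    tag    TEXT NOT NULL
);
CREATE TABLE posts_tags (
    post_id INTEGER NOT NULL,
    tag_id  INTEGER NOT NULL REFERENCES tags(tag_id),
    PRIMARY KEY (post_id, tag_id)
);
""")

conn.execute("INSERT INTO tags (tag_id, tag) VALUES (1, 'databse')")  # typo on purpose
conn.executemany(
    "INSERT INTO posts_tags (post_id, tag_id) VALUES (?, ?)",
    [(10, 1), (11, 1), (12, 1)],
)

# Fixing the tag title touches one row in tags; posts_tags is left alone.
rows_updated = conn.execute(
    "UPDATE tags SET tag = 'database' WHERE tag_id = 1"
).rowcount

tags_for_post_10 = [
    row[0]
    for row in conn.execute(
        "SELECT t.tag FROM posts_tags pt "
        "JOIN tags t ON t.tag_id = pt.tag_id WHERE pt.post_id = 10"
    )
]

# Referencing a tag_id that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO posts_tags (post_id, tag_id) VALUES (13, 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

The junction table stores only small integers per row, which is where the overall space saving would come from once a tag is attached to many posts.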
Indexes work better with integers than with CHAR/VARCHAR, so go with a dedicated integer primary key column. If you need tag names to be unique you can add a constraint, but it's probably not worth the hassle.
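For what the uniqueness constraint could look like, here is a small sketch with Python's sqlite3 module. The table layout and the add_tag helper are hypothetical, and other databases spell the details differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: integer primary key plus a UNIQUE constraint on the name.
conn.execute("""
CREATE TABLE tags (
    tag_id INTEGER PRIMARY KEY,
    tag    TEXT NOT NULL UNIQUE
)
""")


def add_tag(name):
    """Insert a tag and return True, or return False if the name already exists."""
    try:
        conn.execute("INSERT INTO tags (tag) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when the UNIQUE constraint is violated.
        return False


first = add_tag("sql")
second = add_tag("sql")  # same name again, rejected by the constraint
tag_count = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
```

Note that the constraint is backed by an index on the string column, so the table ends up with both an integer index and a string index.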
You should go for the second option. Firstly, you never know what the future holds. Secondly, you may later want multiple language support or other things that make a string as the primary key feel strange. Thirdly, I like the idea of using a standard procedure for table definitions, i.e. that there is always a column 'id' or 'pk'. It separates business from technology.
Quite possibly you'll get faster lookups with the index being an integer. Further, consider making the index clustered for an additional speedup.
I wouldn't put too much emphasis on the performance issue, though. As soon as a program starts talking to a database over the internet, the network delay is much bigger than the execution time of 99% of your database queries (with the exception of reporting queries, of course!).
Those two options achieve quite different things. In the first case you have unique tags and in the second you don't, unless you also add a uniqueness constraint on tag. You haven't said what use TAG_ID is in this model. Unless you put in TAG_ID for a good reason, I'd stick with the first design. It's smaller, appears to meet your requirements precisely, and Tag seems like a more obvious choice for a key (on grounds of familiarity and simplicity).
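As a rough sketch of the single-column design, again with Python's sqlite3 module: the posts_tags table and the ON UPDATE CASCADE clause are assumptions added for illustration, not something stated in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema: the tag string itself is the primary key.
conn.executescript("""
CREATE TABLE tags (
    tag TEXT PRIMARY KEY NOT NULL
);
CREATE TABLE posts_tags (
    post_id INTEGER NOT NULL,
    tag     TEXT NOT NULL REFERENCES tags(tag) ON UPDATE CASCADE,
    PRIMARY KEY (post_id, tag)
);
""")

conn.execute("INSERT INTO tags (tag) VALUES ('sql')")
conn.execute("INSERT INTO posts_tags (post_id, tag) VALUES (10, 'sql')")

# The primary key already guarantees uniqueness; no extra constraint is needed.
try:
    conn.execute("INSERT INTO tags (tag) VALUES ('sql')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Reading a post's tags needs no join, because the name is stored directly.
tags_for_post_10 = [
    row[0]
    for row in conn.execute("SELECT tag FROM posts_tags WHERE post_id = 10")
]

# Renaming a tag now has to rewrite every referencing row (done here by the cascade).
conn.execute("UPDATE tags SET tag = 'SQL' WHERE tag = 'sql'")
tags_after_rename = [
    row[0]
    for row in conn.execute("SELECT tag FROM posts_tags WHERE post_id = 10")
]
```

The trade-off is visible in the last step: reads are simpler, but a rename propagates to every table that references the tag.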