How to decide between a row-based and a column-based table structure?
I have a data set with hundreds of parameters (and more on the way).
- If I dump them into one table, it will probably end up with hundreds of columns (I am not even sure how many at this point).
- I could go row-based, with a bunch of meta tables, but a row-based structure somehow feels unintuitive.
- Another way would be to stay column-based but use multiple tables (splitting the tables logically), which seems like a good solution.
Is there any other way to do it? If so, could you point me to a tutorial? (I am using MySQL.)
EDIT:
Based on the answers, I should clarify one thing: updates and deletes are going to be far less frequent than inserts and selects. In fact, selects are going to be the bulk of the operations, so selects have to be fast.
Comments (3)
I ran across several designs where a #4 was possible:
We used this approach with BLOBs of XML data or even binary data, representing the entire serialized object. The downside is that your auxiliary columns remain non-searchable for all practical purposes. The upside is that you can add new auxiliary columns at will without changing the schema. You can also make a previously auxiliary column searchable later with a schema change and a very simple program.
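Here is a minimal sketch of that idea, under a few assumptions of mine: it uses Python's built-in sqlite3 module instead of MySQL so that it runs without a server, it serializes to JSON rather than XML or binary, and the names `item`, `aux`, `insert_item`, `load_item` and `promote` are made up for illustration. The `promote` function plays the role of the "schema change and a very simple program".

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# One real, searchable column (name) plus one opaque column (aux)
# that holds every other property as serialized text.
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL, aux TEXT NOT NULL)"
)


def insert_item(name, **aux):
    """Store the core column normally and serialize the rest into aux."""
    cur = conn.execute(
        "INSERT INTO item (name, aux) VALUES (?, ?)", (name, json.dumps(aux))
    )
    return cur.lastrowid


def load_item(item_id):
    """Rebuild the full object from the core column and the serialized part."""
    name, aux = conn.execute(
        "SELECT name, aux FROM item WHERE id = ?", (item_id,)
    ).fetchone()
    return {"name": name, **json.loads(aux)}


def promote(column, sql_type):
    """Turn an auxiliary property into a real, indexed column and backfill it.

    column and sql_type are interpolated into SQL, so they must come from
    trusted code, never from user input.
    """
    conn.execute(f"ALTER TABLE item ADD COLUMN {column} {sql_type}")
    for item_id, aux in conn.execute("SELECT id, aux FROM item").fetchall():
        conn.execute(
            f"UPDATE item SET {column} = ? WHERE id = ?",
            (json.loads(aux).get(column), item_id),
        )
    conn.execute(f"CREATE INDEX idx_item_{column} ON item ({column})")


first = insert_item("sensor-a", voltage=3.3, vendor="acme")
second = insert_item("sensor-b", voltage=5.0)

# Before promotion, voltage can only be read by deserializing each row.
promote("voltage", "REAL")
hits = conn.execute("SELECT name FROM item WHERE voltage > 4").fetchall()
```

After `promote`, the value lives both in the new column and in `aux`; whether to strip it from the serialized part afterwards is a separate decision that this sketch leaves open.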
It all depends on the kind of data you need to store.
If it's not "relational" at all - for instance, a collection of web pages, documents, etc - it's usually not a good fit for a relational database.
If it's relational, but highly variable in schema - e.g. a product catalogue - you have a number of options:
If the data is highly variable and you don't want to make schema changes to accommodate the variations, you can use "entity-attribute-value" or EAV - though this has some significant drawbacks in the context of a relational database. I think this is what you have in mind with option 2.
If the data is indeed relational, and there is at least the core of a stable model in the data, you could of course use traditional database design techniques to come up with a schema. That seems to correspond with option 3.
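To make the EAV option concrete, here is a rough sketch. It is my own illustration, not a recommended design: it uses Python's built-in sqlite3 module in place of MySQL so it runs as-is, and the table and function names (`entity`, `attribute`, `entity_value`, `set_value`, `get_entity`, `find_ids`) are hypothetical. It also shows two of the drawbacks: every value ends up as text, and filtering on several attributes needs one subquery per attribute.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE attribute (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- One row per (entity, attribute) pair instead of one column per attribute.
CREATE TABLE entity_value (
    entity_id INTEGER NOT NULL REFERENCES entity(id),
    attribute_id INTEGER NOT NULL REFERENCES attribute(id),
    value TEXT,
    PRIMARY KEY (entity_id, attribute_id)
);
""")


def set_value(entity_id, attr_name, value):
    """Create the attribute on first use, then store the value as text."""
    conn.execute("INSERT OR IGNORE INTO attribute (name) VALUES (?)", (attr_name,))
    (attr_id,) = conn.execute(
        "SELECT id FROM attribute WHERE name = ?", (attr_name,)
    ).fetchone()
    conn.execute(
        "INSERT OR REPLACE INTO entity_value VALUES (?, ?, ?)",
        (entity_id, attr_id, str(value)),
    )


def get_entity(entity_id):
    """Collect all attribute rows of one entity into a dict."""
    rows = conn.execute(
        "SELECT a.name, v.value FROM entity_value v "
        "JOIN attribute a ON a.id = v.attribute_id "
        "WHERE v.entity_id = ?",
        (entity_id,),
    )
    return dict(rows.fetchall())


def find_ids(criteria):
    """Ids of entities matching every attribute=value pair in criteria."""
    sql = "SELECT e.id FROM entity e WHERE 1 = 1"
    params = []
    for name, value in criteria.items():
        # Each extra criterion adds another correlated subquery.
        sql += (
            " AND EXISTS (SELECT 1 FROM entity_value v"
            " JOIN attribute a ON a.id = v.attribute_id"
            " WHERE v.entity_id = e.id AND a.name = ? AND v.value = ?)"
        )
        params += [name, str(value)]
    return [row[0] for row in conn.execute(sql + " ORDER BY e.id", params)]


conn.execute("INSERT INTO entity (id, name) VALUES (1, 'widget')")
set_value(1, "colour", "red")
set_value(1, "weight_kg", 2.5)
widget = get_entity(1)  # weight_kg comes back as the string "2.5"
```

New attributes need no schema change, which is the appeal; the price is that types, constraints and multi-attribute queries all move into application code.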
Does every item in the data set have all those properties? If yes, then one big table might well be fine (although scary-looking).
On the other hand, perhaps you can group the properties. The idea being that if an item has one of the properties in the group, then it has all the properties in that group. If you can create such groupings, then these could be separate tables.
So should they be separate? Yes, unless you can prove that the cost of performing joins is unacceptable. Perform all SELECTs via stored procedures and you can denormalise later on without much trouble.
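A small sketch of the grouping idea, with assumptions stated up front: the property groups (`item_dimensions`, `item_electrical`) and their columns are invented for illustration, and Python's built-in sqlite3 module stands in for MySQL. SQLite has no stored procedures, so the `get_item` function takes the role a stored procedure would have in MySQL: callers go through one entry point, and the tables behind it could later be merged without changing those callers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- One table per property group. A row exists only when the item has
-- the whole group, so the group's columns can all be NOT NULL.
CREATE TABLE item_dimensions (
    item_id INTEGER PRIMARY KEY REFERENCES item(id),
    width_mm REAL NOT NULL,
    height_mm REAL NOT NULL
);
CREATE TABLE item_electrical (
    item_id INTEGER PRIMARY KEY REFERENCES item(id),
    voltage REAL NOT NULL,
    current_ma REAL NOT NULL
);
INSERT INTO item VALUES (1, 'bracket'), (2, 'relay');
INSERT INTO item_dimensions VALUES (1, 40.0, 12.5), (2, 20.0, 15.0);
INSERT INTO item_electrical VALUES (2, 5.0, 70.0);
""")

KEYS = ("id", "name", "width_mm", "height_mm", "voltage", "current_ma")


def get_item(item_id):
    """Single read path for items; the joins are hidden from callers."""
    row = conn.execute(
        "SELECT i.id, i.name, d.width_mm, d.height_mm, e.voltage, e.current_ma "
        "FROM item i "
        "LEFT JOIN item_dimensions d ON d.item_id = i.id "
        "LEFT JOIN item_electrical e ON e.item_id = i.id "
        "WHERE i.id = ?",
        (item_id,),
    ).fetchone()
    # A missing group shows up as None values; a missing item as None.
    return None if row is None else dict(zip(KEYS, row))


bracket = get_item(1)  # has dimensions, no electrical group
relay = get_item(2)    # has both groups
```

If the joins ever turn out to be too slow, only the query inside `get_item` has to change to read from a single wide table.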