存储多类型字段的最佳方式
假设我们有一个可以是多种类型的字段,例如:字符串、日期或 XML 数据类型。
现在我们有两种方法将其存储在数据库中
1- 使用字符串类型字段 + 字段定义类型:失去“类型感知”排序功能,需要转换
2- 单独的表(StringValues、DateValues、Decimal、XML ...等) ):指向值+字段定义类型的外键:某种程度上复杂,性能
如果仅存储唯一值,第二种方法可能具有额外的优势:它将用作索引。
你有什么想法吗?
注1:最好考虑基于MS SQL Server 2008和Linq2SQL的项目
注2:也许我们会在另一个问题中讨论如何实现EAV,我问的是关系存储中的EAV。
注3:类型可以改变,但不会频繁改变
Let's say we have a field that can be in more than one type, for instance: string or date or XML datatypes.
Now we have two methods to store this in a database
1- using a string typed field + field defining type: losing "type-aware" sorting capabilities, needs casting
2- separate tables (StringValues,DateValues,Decimal,XML ...etc):a foreign key pointing to a value + field defining type : somehow complicated, performance
the second method may have an extra advantage if only unique values were stored: it will work as an index.
do you have something in mind ?
Note1: Preferably, consider project based on MS SQL Server 2008 and Linq2SQL
Note2: Maybe we will discuss how to implement EAV in another question, I'm asking about EAV in a relational storage.
Note3: Types can change, but not frequently
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(5)
听起来您正在设计一个 EAV 解决方案,您的表在其中存储值对于多个属性,每行一个值。
EAV 是一种非关系型设计。就正确的关系数据库设计规则而言,没有“正确”的方法来做到这一点。
正确的设计是将每个属性存储在一个表的单独列中。为每列指定正确的数据类型和描述性名称。每列中仅存储相同逻辑类型的值。
如果您需要动态属性,请使用非关系数据管理解决方案。
It sounds like you're designing an EAV solution, where your table stores values for multiple attributes, one value per row.
EAV is a non-relational design. There's no "right" way to do this with respect to proper rules of relational database design.
The proper design is to store each attribute in a separate column of one table. Give each column the right datatype and a descriptive name. Only store values of the same logical type in each column.
If you need dynamic attributes, use a non-relational data management solution.
我会选择第二个选项,并用几个视图隐藏表情况的复杂性。这样,一旦您获得更大的灵活性,您的应用程序仍然可以指向视图而无需更改,并且您可以将基础表重新排列为更干净的内容。
I'd go with the second option and hide the complexity of the table situation with a couple of views. That way once you get more flexibility your applications can still point to the views without needing to be changed and you can rearrange your underlying tables to something a little cleaner.
您可以考虑使用 XML 数据类型吗?如果是这样,您可以使用属性/元素来定义类型。
或者,
SQL Server 2005+ 对 XML 索引提供了一些良好的支持,可以满足您的需求。
从 Linq to SQL 的角度来看,您可能可以拥有一个轻量级类,可以将类型映射到特定的数据类型; XML 反/序列化可能是这里的一个选项。
Can you consider using an XML datatype? If so, you can use an attribute/element to define the type.
Or,
SQL Server 2005+ has some good support for XML indexing that may support your needs.
From a Linq to SQL point of view you can probably have a lightweight class that can map the types to a specific data type; XML de/serialisation may be an option here.
我不确定这是否足够详细来很好地回答这个问题。如果您实际上是在询问两种类型的情况,您还可以考虑一个表,其中每个类型有一列和一个鉴别器。 “正确”的答案可能取决于具体情况,例如要支持的不同类型的数量、速度与空间限制等。
有些人可能会认为最便宜的方法就是最好的方法。具体来说,您认为理解和维护成本最低的方法(通常约为 TCO 的 60%)。
对于所有关于不这样做的建议,我同意如果可能的话。另一方面,SharePoint 就是一个例子,表明这并非不可能。祝你好运!
I'm not sure this is enough detail to answer the question well. If you are literally asking about the two type case, you might also consider a table with a column for each type and a discriminator. The "right" answer may depend on specifics such as number of distinct types to be supported, speed vs. space constraints, etc.
Some might argue that the least expensive approach is the best one. Specifically, the approach that you believe will require the least cost to understand and maintain (often ~60% of TCO).
With regard to all the advice about not doing this, I agree if possible. On the other hand, SharePoint is one example that shows it's not impossible. Good luck!
如果可能的类型数量较少,请使用选项 2(附加表 + 外键)或使用选项 3。
选项 3:
使用一个表,其中包含每种类型的字段和定义哪个字段相关的枚举字段。
如果可能的类型数量很大或不恒定,请使用选项 1(字符串) - 您可以将字符串中的日期存储为 YYYY-MM-DD-HH-MM-SS 以保留排序。
If the number of possible types is small, use option 2 (additional tables + foreign key) or use option 3.
Option 3:
Use one table with a field of each type and an enum field defining which field is relevant.
If the number of possible types is large or not constant, use option 1 (strings) - you can store dates in strings as YYYY-MM-DD-HH-MM-SS to preserve sorting.