Design choices for user-customizable data structures?
I'm planning to build an application that lets users set up their own data model (i.e. create fields, data structures, etc.) dynamically.
I see several technical possibilities, all of which have drawbacks:
- In admin screens, update the SQL schema of the DB to reflect the changes.
  - I fear this is a terrible idea because of the permissions the application must have on the DB. Moreover, if a new SQL schema must be applied on every click, I imagine I'd be heading straight for trouble. This is the approach I've seen used by most user-customizable applications.
- Create a set of generic extra columns in the DB schema and hope there are enough of them for complex data models.
  - This will quickly become a functional limitation if I can't allow more than X columns in my app.
- In a single table, store all items with an ID column and an XML column that holds the user-defined columns.
  - This approach may remove the drawbacks mentioned above, because the SQL schema stays static. However, because EF (which I was hoping to use) does not know how to handle the XML data type, I would end up either writing manual SqlCommands that use XML functions or writing a custom EF provider, which I imagine would be quite a lot of work.
  - This is the approach Microsoft chose for SharePoint, which makes me think it's the best one (or at least the least bad).
- Create a "properties" table with basically just an itemId column, a property name column, and a property value column.
  - This approach implies a very large table (X items * Y properties per item).
  - I would have to store my values as plain text, even when they are numeric.
My requirements are:
- Keep the code maintainable and unit-testable, using mainstream techniques
- Keep the application responsive with a large amount of data
- Keep the application as secure as possible
- Allow users to fully customize their application (create custom views with filters/sorting on user-defined properties).
I feel I need to get the design right now, because it would be tough to change later.
Any feedback would be appreciated.
Comments (5)
One option would be to use a NoSQL database like MongoDB, which is schema-less. New fields don't need to be defined up front (no schema modification headaches), and different records can have different fields. This is one of the benefits of a NoSQL store like this.
For example, in MongoDB your "table" could legitimately hold two records with entirely different sets of fields.
Adding a new field is as simple as starting to include it in records.
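To make the idea of records with different fields concrete, here is a minimal sketch that uses plain Python dictionaries in place of a real document store. No MongoDB driver is involved; the field names and the find helper are hypothetical and only illustrate the shape of the data.

```python
# Two hypothetical documents living in the same "collection".
# They share no schema: each record simply carries the fields it has.
items = [
    {"_id": 1, "name": "Widget", "color": "red"},
    {"_id": 2, "name": "Gadget", "weight_kg": 1.5, "tags": ["new", "fragile"]},
]


def find(collection, **criteria):
    """Return documents whose fields equal every criterion.

    A document that lacks a requested field never matches.
    """
    missing = object()  # sentinel so that a stored None is not confused with "absent"
    return [
        doc for doc in collection
        if all(doc.get(key, missing) == value for key, value in criteria.items())
    ]


# Adding a new field needs no schema change: just start including it.
items.append({"_id": 3, "name": "Gizmo", "color": "red", "warranty_months": 24})

red_items = find(items, color="red")
```

With a real document database the records would be stored and queried by the server, but the data model is the same: each record is a self-describing set of fields.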
In my experience, a fully flexible/dynamic schema in an RDBMS like SQL Server can be a bit of a pain, and achieving high performance with it is challenging. I've had experience with options 1) and 3) that you listed. When data was stored as XML, I usually ended up needing to shred it out into relational form anyway for certain purposes.
It must be said: being everything to everyone at full performance is not 100% realistic.
Assuming you're using a relational database, I would go with option #1. You still have the opportunity to take advantage of the storage design that makes an RDBMS quick. You could reduce your security risk by using stored procedures to make your DDL changes and limiting the execute rights on those SPs.
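The idea of funnelling all DDL through one narrow, controlled entry point can be sketched as follows. This is only an illustration under stated assumptions: it uses SQLite, which has no stored procedures or execute rights, so a Python function stands in for the restricted procedure. The names add_user_column, ALLOWED_TYPES and the item table are hypothetical.

```python
import re
import sqlite3

# Assumption: users may only create columns of these types.
ALLOWED_TYPES = {"text": "TEXT", "integer": "INTEGER", "real": "REAL"}
# Assumption: identifiers are limited to letters, digits and underscores.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,29}")


def add_user_column(conn, table, column, type_name):
    """Single entry point for schema changes; validates everything it is given."""
    if type_name not in ALLOWED_TYPES:
        raise ValueError(f"unsupported type: {type_name!r}")
    for identifier in (table, column):
        if not IDENTIFIER.fullmatch(identifier):
            raise ValueError(f"invalid identifier: {identifier!r}")
    # Identifiers cannot be bound as query parameters, hence the strict checks above.
    conn.execute(
        f'ALTER TABLE "{table}" ADD COLUMN "{column}" {ALLOWED_TYPES[type_name]}'
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY)")
add_user_column(conn, "item", "color", "text")

# Column names as reported by SQLite after the change.
columns = [row[1] for row in conn.execute("PRAGMA table_info(item)")]
```

In SQL Server the same checks would live inside the stored procedure, and the application account would be granted execute rights on that procedure only, not general DDL rights.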
Option 2 can be done, but there could be maintenance problems when trying to figure out if the widget color is stored in UDFText39 or UDFText52.
"Large amount of data" would seem to rule out option 3 unless you go with a non-relational solution. In an RDBMS, that would be quite slow.
Option #4 is an all-around bad idea as you're forced to not only mix data domains (colors, sizes, etc.) but data types as well. Stay away from this one.
I would say that the cleanest solution would be #4.
Create one table per data type you want to use:
- numeric value
- string value
- datetime value
- ...
That way you do not have one incredibly huge table, and you are strongly typed at the same time.
The only limitation:
You are limited to a fixed number of supported data types. But this is, IMHO, a natural limitation.
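A minimal sketch of the table-per-data-type layout described above, using SQLite so it runs anywhere. The table names, the set_property helper and the sample data are assumptions made for illustration, and datetimes are stored as ISO 8601 text because SQLite has no native datetime type.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER PRIMARY KEY);
CREATE TABLE numeric_value  (item_id INTEGER REFERENCES item(id), name TEXT, value REAL,
                             PRIMARY KEY (item_id, name));
CREATE TABLE string_value   (item_id INTEGER REFERENCES item(id), name TEXT, value TEXT,
                             PRIMARY KEY (item_id, name));
CREATE TABLE datetime_value (item_id INTEGER REFERENCES item(id), name TEXT, value TEXT,
                             PRIMARY KEY (item_id, name));
""")


def set_property(conn, item_id, name, value):
    """Route a value to the table that matches its Python type."""
    if isinstance(value, bool):
        raise TypeError("booleans are not a supported type in this sketch")
    if isinstance(value, (int, float)):
        table = "numeric_value"
    elif isinstance(value, str):
        table = "string_value"
    elif isinstance(value, datetime):
        table, value = "datetime_value", value.isoformat()
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")
    # The table name comes from the fixed choices above, never from user input.
    conn.execute(
        f"INSERT OR REPLACE INTO {table} (item_id, name, value) VALUES (?, ?, ?)",
        (item_id, name, value),
    )


conn.executemany("INSERT INTO item (id) VALUES (?)", [(1,), (2,), (3,)])
set_property(conn, 1, "price", 9.5)
set_property(conn, 2, "price", 100)
set_property(conn, 3, "price", 20)
set_property(conn, 1, "color", "red")
set_property(conn, 1, "created", datetime(2024, 1, 31, 12, 0))

# Because prices are stored as numbers, sorting is numeric rather than textual.
ids_by_price = [
    row[0]
    for row in conn.execute(
        "SELECT item_id FROM numeric_value WHERE name = 'price' ORDER BY value"
    )
]
```

Had the prices been stored as text in a single value column, the same ORDER BY would have compared "100", "20" and "9.5" character by character and returned a different order.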
I don't think you will be able to use EF or another ORM framework without modification. You will need custom code, but we like to build new things, don't we?
I see two not-so-bad solutions:
1) Similar to your solution 1: use two tables, one holding the column definitions and a second holding the data.
The data table contains generically named columns filled with the actual data.
When you query one of your "data types", you first read its definition and then query the actual data. You can also store additional attributes in the definition table, such as validators.
3) If you are using MS SQL >= 2008, the third solution looks like a good one. I would still recommend a separate table for every "data type".
I don't recommend solution 2; it looks like a bad hack.
Solution 4 looks clean, but that approach isn't appropriate for large datasets.
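The two-table idea in 1) above (read the definition first, then query the generically named data columns) could look roughly like this. It is a sketch under assumptions, not the answer's original schema: the tables field_definition and entity_data, their columns, and the load_entities helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE field_definition (
    entity_type    TEXT,
    field_name     TEXT,
    data_type      TEXT,
    storage_column TEXT,  -- which generic column holds this field
    PRIMARY KEY (entity_type, field_name)
);
CREATE TABLE entity_data (
    id INTEGER PRIMARY KEY,
    entity_type TEXT,
    text1 TEXT, text2 TEXT, num1 REAL, num2 REAL
);
INSERT INTO field_definition VALUES
    ('widget', 'color', 'text',   'text1'),
    ('widget', 'price', 'number', 'num1');
INSERT INTO entity_data (entity_type, text1, num1) VALUES
    ('widget', 'red', 9.5),
    ('widget', 'blue', 4.0);
""")

# Only these physical columns may ever appear in generated SQL.
GENERIC_COLUMNS = {"text1", "text2", "num1", "num2"}


def load_entities(conn, entity_type):
    """Read the definitions, then query the data under user-facing field names."""
    definitions = conn.execute(
        "SELECT field_name, storage_column FROM field_definition "
        "WHERE entity_type = ? ORDER BY field_name",
        (entity_type,),
    ).fetchall()
    if not definitions:
        return []
    for _, column in definitions:
        if column not in GENERIC_COLUMNS:
            raise ValueError(f"unknown storage column: {column!r}")
    select_list = ", ".join(column for _, column in definitions)
    rows = conn.execute(
        f"SELECT id, {select_list} FROM entity_data WHERE entity_type = ? ORDER BY id",
        (entity_type,),
    )
    names = [name for name, _ in definitions]
    return [{"id": row[0], **dict(zip(names, row[1:]))} for row in rows]


widgets = load_entities(conn, "widget")
```

The definition table is also what answers the maintenance question of which generic column holds which field, since the mapping is data rather than tribal knowledge.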
It looks like this would be a good candidate for the EAV (entity-attribute-value) pattern, which is similar to one of the options you described.
See: Entity-attribute-value pattern.
For further reading at a slightly more conceptual level, see the article on the attribute-value system.
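For readers unfamiliar with EAV, the following sketch shows a single properties table and what a filter-plus-sort query on user-defined attributes looks like against it. The table name, the sample attributes and the items_by_size helper are assumptions for illustration; SQLite is used only so the example is runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE property (
    item_id INTEGER,
    name    TEXT,
    value   TEXT,  -- every value is stored as text in plain EAV
    PRIMARY KEY (item_id, name)
);
INSERT INTO property VALUES
    (1, 'color', 'red'),  (1, 'size', '10'),
    (2, 'color', 'blue'), (2, 'size', '9'),
    (3, 'color', 'red'),  (3, 'size', '2');
""")


def items_by_size(conn, color):
    """Filter on one attribute and sort on another.

    Each attribute involved in the query costs one self-join, and the
    text value has to be cast back to a number before sorting.
    """
    query = """
        SELECT c.item_id, CAST(s.value AS INTEGER) AS size
        FROM property AS c
        JOIN property AS s
          ON s.item_id = c.item_id AND s.name = 'size'
        WHERE c.name = 'color' AND c.value = ?
        ORDER BY size
    """
    return conn.execute(query, (color,)).fetchall()


red_by_size = items_by_size(conn, "red")
```

The self-join per attribute and the cast from text are the practical costs behind the objections raised elsewhere in this thread about table size and mixed data types.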