Database table design for some unknown data
I don't come from a database design background, but I've been tasked with designing a web app where end users will enter products and the specs for those products. Normally I would just create a column for each type of spec they enter. However, they have a variety of products that don't share the same spec types, so my question is: what's the most efficient and future-proof way to organize this data? I was leaning towards pushing a serialized object into a generic "data" column, but would I then be able to do full-text searches on that data? Are there any other avenues to explore?
Split products and specifications into two tables.
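A minimal sketch of such a two-table layout, using Python's built-in sqlite3 module so it can be run as-is. The table and column names (product, specification, product_id, name, value) are assumptions for illustration, not taken from the answer.

```python
import sqlite3

# Hypothetical names: each specification row belongs to exactly one product.
TWO_TABLE_SCHEMA = """
CREATE TABLE product (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE specification (
    id         INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(id),
    name       TEXT NOT NULL,
    value      TEXT NOT NULL
);
"""


def create_two_table_db() -> sqlite3.Connection:
    """Create an in-memory database with the two tables."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(TWO_TABLE_SCHEMA)
    return conn


conn = create_two_table_db()
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```

Storing every value as text keeps the sketch simple; a real schema might want typed value columns.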
Get all the specifications of a product when you know the product id.
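Something along these lines would do the lookup. It is only a sketch: the schema, the sample rows and the helper name get_specifications are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE specification (
    id         INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(id),
    name       TEXT NOT NULL,
    value      TEXT NOT NULL
);
INSERT INTO product (id, name) VALUES (1, 'running shoe');
INSERT INTO specification (product_id, name, value) VALUES
    (1, 'size', '42'),
    (1, 'color', 'red');
""")


def get_specifications(conn, product_id):
    """Return (name, value) pairs for one product, ordered by name."""
    return conn.execute(
        "SELECT name, value FROM specification"
        " WHERE product_id = ? ORDER BY name",
        (product_id,),
    ).fetchall()


specs = get_specifications(conn, 1)
print(specs)
```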
Add a specification to a product when you know the product id, the specification's name, and the value of said specification.
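A hedged sketch of the insert, again with assumed table and column names. With foreign key enforcement switched on, SQLite rejects a specification that points at a product id that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE specification (
    id         INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(id),
    name       TEXT NOT NULL,
    value      TEXT NOT NULL
);
INSERT INTO product (id, name) VALUES (1, 'running shoe');
""")


def add_specification(conn, product_id, name, value):
    """Insert one specification row and return its new id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO specification (product_id, name, value)"
            " VALUES (?, ?, ?)",
            (product_id, name, value),
        )
    return cur.lastrowid


new_id = add_specification(conn, 1, "size", "42")

# Product 99 does not exist, so the foreign key check fails.
try:
    add_specification(conn, 99, "size", "42")
    missing_product_rejected = False
except sqlite3.IntegrityError:
    missing_product_rejected = True
```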
So before you can add specifications to a product, the product must exist. Also, you can't reuse specifications across several products. That would require a somewhat more complex solution :) namely...
Three tables this time.
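One way the three tables might look, as a sketch. The link table name product_specification and the UNIQUE constraint on (name, value) are assumptions, chosen so that one specification row can be shared by many products.

```python
import sqlite3

# Hypothetical names: specifications are stored once and linked to products.
THREE_TABLE_SCHEMA = """
CREATE TABLE product (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE specification (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    value TEXT NOT NULL,
    UNIQUE (name, value)
);
CREATE TABLE product_specification (
    product_id       INTEGER NOT NULL REFERENCES product(id),
    specification_id INTEGER NOT NULL REFERENCES specification(id),
    PRIMARY KEY (product_id, specification_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(THREE_TABLE_SCHEMA)

tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
print(tables)
```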
Get all the specifications of a product when you know the product id.
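With three tables the lookup needs a join through the link table. A sketch under the same assumed names (product, specification, product_specification), with made-up sample rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE specification (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL, value TEXT NOT NULL,
    UNIQUE (name, value)
);
CREATE TABLE product_specification (
    product_id       INTEGER NOT NULL REFERENCES product(id),
    specification_id INTEGER NOT NULL REFERENCES specification(id),
    PRIMARY KEY (product_id, specification_id)
);
INSERT INTO product (id, name) VALUES (1, 'runner'), (2, 'boot');
INSERT INTO specification (id, name, value) VALUES
    (1, 'size', '42'),
    (2, 'color', 'red');
-- Both products share the 'size 42' specification row.
INSERT INTO product_specification VALUES (1, 1), (1, 2), (2, 1);
""")


def get_specifications(conn, product_id):
    """Return (name, value) pairs linked to one product, ordered by name."""
    return conn.execute(
        """
        SELECT s.name, s.value
        FROM product_specification AS ps
        JOIN specification AS s ON s.id = ps.specification_id
        WHERE ps.product_id = ?
        ORDER BY s.name
        """,
        (product_id,),
    ).fetchall()


runner_specs = get_specifications(conn, 1)
boot_specs = get_specifications(conn, 2)
```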
Now, adding a specification becomes a little trickier, because you have to check whether that specification already exists. So this is a little heavier than the first approach, since there are more queries against the db and more logic in the application.
First, find the id of the specification.
If no id is returned, the specification doesn't exist yet, so it must be created.
Next, either use the id from the select query, or get the last insert id to find the id of the newly created specification. Use that id together with the id of the product that's getting the new specification, and link the two together.
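The three steps above (look up, create if missing, link) could be sketched like this. The function names and schema are assumptions, and the sketch ignores concurrent writers, which a real application would have to consider.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE specification (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL, value TEXT NOT NULL,
    UNIQUE (name, value)
);
CREATE TABLE product_specification (
    product_id       INTEGER NOT NULL REFERENCES product(id),
    specification_id INTEGER NOT NULL REFERENCES specification(id),
    PRIMARY KEY (product_id, specification_id)
);
INSERT INTO product (id, name) VALUES (1, 'runner'), (2, 'boot');
""")


def get_or_create_specification(conn, name, value):
    """Steps 1 and 2: find the specification id, creating the row if needed."""
    row = conn.execute(
        "SELECT id FROM specification WHERE name = ? AND value = ?",
        (name, value),
    ).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute(
        "INSERT INTO specification (name, value) VALUES (?, ?)",
        (name, value),
    )
    return cur.lastrowid  # the "last insert id"


def add_specification_to_product(conn, product_id, name, value):
    """Step 3: link the product and the specification in one transaction."""
    with conn:
        spec_id = get_or_create_specification(conn, name, value)
        conn.execute(
            "INSERT OR IGNORE INTO product_specification"
            " (product_id, specification_id) VALUES (?, ?)",
            (product_id, spec_id),
        )
    return spec_id


first = add_specification_to_product(conn, 1, "size", "42")
second = add_specification_to_product(conn, 2, "size", "42")
spec_count = conn.execute("SELECT COUNT(*) FROM specification").fetchone()[0]
link_count = conn.execute(
    "SELECT COUNT(*) FROM product_specification").fetchone()[0]
```

Both products end up linked to the same specification row, so the specification is stored only once.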
However, this means that you have to create one row for every specific specification. For example, if you have size for shoes, there would be one row for every known shoe size, and so on. I think this should be enough, though.
You could take a look at using an EAV model.
I've never built a products database, but I can point you to a data model for that. It's one of over 200 models available for the taking, at Database Answers. Here is the model
If you don't like this one, you can find 15 different data models for product-oriented databases. Click on "Data Models" to get a list and scroll down to "Products".
You should pick up some good design ideas there.
This is a pretty common problem - and there are different solutions for different scenarios.
If the different types of product and their attributes are fixed and known at development time, you could look at the description in Craig Larman's book (http://www.amazon.com/Applying-UML-Patterns-Introduction-Object-Oriented/dp/0131489062/ref=sr_1_1/002-2801511-2159202?ie=UTF8&s=books&qid=1194351090&sr=1-1) - there's a section on object-relational mapping and how to handle inheritance.
This boils down to "put all the possible columns into one table", "create one table for each sub class" or "put all base class items into a common table, and put sub class data into their own tables".
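To make the three options concrete, here is a rough sketch of each as a schema, run against SQLite from Python. The product types (shoe, laptop) and their columns are invented for illustration; they are not from Larman's book.

```python
import sqlite3

# 1. All possible columns in one table; unused columns stay NULL.
SINGLE_TABLE = """
CREATE TABLE product (
    id INTEGER PRIMARY KEY, type TEXT NOT NULL, name TEXT NOT NULL,
    shoe_size REAL, ram_gb INTEGER
);
"""

# 2. One complete table per sub class; no shared table.
TABLE_PER_SUBCLASS = """
CREATE TABLE shoe (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL, shoe_size REAL NOT NULL
);
CREATE TABLE laptop (
    id INTEGER PRIMARY KEY, name TEXT NOT NULL, ram_gb INTEGER NOT NULL
);
"""

# 3. Base class data in a common table, sub class data in its own tables.
SHARED_BASE_TABLE = """
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE shoe (
    product_id INTEGER PRIMARY KEY REFERENCES product(id),
    shoe_size REAL NOT NULL
);
CREATE TABLE laptop (
    product_id INTEGER PRIMARY KEY REFERENCES product(id),
    ram_gb INTEGER NOT NULL
);
"""


def table_names(schema):
    """Create the schema in a scratch database and list its tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema)
    return [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]


layouts = {
    "single_table": table_names(SINGLE_TABLE),
    "table_per_subclass": table_names(TABLE_PER_SUBCLASS),
    "shared_base_table": table_names(SHARED_BASE_TABLE),
}
```

Note that only the second and third layouts can put "not null" on the type-specific columns, since in the single table those columns must be nullable for other product types.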
This is by far the most natural way of working with a relational database - it allows you to create reports, use off-the-shelf tools for object relational mapping if that takes your fancy, and you can use standard concepts such as "not null", indexing etc.
Of course, if you don't know the data attributes at development time, you have to create a flexible database schema.
I've seen 3 general approaches.
The first is the one described by davogotland. I built a solution on similar lines for an ecommerce store; it worked great, and allowed us to be very flexible about the product database. It performed very well, even with half a million products.
The major drawback was writing retrieval queries - e.g. "find all products with a price under x, in category y, whose manufacturer is z". It was also tricky bringing in new developers - they had a fairly steep learning curve.
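To give a feel for why such queries get awkward, here is a sketch of that example against a name/value specification table. It assumes the simpler two-table layout with text values; the schema, data and function name are made up. Each filtered attribute costs one extra join, and numeric comparisons need a cast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE specification (
    id         INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(id),
    name       TEXT NOT NULL,
    value      TEXT NOT NULL
);
INSERT INTO product (id, name) VALUES
    (1, 'runner'), (2, 'hiker'), (3, 'sandal');
INSERT INTO specification (product_id, name, value) VALUES
    (1, 'price', '50'),  (1, 'category', 'shoes'), (1, 'manufacturer', 'acme'),
    (2, 'price', '150'), (2, 'category', 'shoes'), (2, 'manufacturer', 'acme'),
    (3, 'price', '40'),  (3, 'category', 'shoes'), (3, 'manufacturer', 'other');
""")


def find_products(conn, max_price, category, manufacturer):
    """One self-join on the specification table per attribute in the filter."""
    return conn.execute(
        """
        SELECT p.id, p.name
        FROM product AS p
        JOIN specification AS price
          ON price.product_id = p.id AND price.name = 'price'
        JOIN specification AS category
          ON category.product_id = p.id AND category.name = 'category'
        JOIN specification AS maker
          ON maker.product_id = p.id AND maker.name = 'manufacturer'
        WHERE CAST(price.value AS REAL) < ?
          AND category.value = ?
          AND maker.value = ?
        ORDER BY p.id
        """,
        (max_price, category, manufacturer),
    ).fetchall()


matches = find_products(conn, 100, "shoes", "acme")
```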
It also forced us to push a lot of relational concepts into the application layer. For instance, it was hard to create foreign keys to other tables (e.g. "manufacturer") and enforce them using standard SQL functionality.
The second approach I've seen is the one you mention - storing the variable data in some kind of serialized format. This is a pain when querying, and it has the same drawbacks with respect to the relational model. Overall, I'd only want to use serialization for data you don't have to be able to query or reason about.
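A small sketch of what the serialized approach can look like, assuming JSON in a text column named data (both choices are assumptions). Filtering on a spec here means reading and decoding every row in the application, because the database only sees an opaque string.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " id INTEGER PRIMARY KEY, name TEXT NOT NULL, data TEXT NOT NULL)")

# Each product carries its own set of specs, serialized as JSON text.
sample_products = [
    ("runner", {"size": 42, "color": "red"}),
    ("laptop", {"ram_gb": 16, "color": "grey"}),
]
with conn:
    conn.executemany(
        "INSERT INTO product (name, data) VALUES (?, ?)",
        [(name, json.dumps(specs)) for name, specs in sample_products],
    )


def find_by_spec(conn, key, value):
    """Scan all rows and decode each one; no index can help with this."""
    names = []
    for name, data in conn.execute(
            "SELECT name, data FROM product ORDER BY id"):
        if json.loads(data).get(key) == value:
            names.append(name)
    return names


red_products = find_by_spec(conn, "color", "red")
```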
The final solution I've seen is to accept that the addition of new product types will always require some level of development effort - you have to build the UI, if nothing else. I've seen applications which use a scaffolding style approach to automatically generate the underlying database structures when a new product type is created.
This is a fairly major undertaking - only really suitable for major projects, though the use of ORM tools often helps.
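As a very rough illustration of the scaffolding idea, the sketch below generates a table for a new product type from a field definition. Everything here is hypothetical (the function name build_create_table, the allowed types, the product_id convention), and a real implementation would also need migrations, UI generation and so on.

```python
import re
import sqlite3

# Hypothetical mapping from field kinds to SQLite column types.
ALLOWED_TYPES = {"text": "TEXT", "integer": "INTEGER", "real": "REAL"}
IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def build_create_table(type_name, fields):
    """Build a CREATE TABLE statement for one product type.

    Identifiers cannot be bound as SQL parameters, so they are checked
    against a strict pattern before being placed into the statement.
    An unknown field kind raises KeyError.
    """
    for identifier in (type_name, *fields):
        if not IDENTIFIER.fullmatch(identifier):
            raise ValueError(f"invalid identifier: {identifier!r}")
    columns = ["product_id INTEGER PRIMARY KEY REFERENCES product(id)"]
    for column, kind in fields.items():
        columns.append(f"{column} {ALLOWED_TYPES[kind]} NOT NULL")
    return f"CREATE TABLE {type_name} ({', '.join(columns)})"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

ddl = build_create_table("shoe", {"size": "real", "color": "text"})
conn.execute(ddl)

# Column names of the generated table, in declaration order.
shoe_columns = [row[1] for row in conn.execute("PRAGMA table_info(shoe)")]
```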