What is the best way to store multi-level descriptions in SQL?

Posted 2024-10-09 13:11:21

I need a new perspective on how to design a reliable and efficient SQL database to store multi-level arrays of data.

This problem applies to many situations but I came up with this example:
There are hundreds of products. Each product has an undefined number of parts. Each part is built from several elements.

All products are described in the same way. All parts would require the same fields to describe them (let's say: price, weight, part name), all elements of all parts also have uniform design (for example: element code, manufacturer). Plain and simple.
Each element belongs to only one part, and each part belongs to only one product.

I came up with the idea of three tables:

Products:
--------------------------------------------
prod_id   prod_name   prod_price   prod_desc
      1      hoover          120      unused

Parts:
----------------------------------------------------
part_id  part_name  part_price  part_weight  prod_id
      3     engine          10           20        1

and finally

Elements:
---------------------------------------
el_id  el_code  el_manufacturer part_id
    1    BFG12               GE       3

Now, select a desired product, select all from PARTS where prod_id is the same, and then select all from ELEMENTS where part_id matches - after multiple queries you've got all data.

I'm just not sure if this is the right approach.
I also have another idea, without the ELEMENTS table.
That would reduce the number of queries, but I'm a bit afraid it might be lame and bad practice.
Instead of the ELEMENTS table, there are two more fields in the PARTS table, so it looks like this:
part_id, part_name, part_price, part_weight, prod_id, part_el_code, part_el_manufacturer
They would be of type text, and for each part, information about its elements would be stored as strings, like this:

part_el_code         | code_of_element1; code_of_element2; code_of_element3

part_el_manufacturer | manuf_of_element1; manuf_of_element2; manuf_of_element3

Then all we need is to explode() data from those fields, and we get arrays, easy to display.
Of course this is not perfect and has some limitations, but is this idea ok?

The reason I came up with the second variant is that the third table, Elements, would eventually become quite big. If there are 10,000 products, 4 parts for every product, and on average 3 elements per part, there would be 120,000 rows in the Elements table. To be honest, I don't really know whether that would cause performance problems.

Should I just go with the first idea? Or maybe there is a better approach to this problem?

雨落□心尘 2024-10-16 13:11:21

Describing relationships between things is exactly what relational databases do. Any book on SQL should give many examples, and I suggest finding one to understand these issues better, but here is a quick answer to your question.

You are on the right track for the simple solution with your 3 tables, but you don't need three select statements. That is what a join is for. With your example, the following select statement gives you all the data for a particular product.

select *
from product
join parts on parts.prod_id = product.prod_id
join elements on elements.part_id = parts.part_id
where product.prod_id = x

This is fine and will do exactly what you require.
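
As a rough illustration of the three-table design and the single join, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The column types, the two indexes, and the second element row ('XK7', 'ACME') are assumptions added for the example; the other names and values come from the question.

```python
import sqlite3

# In-memory database; table and column names follow the question's example.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE product (
    prod_id    INTEGER PRIMARY KEY,
    prod_name  TEXT NOT NULL,
    prod_price INTEGER,
    prod_desc  TEXT
);
CREATE TABLE parts (
    part_id     INTEGER PRIMARY KEY,
    part_name   TEXT NOT NULL,
    part_price  INTEGER,
    part_weight INTEGER,
    prod_id     INTEGER NOT NULL REFERENCES product(prod_id)
);
CREATE TABLE elements (
    el_id           INTEGER PRIMARY KEY,
    el_code         TEXT NOT NULL,
    el_manufacturer TEXT,
    part_id         INTEGER NOT NULL REFERENCES parts(part_id)
);
-- Assumed indexes on the foreign keys, so the joins can use index lookups.
CREATE INDEX idx_parts_prod ON parts(prod_id);
CREATE INDEX idx_elements_part ON elements(part_id);
""")

conn.execute("INSERT INTO product VALUES (1, 'hoover', 120, 'unused')")
conn.execute("INSERT INTO parts VALUES (3, 'engine', 10, 20, 1)")
conn.execute("INSERT INTO elements VALUES (1, 'BFG12', 'GE', 3)")
# Made-up second element, added only to show several rows for one part.
conn.execute("INSERT INTO elements VALUES (2, 'XK7', 'ACME', 3)")

# One query returns the product with all of its parts and elements.
rows = conn.execute("""
    SELECT product.prod_name, parts.part_name,
           elements.el_code, elements.el_manufacturer
    FROM product
    JOIN parts    ON parts.prod_id = product.prod_id
    JOIN elements ON elements.part_id = parts.part_id
    WHERE product.prod_id = ?
    ORDER BY elements.el_id
""", (1,)).fetchall()

for row in rows:
    print(row)
```

Each result row repeats the product and part columns once per element, so the application groups the rows by part when it displays them.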

As for your second idea, it is really bad. You are working with a relational database, which lets you store sets of data and relate them. Storing a list in a string value ignores exactly the functionality that relational databases are designed to do well.
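
One way to see the cost of the packed-string variant is to ask which parts use a given element code. The sketch below, again with Python's sqlite3, compares the two layouts; the table name parts_packed and the sample codes are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Second variant from the question: element codes packed into one text column.
CREATE TABLE parts_packed (
    part_id      INTEGER PRIMARY KEY,
    part_name    TEXT,
    part_el_code TEXT
);
-- First variant: one row per element.
CREATE TABLE elements (
    el_id   INTEGER PRIMARY KEY,
    el_code TEXT,
    part_id INTEGER
);
""")
# Sample codes are hypothetical.
conn.executemany("INSERT INTO parts_packed VALUES (?, ?, ?)", [
    (3, "engine", "BFG12; AB1"),
    (4, "pump", "BFG1; CD2"),
])
conn.executemany("INSERT INTO elements (el_code, part_id) VALUES (?, ?)", [
    ("BFG12", 3), ("AB1", 3), ("BFG1", 4), ("CD2", 4),
])

# Packed column: a substring search for 'BFG1' also matches 'BFG12'.
packed = conn.execute(
    "SELECT part_id FROM parts_packed WHERE part_el_code LIKE ? ORDER BY part_id",
    ("%BFG1%",),
).fetchall()

# Separate table: a plain equality test, which an index on el_code could serve.
normalized = conn.execute(
    "SELECT part_id FROM elements WHERE el_code = ? ORDER BY part_id",
    ("BFG1",),
).fetchall()

print(packed)      # [(3,), (4,)]
print(normalized)  # [(4,)]
```

The packed layout also keeps codes and manufacturers in two parallel strings that must stay in the same order, something the database cannot check for you.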

However, there is a way to improve your design. I think the relationship between product and part, or between part and element, might not always be one-to-many. So I would suggest the following.

First remove prod_id from the parts table and then remove part_id from the elements table.

Then create two "joining tables". Let's call them prodparts and partselement. They would look like this:

prodparts would contain two columns prod_id and part_id (and maybe a count?)
partselement would contain two columns part_id and el_id (and maybe a count?)

Now each row in these tables describes one relationship. And this is the neat part: multiple products can use the same part, and multiple parts can use the same element.

The select is only slightly more complicated and looks like this:

select *
from product
join prodparts on prodparts.prod_id = product.prod_id
join parts on parts.part_id = prodparts.part_id
join partselement on partselement.part_id = parts.part_id
join elements on elements.el_id = partselement.el_id
where product.prod_id = x
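
The joining-table design can be sketched the same way with Python's sqlite3. The column types, the composite primary keys, the qty column (standing in for the optional count), the helper product_rows, and the second product 'blower' are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product  (prod_id INTEGER PRIMARY KEY, prod_name TEXT);
CREATE TABLE parts    (part_id INTEGER PRIMARY KEY, part_name TEXT);
CREATE TABLE elements (el_id   INTEGER PRIMARY KEY, el_code   TEXT);
-- Joining tables: one row per relationship; qty plays the role of the optional count.
CREATE TABLE prodparts (
    prod_id INTEGER NOT NULL REFERENCES product(prod_id),
    part_id INTEGER NOT NULL REFERENCES parts(part_id),
    qty     INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (prod_id, part_id)
);
CREATE TABLE partselement (
    part_id INTEGER NOT NULL REFERENCES parts(part_id),
    el_id   INTEGER NOT NULL REFERENCES elements(el_id),
    qty     INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (part_id, el_id)
);
""")

# 'blower' is a made-up second product that reuses the same engine part.
conn.executemany("INSERT INTO product VALUES (?, ?)", [(1, "hoover"), (2, "blower")])
conn.execute("INSERT INTO parts VALUES (3, 'engine')")
conn.execute("INSERT INTO elements VALUES (1, 'BFG12')")
conn.executemany("INSERT INTO prodparts VALUES (?, ?, ?)", [(1, 3, 1), (2, 3, 2)])
conn.execute("INSERT INTO partselement VALUES (3, 1, 4)")

def product_rows(prod_id):
    """Return (product, part, quantity of that part, element code) rows for one product."""
    return conn.execute("""
        SELECT product.prod_name, parts.part_name, prodparts.qty, elements.el_code
        FROM product
        JOIN prodparts    ON prodparts.prod_id = product.prod_id
        JOIN parts        ON parts.part_id = prodparts.part_id
        JOIN partselement ON partselement.part_id = parts.part_id
        JOIN elements     ON elements.el_id = partselement.el_id
        WHERE product.prod_id = ?
    """, (prod_id,)).fetchall()

print(product_rows(1))
print(product_rows(2))
```

The engine and its element are stored once, and both products reach them through the joining tables.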
