如何构建(标准化?)物理参数数据库?

发布于 2024-08-30 21:51:46 字数 2145 浏览 3 评论 0原文

我收集了与不同项目相关的物理参数。例如:

Item, p1, p2, p3
a,     1,  2,  3
b,     4,  5,  6
[...]

其中 px 代表参数 x

我可以继续按照所提供的方式存储数据库;架构将是

CREATE TABLE t1 (item TEXT PRIMARY KEY, p1 FLOAT, p2 FLOAT, p3 FLOAT);

我可以使用以下语句检索所有项目的参数 p1

SELECT p1 FROM t1;

第二种选择是采用如下架构:

CREATE TABLE t1 (id INT PRIMARY KEY, item TEXT, par TEXT, val FLOAT)

如果您有很多参数(就像我一样),这看起来会简单得多。然而,参数检索似乎很尴尬:

SELECT val FROM t1 WHERE par == 'p1'

你有什么建议?应该选择“pivoted”(第一个)版本还是“id, par, val”(第二个)版本?

非常感谢。

编辑

作为参考,我在SQLAlchemy 示例站点(垂直映射):

"""Mapping a vertical table as a dictionary.

This example illustrates accessing and modifying a "vertical" (or
"properties", or pivoted) table via a dict-like interface.  These are tables
that store free-form object properties as rows instead of columns.  For
example, instead of::

  # A regular ("horizontal") table has columns for 'species' and 'size'
  Table('animal', metadata,
        Column('id', Integer, primary_key=True),
        Column('species', Unicode),
        Column('size', Unicode))

A vertical table models this as two tables: one table for the base or parent
entity, and another related table holding key/value pairs::

  Table('animal', metadata,
        Column('id', Integer, primary_key=True))

  # The properties table will have one row for a 'species' value, and
  # another row for the 'size' value.
  Table('properties', metadata
        Column('animal_id', Integer, ForeignKey('animal.id'),
               primary_key=True),
        Column('key', UnicodeText),
        Column('value', UnicodeText))

Because the key/value pairs in a vertical scheme are not fixed in advance,
accessing them like a Python dict can be very convenient.  The example below
can be used with many common vertical schemas as-is or with minor adaptations.

"""

I have a collection of physical parameters associated with different items. For example:

Item, p1, p2, p3
a,     1,  2,  3
b,     4,  5,  6
[...]

where px stands for parameter x.

I could go ahead and store the database exactly as presented; the schema would be

CREATE TABLE t1 (item TEXT PRIMARY KEY, p1 FLOAT, p2 FLOAT, p3 FLOAT);

I could retrieve the parameter p1 for all the items with the statement:

SELECT p1 FROM t1;

A second alternative is to have an schema like:

CREATE TABLE t1 (id INT PRIMARY KEY, item TEXT, par TEXT, val FLOAT)

This seems much simpler if you have many parameters (as I do). However, the parameter retrieval seems very awkward:

SELECT val FROM t1 WHERE par == 'p1'

What do you advice? Should go for the "pivoted" (first) version or the id, par, val (second) version?

Many thanks.

EDIT

For reference, I found the following persistence pattern in the SQLAlchemy examples site (the vertical mapping):

"""Mapping a vertical table as a dictionary.

This example illustrates accessing and modifying a "vertical" (or
"properties", or pivoted) table via a dict-like interface.  These are tables
that store free-form object properties as rows instead of columns.  For
example, instead of::

  # A regular ("horizontal") table has columns for 'species' and 'size'
  Table('animal', metadata,
        Column('id', Integer, primary_key=True),
        Column('species', Unicode),
        Column('size', Unicode))

A vertical table models this as two tables: one table for the base or parent
entity, and another related table holding key/value pairs::

  Table('animal', metadata,
        Column('id', Integer, primary_key=True))

  # The properties table will have one row for a 'species' value, and
  # another row for the 'size' value.
  Table('properties', metadata
        Column('animal_id', Integer, ForeignKey('animal.id'),
               primary_key=True),
        Column('key', UnicodeText),
        Column('value', UnicodeText))

Because the key/value pairs in a vertical scheme are not fixed in advance,
accessing them like a Python dict can be very convenient.  The example below
can be used with many common vertical schemas as-is or with minor adaptations.

"""

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

全部不再 2024-09-06 21:51:46

除了第二种方法的灵活性之外,另一个优点是参数可以是参数表中的行,将有关该参数的数据存储为数据库的一部分,而不是作为模式的列。它还自然地导致数据的 RDF 三重表示。

顺便说一句,您不需要添加的键字段,使 item 和 par 成为联合主键

CREATE TABLE t1 ( item TEXT, par TEXT, val FLOAT, PRIMARY KEY (item, par))

第二种方法的一个限制是所有参数的值的数据类型必须相同 - 如果所有参数都浮动,则可以,但为了一般性,这可能必须是字符串,伴随而来的验证丢失和需要编程数据转换。

查询速度会受到影响,但您可以通过查询获取某个术语的所有参数,这样的查询

SELECT par,value FROM t1 WHERE item='qitem'

比其他查询更容易转换为演示格式。

In addition to the flexibility of the second approach, a further advantage is that parameters can then be rows in a parameter table, storing data about that parameter as part of the database, rather than as columns of the schema. It also leads naturally to an RDF triple representation of the data.

BTW you don't need the added key field, make item and par a joint primary key

CREATE TABLE t1 ( item TEXT, par TEXT, val FLOAT, PRIMARY KEY (item, par))

One limitation of the second approach is that the datatype of value must be the same for all parameters - OK if all floats but for generality this might have to be string with attendant loss of validation and the need for programatic data conversion.

Query speed will be affected, but you can get all the parameters for a term with a query like

SELECT par,value FROM t1 WHERE item='qitem'

which is easier to transform to a presentation format than the alternative.

心的憧憬 2024-09-06 21:51:46

这是性能与可扩展性的权衡。如果您从不打算添加更多 px 列,我认为使用第一种方法是安全的,但是,如果您预计将来会有更多 px 列,您可能需要使用名称/值方法。

如果最终获得大量数据,名称-值方法可能会降低性能。使用静态列方法,您的查询将会更快。

您还可以进行混合,从静态列方法开始,并添加对锁定项目的“扩展”表的支持,以便您将来可以添加其他属性。

It's a trade-off of performance vs. extensibility. If you never intend to add any more px columns, I think you are safe with using the first approach, however, if you anticipate that there will be more px columns in the future, you might want to do the name/value approach.

The name-value approach can get nasty for performance if you end up with a lot of data. Your queries will be faster using the static column approach.

You could also do a hybrid where you start with the static column approach, and add support for an "extension" table that keys off the Item, so that you can add additional properties in the future.

明天过后 2024-09-06 21:51:46

标准化途径是将 pX 值放置在按 ID 引用的表中。

ID     Item
1      a 
2      b 
3      c 


ID     Item    P
1      1       1
2      1       2
3      1       3

The normalized route would be to place the pX values in a table reference by ID.

ID     Item
1      a 
2      b 
3      c 


ID     Item    P
1      1       1
2      1       2
3      1       3
执妄 2024-09-06 21:51:46

如果您假设 px 将增长到超过三个值(p1、p2 和 p3)到 p4 等,那么第一种方法将会失败,而继续添加 p4、p5 等列似乎是一种有缺陷的方法。

老实说,吸引我的方法是将项目和参数分离到不同的表中,然后使用链接实体将它们连接起来:

Item
  |
-----
| | |
ItemParameter

| | |
-----
  |
Parameter

这样一个项目可以有许多参数,并且一个参数可以存在于许多项目中。

所以项目 a 可以有参数 p1、p2 和 p3

(Item)
a
(ItemParameter)
a p1
a p2
a p2
(Parameter)
p1
p2
p3

或者项目 b 可以有参数 p1、p2、p6、p10 和 p19

(Item)
b
(ItemParameter)
b p1
b p2
b p6
b p10
b p19
(Parameter)
p1
p2
p6
p10
p19

等等

If you are assuming that px will grow beyond three values (p1, p2 and p3) onto p4 and so on then the first approach is going to fail, and keeping adding on columns for p4, p5 and so on seems a flawed approach.

To be honest the approach that attracts me would be to separate the item and the parameters into different tables and then use a link entity to join them:

Item
  |
-----
| | |
ItemParameter

| | |
-----
  |
Parameter

So that an item can have many parameters and a parameter can exist for many items.

So Item a can have Parameters p1, p2 and p3

(Item)
a
(ItemParameter)
a p1
a p2
a p2
(Parameter)
p1
p2
p3

Or Item b can have Parameters p1, p2, p6, p10 and p19

(Item)
b
(ItemParameter)
b p1
b p2
b p6
b p10
b p19
(Parameter)
p1
p2
p6
p10
p19

and so on

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文