在数据库结构中保存带有计算列的用户定义视图

发布于 2024-10-14 18:05:36 字数 2920 浏览 1 评论 0原文

我从数据库表中组织的各种传感器获取数据。

我让用户能够创建自己的数据视图，添加列，每个传感器一个，按数据分组，就像这样

 View #1
+---------------------+------+----+-----+
|      datetime       |  v1  | v2 |  v3 |
+---------------------+------+----+-----+
| 2010-09-13 00:05:00 | 40.9 |  1 | 0.3 |
| 2010-09-13 00:10:00 | 41.0 |  2 | 0.3 |
| 2010-09-13 00:15:00 | 41.1 |  4 | 0.3 |
+---------------------+------+----+-----+

为了存储这种视图，我使用两个表：views 和 view_opts

views
    id
    name

view_opts
    id
    id_view
    id_sensor
    ord #for column ordering

现在我必须更进一步：添加更多列，不是直接来自传感器数据，而是作为其他列的计算，如下所示：

 View #1
+---------------------+------+----+-----------+-----+
|      datetime       |  v1  | v2 | (v1+v2)/2 |  v3 |
+---------------------+------+----+-----------+-----+
| 2010-09-13 00:05:00 | 40.9 |  1 |   20.95   | 0.3 |
| 2010-09-13 00:10:00 | 41.0 |  2 |   21.5    | 0.3 |
| 2010-09-13 00:15:00 | 41.1 |  4 |   22.55   | 0.3 |
+---------------------+------+----+-----------+-----+

如果这在 SQL 中不难实现（我将旧的 SELECT 包装在一个新的 SELECT 中进行计算），我必须扩展我存储的方式以某种方式查看。

我想了几种方法，都有一些优点和很多缺点......

有什么好主意吗？

================

编辑：

我的想法之一是向 view-opts 添加一个文本字段，让用户以类似 SQL 的语法添加自己的计算：

(v1+v2)/2

以防传感器外键view_opt 中的文本为空，用于查询，类似于：

SELECT v1, v2, (v1+v2)/2, v3 FROM (
    SELECT v1, v2, v3 FROM ...
)

当然我应该解析 (v1+v2)/2 部分以查找语法错误、注入（不太可能，此功能仅用于可信管理）没有兴趣破坏自己数据的用户）。

===============

添加：

这里是我的数据库中关键表的一些DDL（我修剪了这个问题不需要的一些字段和表）：

CREATE TABLE acquisitions (
    id INTEGER NOT NULL, 
    id_cu INTEGER NOT NULL, 
    datetime DATETIME NOT NULL, 
    PRIMARY KEY (id), 
    FOREIGN KEY(id_cu) REFERENCES ctrl_units (id), 
)

CREATE TABLE data (
    id INTEGER NOT NULL, 
    id_acq INTEGER NOT NULL, 
    id_meas INTEGER NOT NULL, 
    value FLOAT, 
    PRIMARY KEY (id), 
    FOREIGN KEY(id_acq) REFERENCES acquisitions (id)
)

CREATE TABLE sensors (
    id_cu INTEGER NOT NULL, 
    id_meas INTEGER NOT NULL, 
    name VARCHAR(40) NOT NULL, 
    PRIMARY KEY (id_cu, id_meas), 
    FOREIGN KEY(id_cu) REFERENCES ctrl_units (id)
)

CREATE TABLE ctrl_units (
id INTEGER NOT NULL, 
name VARCHAR(40) NOT NULL, 
PRIMARY KEY (id)
)

简要描述 ：我有 ctrl_units 通过网络以 csv 之类的文件发送数据，每个 ctrl_unit 有许多 传感器，一个 传感器 被标识为id_cu-id_meas。

这不是我找到的解决方案，而是 ctrl_units 通过网络发送数据的真正方式。

每次采集都是 csv 中的一行，包含日期时间和许多传感器测量值，如 id_meas, value（简化版本：也就是说，每个传感器在采集中发送更多测量值，如 < code>id_meas, id_elab, value 其中 id_elab 根据传感器类型有不同的含义）

以这种方式工作，传感器的数据（结果列）由其控制单元 ID（存储在父采集）及其测量类型 ID（存储在单个数据中）。我使用此查询来提取数据。

原文

I have acquisition data from various sensors organized in db tables.

I give user the ability to create his own Data Views adding columns, one for each sensor, grouped by data, like this

 View #1
+---------------------+------+----+-----+
|      datetime       |  v1  | v2 |  v3 |
+---------------------+------+----+-----+
| 2010-09-13 00:05:00 | 40.9 |  1 | 0.3 |
| 2010-09-13 00:10:00 | 41.0 |  2 | 0.3 |
| 2010-09-13 00:15:00 | 41.1 |  4 | 0.3 |
+---------------------+------+----+-----+

To store this kind of Views I use two tables: views and view_opts

views
    id
    name

view_opts
    id
    id_view
    id_sensor
    ord #for column ordering

Now I have to go one step further: add more columns, not coming directly from sensor data but as a computation of other colums, like this:

 View #1
+---------------------+------+----+-----------+-----+
|      datetime       |  v1  | v2 | (v1+v2)/2 |  v3 |
+---------------------+------+----+-----------+-----+
| 2010-09-13 00:05:00 | 40.9 |  1 |   20.95   | 0.3 |
| 2010-09-13 00:10:00 | 41.0 |  2 |   21.5    | 0.3 |
| 2010-09-13 00:15:00 | 41.1 |  4 |   22.55   | 0.3 |
+---------------------+------+----+-----------+-----+

If this is not hard to achieve in SQL (I wrap the old SELECT in a new one that makes computation) I have to extend the way I store Views someway.

I thought at several ways, all with some pro and a lot of cons...

Any good idea?

================

Edit:

one of my thoughts was to add a text field to view-opts to let user add his own computations in SQL-like syntax:

(v1+v2)/2

in case sensor foreign keys are null in view_opt the text is used in query, something like:

SELECT v1, v2, (v1+v2)/2, v3 FROM (
    SELECT v1, v2, v3 FROM ...
)

Of course I should parse the (v1+v2)/2 part for syntax errors, injections (improbable, this functionality is only for trusted administration users that have no interest in damaging their own data).

===============

Add:

here are som DDL for key tables in my database (I trimmed some fields and tables not needed for this question):

CREATE TABLE acquisitions (
    id INTEGER NOT NULL, 
    id_cu INTEGER NOT NULL, 
    datetime DATETIME NOT NULL, 
    PRIMARY KEY (id), 
    FOREIGN KEY(id_cu) REFERENCES ctrl_units (id), 
)

CREATE TABLE data (
    id INTEGER NOT NULL, 
    id_acq INTEGER NOT NULL, 
    id_meas INTEGER NOT NULL, 
    value FLOAT, 
    PRIMARY KEY (id), 
    FOREIGN KEY(id_acq) REFERENCES acquisitions (id)
)

CREATE TABLE sensors (
    id_cu INTEGER NOT NULL, 
    id_meas INTEGER NOT NULL, 
    name VARCHAR(40) NOT NULL, 
    PRIMARY KEY (id_cu, id_meas), 
    FOREIGN KEY(id_cu) REFERENCES ctrl_units (id)
)

CREATE TABLE ctrl_units (
id INTEGER NOT NULL, 
name VARCHAR(40) NOT NULL, 
PRIMARY KEY (id)
)

Brief description: I have ctrl_units sending data over a network in csv like files, every ctrl_unit has many sensors, a single sensor is identified as id_cu-id_meas.

This is not a solution I found but the real way ctrl_units send data over the network.

Every acquisition is a row in csv, with datetime and many sensor measures as id_meas, value (simplified version: to say it all every sensor sends more measures in an acquisition as id_meas, id_elab, value where id_elab has different meanings according to sensor type)

Working this way data of a sensor (a result column) is identified by it's Control Unit id (stored in parent acquisition) and it's Measure Type id (stored in single data).
I use this query to extract data.

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

誰ツ都不明白 2024-10-21 18:05:36

为了能够保留查询中的顺序，并允许计算列出现在正确的位置，您确实需要 view_opts 表中的某些内容作为占位符。怎么样：

view_opts
id
id_view
isCalc bit --contains 0 for Sensor and 1 for Calc (not strictly necessary)
id_sensor --nullable
id_viewcalc --nullable
ord #for column ordering

可为空的列应该意味着可以维护 fk 关系。

然后是一个额外的计算表（以避免在 view_opts 表中产生垃圾邮件）：

view_calcs
id_viewcalc
...

然后您可以选择使一组预定义的计算可用并引用它们以及所需的列，或者允许一组完全可配置的计算（我建议第一个，因为第二个可能会导致一个可怕的用户界面）。

In order to make it possible to preserve the ordering in the query, and allow the computed columns appear in the right place, you really need something in the view_opts table as a place holder. What about something like:

view_opts
id
id_view
isCalc bit --contains 0 for Sensor and 1 for Calc (not strictly necessary)
id_sensor --nullable
id_viewcalc --nullable
ord #for column ordering

The nullable columns should mean that fk relations can be maintained.

Then an extra table for the calcs (to avoid spamming up the view_opts table):

view_calcs
id_viewcalc
...

You then have the option to make a predefined set of calcs available and reference these, along with the required columns, or to allow a fully configurable set of calcs (I'd advise the first as the second can end up with a horrid UI).

回复收藏 0 原文

征﹌骨岁月お 2024-10-21 18:05:36

如果用户可以创建视图，为什么不简单地GRANT CREATE VIEW TO user_role。他们可以编写自己的计算。然后他们将拥有 SQL 视图的全部功能。你或我能想到的任何东西都将是：

其中的一小部分，以及
需要额外的表和维护

实施

鉴于那么你必须实施它。

最好的方法仍然是尽可能少地使用应用程序代码，并尽可能多地使用 SQL 和 SQL 工具。解析输入请求并构建查询已经足够困难了，您不想每次提出新请求时都必须进行更改。

你所拥有的东西是相当硬编码且脆弱的；它们似乎是只做一件事的汇总表。相反，在数据库中构建向量。例如，维度-事实结构是数据仓库中常见的做法，而不是汇总表。如果您发布一些 DDL（3 或 4 个关键表用作源，用于您问题中的“视图”），我可以提供更具体的指示。

当然，如果您的数据库不是数据库，那么将无法实现数据库和 SQL 所期望的基本功能。有一半的时间，这些归档系统只需要重新实现，从非数据库提供普通功能并维护它们的成本远远高于重新实现的成本。

另一半时间，可以通过实现一些 SQL 视图来克服瘫痪的非数据库问题，每个视图都提供规范化表的功能。但这需要仔细分析，在做出是否可能的决定之前，这在 SO 上是不可能的。接下来，您可以在视图上构建向量或维度，就像在真实数据库中的表上一样。

每个向量或维度都是一个标量子查询，填充单个列。外部查询定义结果集（网格）的结构。子查询填充单元格。

最后，在您的应用程序中，它现在可以是一个更简单的解析器，给定输入数据，它可以获取一个或两个向量，或者一个向量除以另一个向量等。构建 SELECT 语句，然后 EXECUTE。

If the users can create Views, why don't you simply GRANT CREATE VIEW TO user_role. They can write their own calculations. Then they will have the full capability of SQL Views. Anything you or I could come up with will be:

a fraction of that, and
require additional tables and maintenance

Implementation

Given that you have to implement it, then.

The best method is still to do as little in your app code, and as much in SQL and SQL facilities, as possible. Parsing the input request, and building the query is difficult enough, and you don't want to have to change that every time they come up with a new request.

What you have is fairly hard-coded and brittle; they appear to be summary tables which do only one thing. Build Vectors in the database, instead. As in, the Dimension-Fact structures that are common practice in data warehouses, as opposed to summary tables. If you post some DDL (3 or 4 key tables being used as source, for the "views" in your question), I can provide more specific directions.

Of course, if your database is not a database, that will prevent basic functions that can be expected from databases and SQL. Half the time, these filing systems just need to be re-implemented, the cost of struggling to provide ordinary functions from non-databases and to maintain them is far more than the re-implementation cost.

The other half of the time, the crippled non-database can be overcome by implementing a few SQL Views, each of which provide the facility of a Normalised table. But that needs careful analysis, not possible here on SO, before the decision as to whether that is possible, can be made. Following that, you can then build your Vectors or Dimensions on the Views, as you would on the tables in a real database.

Each Vector or Dimension is a Scalar Subquery, populating a single column. The outer query defines the structure of the result set (grid). The subqueries populate the cells.

Last, in your app, which can now be a simpler parser, given the input data, it can grab a vector, or two, or a vector divided by another vector, etc. Build the SELECT statement, and then EXECUTE.

回复收藏 0 原文

~没有更多了~