如何正确构建物联网传感器数据库模型？

发布于 2025-01-19 06:55:45 字数 2375 浏览 3 评论 0原文

我正在从事一个车间设备数据收集项目，旨在分析历史生产订单和实时数据（靠近操作员的 HMI）。

数据库实际状态：

数据从不同的设备（不同的协议）中提取出来，放入SQL Server中，结构如下：

PROCESS表：作为主表，每当一个批次（生产单位））启动后，会创建一个 ProcessID 以及各种信息：
<表类=“s-表”> <标题> 进程ID 房间设备ID 批次ID 程序操作员开始结束 <正文> 209486 房间1 1010 985985 食谱A 吉姆 2022.04.05 13:58:02 2022.04.05 15:58:02
设备系列表：为每个设备系列（搅拌机、烤箱等）创建一个表，其中每 5 秒收集一次其传感器值（湿度、温度、速度等）。以下是上面的 BatchID 示例，其中 ProcessID = 设备系列表上的混合 ID - dbo.Mixer：
<表类=“s-表”> <标题> 混合ID 设备ID 湿度温度速度日期时间 <正文> 209486 1010 2.5 70 250 2022.04.05 13:58:02 209486 1010 2.6 73 215 2022.04.05 13:58:07 .... ... .... .... .... ....

因此，数据库由一个主 PROCESS 表和几个设备系列表构成，这些表分别是在项目开发期间创建（dbo.mixer、dbo.oven 等）。有以下数据流：SQLServer（源）- RDS 服务器- Power BI。

实际状况及存在的问题随着

项目的发展，出现了 2 个问题：

手动工作：在源中插入新表和列（在现有表中）导致需要在 RDS 服务器和 Power 中进行手动更改商业智能。每次开发新的设备通信并且是新的设备系列时，都会创建一个新表，或者如果我们需要在现有表中引入新传感器，因为传感器是表的标题。
实时数据实际的架构使得实时仪表板的实现变得困难。

带着这两大问题我们目前分析新的系统架构应该是： SQLServer（源）- DataLake - Snowflake（数据仓库）- Power BI（或任何应用程序）。 但是，这并不能解决 1) 中定义的手动工作。对于这个问题，我们希望将源重组为仅 2 个表：进程（相等）和传感器表（新）。这个新表将是一个狭窄的大表，其中包含所有不同设备传感器（超过 60 个设备）的数十亿时间戳，结构如下：。 dbo.Sensors:

ProcessId	EquipmentID	SensorID	SensorValue	DateTime
209486	1010	1	2.5	2022.04.05 13:58:02
209486	1010	2	70	2022.04.05 13:58:02
209486	1010	3	250	2022.04.05 13:58:02

具有相应的传感器尺寸表（可以在 DataWarehouse 中创建）：

SensorID	EquipmentID	SensorName	SensorUnit
1	1010	湿度	%
2	1010	温度	℃
3	1010	速度	rpm

那么，这是重组源和数据的更好方法吗？创建这个巨大的高桌子而不是继续当前的结构？至少它会解决新表或新列输入的问题。另一方面，由于不断插入越来越多的设备和传感器，这张桌子的尺寸将变得巨大。

希望有人能为我们指出正确的方向。

原文

I'm working on a shop floor Equipment Data Collection project which aims to analyze production orders historically and with real-time data (HMI close to the operator).

Actual database status:

Data is extracted from different equipment (with different protocols) and placed in an SQL server with the following structure:

PROCESS table: As the main table, whenever a batch (production unit) is started, a ProcessID is created as well as varied information:
ProcessID Room EquipmentID BatchID Program Operator Start End
209486 Room1 1010 985985 RecipeA Jim 2022.04.05 13:58:02 2022.04.05 15:58:02
Equipment family table: For each equipment family (mixers, ovens etc.) a table is created in which its sensor values (humidity, temperature, speed etc.) are collected every 5 seconds. Here is an example the BatchID above, where ProcessID = Mix ID on the equipment family table - dbo.Mixer :
MixID EquipmentID Humidity Temperature Speed DateTime
209486 1010 2.5 70 250 2022.04.05 13:58:02
209486 1010 2.6 73 215 2022.04.05 13:58:07
.... .... .... .... .... ....

ProcessID	Room	EquipmentID	BatchID	Program	Operator	Start	End
209486	Room1	1010	985985	RecipeA	Jim	2022.04.05 13:58:02	2022.04.05 15:58:02

MixID	EquipmentID	Humidity	Temperature	Speed	DateTime
209486	1010	2.5	70	250	2022.04.05 13:58:02
209486	1010	2.6	73	215	2022.04.05 13:58:07
....	....	....	....	....	....

So, the database is structured with a main PROCESS table and several equipment family tables that are being created during the project development (dbo.mixer, dbo.oven etc).
have the following data flow: SQLServer(source) - RDS Server - Power BI.

Problems of actual status & doubts

With the development of the project, 2 problems arise:

MANUAL WORK: Insertion, in the source, of new tables and columns (in existing tables) causes the need of manual alteration in the RDS server and in Power BI. Every time a new equipment communication is developed and is a new equipment family, a new table is created or if we need to introduce a new sensor in an existing table since the sensors are headers of the table.
Real-time data The actual architecture makes it difficult to implement real time dashboarding.

With these two big problems we are currently analyzing that the new system architecture should be:
SQLServer(source) - DataLake - Snowflake(DataWarehouse) - Power BI (or any application).
However, this won't solve the manual work defined in 1). For this problem we are looking to restructure the source to just 2 tables: Process (equal) and Sensors table(new). This new table would be a narrow big big big table with billions of timestamps of all the different equipment sensors (over 60 equipment), structured as follows:
. dbo.Sensors:

ProcessId	EquipmentID	SensorID	SensorValue	DateTime
209486	1010	1	2.5	2022.04.05 13:58:02
209486	1010	2	70	2022.04.05 13:58:02
209486	1010	3	250	2022.04.05 13:58:02

with a corresponding Sensor Dimension Table (could be created at DataWarehouse) :

SensorID	EquipmentID	SensorName	SensorUnit
1	1010	Humidity	%
2	1010	Temperature	ºC
3	1010	Speed	rpm

So, would it be a better way to restructure source and create this giant tall table rather than continuing current structure? At least it will solve the problem of new table or new columns input.
On the other hand, the size of this table will be enormous given that more and more equipment and more sensors are continually being inserted.

Hoping someone might point us in the right direction.

分享到QQ

分享到微博