动态数据库结构

发布于 2024-10-06 15:34:02 字数 473 浏览 2 评论 0原文

我想要一些关于特定问题的数据库/编程建议。

我有 5 个不同的人(生活在世界不同地区)为我提供数据。这些数据是按照标准结构布局以多种方式提供给我的。然而,它并不总是协调的,数据可能有标准之外的额外内容,所以我希望结构尽可能动态,以适应人们想要使用的内容。

然后,这 5 个数据源被放置在我托管的中央数据库中。所以基本上我有 5 个数据源,它们按照标准结构进行格式化,并且它们被上传到我的本地数据库。

我希望为提供数据的人尽可能自动上传这些数据,因此我希望他们上传自动插入到我的本地数据库中的新数据集。

我的问题是:

  1. 我应该如何保持结构动态,而不必重新访问我的标准布局来适应新的数据字段或不同的结构?
  2. 如何让他们以增量方式上传数据?例如,他们可能上传其数据的 XML 版本,我的上传代码应该找出已经存在的内容。
  3. 我的最后也是最重要的问题。除了上传基础设施之外,是否有更好的方法来解决这个问题?

I would like some database/programming suggestion on a specific issue.

I have 5 different people (that live in different parts of the world) that provide me with data. This data is given to me in many variety of ways, following a standard structure layout. However it's not always harmonized, the data might have extra things that are not in the standard, so I'd like the structure to be as dynamic as possible to accommodate what the person wants to use.

These 5 data sources are then placed inside a central database I host. So basically I have 5 data sources that are formatted following a standard structure, and they are uploaded to my local database.

I want to automate the upload of this data as much as possible for the person providing the data, so I want them to upload new sets of data that are automatically inserted in my local db.

My questions are:

  1. How should I keep the structure dynamic without having to revisit my standard layout to accommodate new fields of data, or different structure?
  2. How do I make them upload data in a way that is incremental? For example they might be uploading an XML version of their data, my upload code should figure out what already exists.
  3. My final and most important question. Are there better ways of going about this instead of having an upload infrastructure?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(3

请爱~陌生人 2024-10-13 15:34:02

我应该如何保持结构动态,而不必重新访问我的标准布局来适应新的数据字段或不同的结构?

基本上,您可以旋转列和行的正常数据库概念。

您有一个数据名称表,其中包含数据字段的唯一名称,以及一个指示符,用于告诉导入进程存储的数据类型,例如日期、时间戳或整数。

您有一个数据表,其中包含数据名称 id、序列号、数据字段和用于标识信息的外键。

序列号用于区分同一数据名称的不同值。

数据字段包含所有可能的数据类型。在大多数数据库中,这将是 VARCHAR(MAX)。由上传过程将日期和数字转换为字符串。

您的数据表将具有一个指向其余信息的外键,用于标识数据字段属于谁。

如何让他们以增量方式上传数据?例如,他们可能上传其数据的 XML 版本,我的上传代码应该找出已经存在的内容。

简短的回答是你不能。

您的上传过程必须识别重复数据,而不是将其存储在数据库中。

我的最后也是最重要的问题。除了上传基础设施之外,是否有更好的方法来解决这个问题?

如果不了解更多有关您接收的数据类型的信息,这是一个很难回答的问题,但是有一些软件可以让您通过定义输入数据结构并将该结构映射到数据库表来加载数据库,而无需进行大量编程。

How should I keep the structure dynamic without having to revisit my standard layout to accommodate new fields of data, or different structure?

Basically, you pivot the normal database idea of columns and rows.

You have a data name table, which consists of the unique names of the fields of data, and an indicator to tell the import process what type of data is stored, like a date, timestamp, or integer.

You have a data table, which contains the data name id, a sequence number, the data field, and a foreign key to identifying information.

The sequence number is used to differentiate between different values of the same data name.

The data field holds every type of data possible. This would be a VARCHAR(MAX) in most databases. It's up to the upload process to convert dates and numbers to strings.

Your data table will have a foreign key to the rest of the information that identifies who the data field belongs to.

How do I make them upload data in a way that is incremental? For example they might be uploading an XML version of their data, my upload code should figure out what already exists.

The short answer is that you can't.

Your upload process has to identify duplicate data and not store it on the database.

My final and most important question. Are there better ways of going about this instead of having an upload infrastructure?

This is a hard question to answer without knowing more about the type of data you're receiving, but there is software that allows you to load databases without a lot of programming, by defining the input data structure and mapping that structure to your database tables.

寄居者 2024-10-13 15:34:02

这是一个非常普遍的问题,但我想我有一个普遍的答案。我认为解决您的问题的是构建一个新的关系演算,其中附加到主记录的属性不是预先确定的。这是涉及电话簿应用程序的示例。

使用非关系表的常见方法:

  • 表 PERSON 有列 Name、
    家庭电话、办公电话。

一切都很好,但是如果偶尔有人带着移动电话、多于一部移动电话、传真电话等出现,您会怎么做。

相反,您要做的是:

  • 表 Person 有列 Person_ID,
    姓名。
  • Table Phones 具有 Person_ID 列,
    电话类型、电话号码。

Person 和 Phones 之间存在一对多关系,并且可以有从零到无数的任意数量。这些表通过 Person_ID 连接。您必须具有枚举 Phone_Type 列的业务和表示逻辑(或者只是让它成为自由格式,这不是那么有用但更容易)。

您可以对任何属性执行此操作,这就是关系数据库的全部内容。我希望这有帮助。

This is a very general question, but I think I have a general answer. What I think solves your problem is to construct a new relational calculus where the properties attached to the master record are not pre-determined. Here is an example involving a phone book application.

Common method using a non-relational table:

  • Table PERSON has columns Name,
    HomePhone, OfficePhone.

All well and good, but what do you do if the occasional person shows up with a mobile phone, more than one mobile phone, a fax phone, etc.

Instead what you do is:

  • Table Person has columns Person_ID,
    Name.
  • Table Phones has columns Person_ID,
    Phone_Type, PhoneNumber.

There is a one-to-many relationship between Person and Phones, and there can be any number of them from zero to a zillion. The tables are JOINed by Person_ID. You have to have business and presentation logic that enumerates the Phone_Type column (or just let it be free-form, which is not as useful but easier).

You can do that for any property, and is what relational data bases are all about. I hope this helps.

脸赞 2024-10-13 15:34:02

正如其他人所说,EAV 表可以处理动态结构。 (注意大型表的性能问题)

但是让客户端指定数据库字段符合您的利益吗?您无法编写业务逻辑来作用于这些新字段,因为它们还不存在,它们可以是任何东西。

你能强迫客户遵守你的模型吗?这使您可以提前了解字段并让业务逻辑作用于这些字段。它还允许您编写有意义的报告,而不仅仅是旋转数据转储。

As others have said, EAV tables can handle dynamic structure. (be aware of performance issues on large tables)

But is it in your interest to have your database fields dictated by the client? You can't write business logic to act upon those new fields because they don't exist yet, they could be anything.

Can you force the client to conform to your model? This allows you to know the fields ahead of time and have business logic act upon the fields. It allows you to write meaningful reports as well, rather than just pivoted data dumps.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文