How do I bring together data from multiple databases?

Posted on 2024-11-06 04:11:48

BACKGROUND:

I should preface this by saying I'm not trying to get someone to do my work for me. I feel like I'm at a crossroads where there are multiple ways to reach my goal, but I'm not sure which ones are 'standard', or whether my relatively limited knowledge is holding me back.

I've got a system that's been evolving for six months now, and since Jan 11 the DB schema has been pretty stable. (I was never sure whether I was making a major mistake by creating a database for each month to correspond with the monthly accounting cycles, but I just didn't have the know-how to do otherwise.)

NOW:

My boss is asking me to create year-to-date reports consisting of records from all the monthly databases.

WHAT I STARTED DOING:

I put together a metadata schema and populated it with enough information that I could write an application to perform the ETL operations.

Here's what it looks like:

USE [DAMain1]
GO
CREATE TABLE AccountingPeriod (
    Id INT PRIMARY KEY NOT NULL,
    Name VARCHAR(255) NOT NULL UNIQUE,
    DateStart DATE NOT NULL,
    DateStop DATE NOT NULL
)
GO
INSERT INTO AccountingPeriod VALUES 
     (1, 'Jan11', '20110101', '20110131')
    ,(2, 'Feb11', '20110201', '20110228')
    ,(3, 'Mar11', '20110301', '20110331')
    ,(4, 'Apr11', '20110401', '20110430')
    ,(5, 'May11', '20110501', '20110531')
GO

CREATE TABLE [DBServer] (
    Id INT PRIMARY KEY NOT NULL,
    Name VARCHAR(255) NOT NULL UNIQUE
)
GO
INSERT INTO DBServer VALUES
    (1, 'Aaron.directagents.local')
GO
CREATE TABLE [DBInstance] (
     Id INT PRIMARY KEY NOT NULL
    ,DBServerId int NOT NULL REFERENCES DBServer(Id)
    ,SchemaName VARCHAR(255) NOT NULL
    ,CatalogName VARCHAR(255) NOT NULL
    ,ConnectionString VARCHAR(2000) NOT NULL
)
GO
INSERT INTO DBInstance VALUES
     (1, 1, 'dbo', 'DADatabaseR2', 'Data Source=aaron\sqlexpress;Initial Catalog=DADatabaseR2;Integrated Security=True')
    ,(2, 1, 'dbo', 'DADatabaseR3', 'Data Source=aaron\sqlexpress;Initial Catalog=DADatabaseR3;Integrated Security=True')
    ,(3, 1, 'dbo', 'DADatabaseMarch11', 'Data Source=aaron\sqlexpress;Initial Catalog=DADatabaseMarch11;Integrated Security=True')
    ,(4, 1, 'dbo', 'DADatabaseApr11', 'Data Source=aaron\sqlexpress;Initial Catalog=DADatabaseApr11;Integrated Security=True')
GO
CREATE TABLE DADB (
    Id int PRIMARY KEY NOT NULL,
    Name VARCHAR(255) NOT NULL UNIQUE,
    AccountingPeriodId int NOT NULL REFERENCES AccountingPeriod(Id),
    DBInstanceId INT NOT NULL REFERENCES DBInstance(Id)
)
GO
INSERT INTO DADB VALUES
     (1, 'Direct Agents Database for January 2011', 1, 1)
    ,(2, 'Direct Agents Database for February 2011', 2, 2)
    ,(3, 'Direct Agents Database for March 2011', 3, 3)
    ,(4, 'Direct Agents Database for April 2011', 4, 4)
GO
CREATE VIEW DADBs AS
SELECT
     DA.Name [Database]
    ,AP.Name [Accounting Period]
    ,AP.DateStart [Start]
    ,AP.DateStop [Stop]
    ,DS.Name [Server]
    ,DI.SchemaName
    ,DI.CatalogName
    ,DI.ConnectionString [Connection]
FROM
    DADB DA
    INNER JOIN AccountingPeriod AP ON DA.AccountingPeriodId=AP.Id
    INNER JOIN DBInstance DI ON DA.DBInstanceId=DI.Id
    INNER JOIN DBServer DS ON DI.DBServerId=DS.Id
GO
SELECT * FROM DADBs
GO
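
To illustrate how rows from the DADBs view could drive a year-to-date query, here is a minimal Python sketch that builds one T-SQL UNION ALL statement from the catalog and schema names. It assumes all monthly databases sit on the same SQL Server instance (as the connection strings above suggest), so three-part names work, and that each one has a table called LineItem. That table name, and the helper names MonthlyDb, quote_name and build_ytd_query, are hypothetical and not part of the system described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlyDb:
    """One row of the DADBs view, reduced to the columns used here."""
    period: str   # AccountingPeriod.Name, e.g. 'Jan11'
    schema: str   # DBInstance.SchemaName, e.g. 'dbo'
    catalog: str  # DBInstance.CatalogName, e.g. 'DADatabaseR2'


def quote_name(identifier: str) -> str:
    """Bracket-quote an identifier, doubling any closing bracket."""
    return "[" + identifier.replace("]", "]]") + "]"


def build_ytd_query(dbs, table="LineItem"):
    """Return a T-SQL statement that unions `table` across all databases.

    `table` is a hypothetical name; substitute the real line-item table.
    """
    if not dbs:
        raise ValueError("at least one monthly database is required")
    selects = []
    for db in dbs:
        # Escape single quotes so the period name is a safe string literal.
        period = db.period.replace("'", "''")
        source = ".".join(quote_name(p) for p in (db.catalog, db.schema, table))
        selects.append(
            f"SELECT '{period}' AS AccountingPeriod, li.* FROM {source} AS li"
        )
    return "\nUNION ALL\n".join(selects)


if __name__ == "__main__":
    monthly = [
        MonthlyDb("Jan11", "dbo", "DADatabaseR2"),
        MonthlyDb("Feb11", "dbo", "DADatabaseR3"),
    ]
    print(build_ytd_query(monthly))
```

The generated statement only works if the line-item table has the same columns in every monthly database; if the schemas have drifted, the column list would need to be spelled out instead of using li.*.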

PROBLEM:

I don't know if this is a reasonable/normal way of going about it. I have enough time to ramp up on one thing, but I can't figure out on my own which path to go down.

QUESTION: Given that I need to pull line-item data and aggregate it across multiple databases, as explained above, are there alternatives to defining metadata tables that drive a custom ETL solution? (For my purposes a C# app and an SSIS project are equivalent, but I'm interested to know whether Analysis Services or Reporting Services could be used here.)


Comments (1)

尬尬 2024-11-13 04:11:48

Bad database designs often rear their heads in reporting. As you have discovered, keeping each month's data in a separate database has created a reporting nightmare. Imagine what would happen if the accounting cycle dates changed. The better solution would be to consolidate the data into a single database, where you determine the accounting cycle from attributes of the entries (date entered, date posted, etc.).

In the interim, given what you have, I'd say the best solution is to create a consolidated database and fill it from the other databases using SSIS, until you can update the middle tiers or UIs to use the consolidated design.
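
As a rough illustration of the consolidated design, the sketch below uses Python's built-in sqlite3 module as a stand-in for SQL Server. It keeps every entry in one table and derives the accounting period from the posting date by joining on the period's date range. The LineItem table, its PostedDate and Amount columns, and the sample rows are assumptions made up for the example; only AccountingPeriod mirrors the question.

```python
import sqlite3


def demo_connection():
    """Build an in-memory database with one consolidated LineItem table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE AccountingPeriod (
            Id INTEGER PRIMARY KEY,
            Name TEXT NOT NULL UNIQUE,
            DateStart TEXT NOT NULL,   -- ISO dates compare correctly as text
            DateStop TEXT NOT NULL
        );
        -- Hypothetical consolidated table: Amount is stored in whole units.
        CREATE TABLE LineItem (
            Id INTEGER PRIMARY KEY,
            PostedDate TEXT NOT NULL,
            Amount INTEGER NOT NULL
        );
        INSERT INTO AccountingPeriod VALUES
            (1, 'Jan11', '2011-01-01', '2011-01-31'),
            (2, 'Feb11', '2011-02-01', '2011-02-28');
        INSERT INTO LineItem (PostedDate, Amount) VALUES
            ('2011-01-15', 100),
            ('2011-01-31', 50),
            ('2011-02-01', 25);
    """)
    return conn


def ytd_totals(conn):
    """Total the line items per accounting period, in period order."""
    return conn.execute("""
        SELECT ap.Name, SUM(li.Amount)
        FROM LineItem AS li
        JOIN AccountingPeriod AS ap
          ON li.PostedDate BETWEEN ap.DateStart AND ap.DateStop
        GROUP BY ap.Id, ap.Name
        ORDER BY ap.Id
    """).fetchall()


if __name__ == "__main__":
    for name, total in ytd_totals(demo_connection()):
        print(name, total)
```

With this layout, changing an accounting cycle means updating a row in AccountingPeriod rather than moving data between databases.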
