Database design for historical aggregation

Posted 2024-12-22 04:09:02

I am trying to decide on the best data-warehouse-style design. It will be used to find historical price averages of different items over different time periods using a Google-style search. For example, what was the average price of Stock A this month, and 3 months, 6 months, and 1 year ago? The issue is that I do not have an item name I can use; I only have description fields about the item.

This means that I can't aggregate items into views, since the same item may be listed 20 times, each time with a different description. So I have to do a full-text search on the description field on the fly, grab the prices where the insert date is less than 3 months ago, and then find the average of those.

So is my best bet to have everything in one table like:

            MAIN
----------------------------
ID | Description | Price | Date

or many tables:

    DESCRIPTION
------------------
ID | Description | 


   PRICE
---------
ID | PRICE

and then just join them to get the data I want? The database will contain a few million rows. If I had a way to get the real name of the item, I could see pre-aggregating the data, but that is not an option for me. I appreciate any advice!
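
For concreteness, the on-the-fly lookup described above might look roughly like the following on the single-table layout. This is only a sketch using Python's built-in sqlite3 module: the table and column names (`main`, `insert_date`), the sample rows, and the `average_price` helper are all made up for illustration, and a plain `LIKE` match stands in for whatever full-text search feature the real database provides. It also assumes "3 months ago" means "inserted within the last N days".

```python
import sqlite3
from datetime import date, timedelta

def average_price(conn, search_term, days_back, today):
    """Average price of rows whose description contains search_term and
    whose insert date falls within the last `days_back` days."""
    cutoff = (today - timedelta(days=days_back)).isoformat()
    row = conn.execute(
        "SELECT AVG(price) FROM main "
        "WHERE description LIKE ? AND insert_date >= ?",
        (f"%{search_term}%", cutoff),
    ).fetchone()
    return row[0]  # None when no row matches

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE main (id INTEGER PRIMARY KEY, description TEXT, "
    "price REAL, insert_date TEXT)"  # ISO dates sort correctly as text
)
conn.executemany(
    "INSERT INTO main (description, price, insert_date) VALUES (?, ?, ?)",
    [
        ("Stock A common shares", 10.0, "2024-12-10"),
        ("shares of Stock A, class 1", 20.0, "2024-10-15"),
        ("Stock A (old listing)", 60.0, "2024-01-05"),
        ("Stock B common shares", 99.0, "2024-12-01"),
    ],
)

today = date(2024, 12, 22)  # fixed so the example is reproducible
windows = [("1m", 30), ("3m", 90), ("6m", 180), ("1y", 365)]
averages = {
    label: average_price(conn, "Stock A", days, today)
    for label, days in windows
}
```

Each time window is a separate query here; with a few million rows, an index on the date column and a real full-text index on the description would matter far more than whether the data lives in one table or two.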

Comments (2)

那小子欠揍 2024-12-29 04:09:02

I'd say option 2: keep the top-level details in the "description" table and the historical data in the "price" table (with a Date field added to capture the time dimension).
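
A minimal sketch of what option 2 with the added date field could look like, again using Python's sqlite3 module. The column names (`description_id`, `price_date`), the index, and the `average_since` helper are assumptions for illustration, not something specified in the answer, and `LIKE` is only a placeholder for a proper full-text search.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE description (
    id          INTEGER PRIMARY KEY,
    description TEXT NOT NULL
);
CREATE TABLE price (
    id             INTEGER PRIMARY KEY,
    description_id INTEGER NOT NULL REFERENCES description(id),
    price          REAL NOT NULL,
    price_date     TEXT NOT NULL  -- ISO date, the added temporal field
);
CREATE INDEX idx_price_desc_date ON price (description_id, price_date);
""")

conn.executemany(
    "INSERT INTO description (id, description) VALUES (?, ?)",
    [(1, "Stock A common shares"), (2, "Stock B common shares")],
)
conn.executemany(
    "INSERT INTO price (description_id, price, price_date) VALUES (?, ?, ?)",
    [
        (1, 10.0, "2024-12-10"),
        (1, 14.0, "2024-11-01"),
        (1, 30.0, "2024-03-01"),
        (2, 99.0, "2024-12-01"),
    ],
)

def average_since(conn, search_term, since):
    """Average price on or after `since` for descriptions matching the term."""
    row = conn.execute(
        "SELECT AVG(p.price) "
        "FROM description AS d "
        "JOIN price AS p ON p.description_id = d.id "
        "WHERE d.description LIKE ? AND p.price_date >= ?",
        (f"%{search_term}%", since),
    ).fetchone()
    return row[0]

last_3_months = average_since(conn, "Stock A", "2024-09-22")
this_year = average_since(conn, "Stock A", "2024-01-01")
```

The text search touches only the narrow description table, and the date filter and averaging happen on the price table, so one description can carry many dated prices without repeating the text.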

面犯桃花 2024-12-29 04:09:02

As Joel suggested, option 2 is likely to give you more flexibility. I would suggest including additional dates in each table to accommodate slowly changing dimensions, since descriptions and other attributes of a given item may change over time.

In the case of a brick-and-mortar retailer, you would quite likely include the Store ID as well, because items are often priced differently in different locations due to competition and the demographic make-up of the customers near a given location.

    DESCRIPTION
---------------------------------------------------
ID | Description | Effective Date | Expiration Date


   PRICE
-----------------------------------------------------------
ID | Location ID | Price | Effective Date | Expiration Date
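
To illustrate how the effective/expiration dates would be used, here is a rough sketch in Python with sqlite3. It assumes a few things the layout above does not spell out: an `item_id` column linking a price to its item, half-open validity ranges (the effective date is inclusive, the expiration date exclusive), and NULL as the expiration date of the current row. The `price_on` helper and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE price (
    id              INTEGER PRIMARY KEY,
    item_id         INTEGER NOT NULL,  -- assumed link to the item
    location_id     INTEGER NOT NULL,
    price           REAL NOT NULL,
    effective_date  TEXT NOT NULL,     -- inclusive, ISO date
    expiration_date TEXT               -- exclusive; NULL = still current
)
""")
conn.executemany(
    "INSERT INTO price (item_id, location_id, price, effective_date, "
    "expiration_date) VALUES (?, ?, ?, ?, ?)",
    [
        (1, 10, 5.0, "2024-01-01", "2024-07-01"),  # old price at store 10
        (1, 10, 6.0, "2024-07-01", None),          # current price at store 10
        (1, 20, 5.5, "2024-01-01", None),          # store 20 prices differently
    ],
)

def price_on(conn, item_id, location_id, as_of):
    """Price in effect for an item at a location on the given ISO date."""
    row = conn.execute(
        "SELECT price FROM price "
        "WHERE item_id = ? AND location_id = ? "
        "AND effective_date <= ? "
        "AND (expiration_date IS NULL OR expiration_date > ?)",
        (item_id, location_id, as_of, as_of),
    ).fetchone()
    return row[0] if row else None

spring_price = price_on(conn, 1, 10, "2024-03-15")
switch_day_price = price_on(conn, 1, 10, "2024-07-01")
other_store_price = price_on(conn, 1, 20, "2024-12-01")
before_history = price_on(conn, 1, 10, "2023-12-31")
```

Because a price change closes the old row and opens a new one instead of overwriting it, historical averages can still be computed over whatever period is requested.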