Oracle“伪事实”看法

发布于 2024-11-07 17:00:09 字数 979 浏览 1 评论 0 原文

假设:

  1. 我有许多由事实和外键(“维度”和“键值”类型)组成的表。例如,ENCOUNTER:

ID - 主键

维度

  • LOCATION_ID
  • PATIENT_ID

键值

  • TYPE_ID
  • STATUS_ID
  • PATIENT_CLASS_ID
  • DISPOSITION_ID
  • ...

事实

  • ADMISSION_DATE
  • DISCHARGE_DATE
  • 。 ..

    1. 我没有创建数据仓库的选项
    2. 我想简化报告的数据结构

我的方法是创建许多伪维度视图(“D_LOCATION”基于 DEPARTMENT 和LOCATION 表)和伪事实视图(基于 ENCOUNTER 表的“F_ENCOUNTER”)。在伪事实视图中,我会将键值表(例如 STATUS、PATIENT_CLASS)连接到事实表以包含名称字段(例如 STATUS.NAME、PATIENT_CLASS.NAME)。

问题:

  • 如果查询从 F_ENCOUNTER 选择所有字段的子集(即不是所有 key-value.name 字段),Oracle 10g 优化器是否足够智能以排除某些键值表连接(即未包含在查询中)?
  • 我可以做些什么来优化这个架构(除了索引之外)
  • 还有其他方法吗?

** 编辑 ** 目标(按重要性排序):

  • 降低查询复杂度;增加查询一致性;减少报告开发时间
  • 优化查询处理
  • 最大限度地减少管理员负担
  • 减少存储

Assumptions:

  1. I have a number of tables comprised of facts and foreign keys ('dimensional' and 'key-value' type). For example, ENCOUNTER:

ID - primary key

dimensions

  • LOCATION_ID
  • PATIENT_ID

key-value

  • TYPE_ID
  • STATUS_ID
  • PATIENT_CLASS_ID
  • DISPOSITION_ID
  • ...

facts

  • ADMISSION_DATE
  • DISCHARGE_DATE
  • ...

    1. I don't have the option to create a data warehouse
    2. I would like to simplify the data structure for reporting

My approach is to create a number of pseudo-dimensional views ('D_LOCATION' based on the DEPARTMENT and LOCATION tables) and pseudo-fact views ('F_ENCOUNTER' based on ENCOUNTER table). In the pseudo-fact view, I would JOIN the key-value tables (e.g. STATUS, PATIENT_CLASS) to the fact table to include the name fields (e.g. STATUS.NAME, PATIENT_CLASS.NAME).

Questions:

  • If a query selects a subset of all of the fields from F_ENCOUNTER (i.e. not all of the key-value.name fields), is the Oracle 10g optimizer smart enough to exclude some of the key-value table joins (i.e. the ones that aren't included in the query)?
  • Is there anything that I can do to optimize this architecture (other than indices)
  • Is there another approach?

** edit **
Goals (in order of importance):

  • reduce query complexity; increase query consistency; decrease report-development time
  • optimize query-processing
  • minimize administrator burden
  • decrease storage

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

孤檠 2024-11-14 17:00:09

一种优化建议是不要使用键值对表。维度表的概念是,每条记录都应包含有关该概念的所有信息,而无需连接到规范化表 - 即,将星型模式转变为雪花型模式。

虽然值可能会在维度表记录中重复,但它的优点是报告查询中的联接较少。以这种方式对表进行非规范化可能看起来违反直觉,但在性能至关重要的情况下,它通常是最佳解决方案。

One optimization suggestion is not to use key-value pair tables. The concept of a Dimension table is that each record should contain all information about that concept without needing to join to normalized tables - i.e. turning a star schema into a snowflake schema.

While values might be repeated across dimension table records, it has the advantage of fewer joins in your reporting queries. Denormalizing tables in this way might seem counter intuitive but where performance is paramount it is usually the best solution.

倦话 2024-11-14 17:00:09
  • 我不相信 Oracle 会排除在视图中完成的任何联接,因为联接会影响返回的行数。 (当内部联接无法匹配任何行时,会使整个结果集为空。)
  • 您的优化目标是什么?查询速度?查询简单吗?存储效率?如果您可以牺牲存储效率来获得更好的查询性能,请将键值引用替换为值本身(TYPE_NAME 而不是 TYPE_IDPATIENT_CLASS_NAME而不是 PATIENT_CLASS_ID 等)。
  • [编辑:]如果原始架构无法修改,请考虑使用物化视图。它本质上会预先计算连接并存储结果集,从而为您提供快速的查询时间,但代价是额外的存储空间和可能不新鲜的数据。您可以通过指定适当的刷新策略来控制后者。请参阅 http://en.wikipedia.org/wiki/Materialized_viewhttp://download.oracle.com/docs/cd/B10500_01/server.920/a96520/mv.htm 了解更多详细信息。
  • I don't believe Oracle would exclude any joins done in the view, because the joins can impact the number of rows returned. (As when an inner join fails to match any rows, making the whole result set empty.)
  • What are the goals of your optimization? Query speed? query simplicity? storage efficiency? If you can sacrifice storage efficiency for better query performance, then replace the key-value references with the values themselves (TYPE_NAME instead of TYPE_ID, PATIENT_CLASS_NAME instead of PATIENT_CLASS_ID, etc.).
  • [Edit:] If the original architecture cannot be modified, consider using a materialized view. It would essentially pre-compute the joins and store the result set, giving you speedy query time at the cost of extra storage space and possibly-not-fresh data. You can control the latter by specifying an appropriate refresh policy. See http://en.wikipedia.org/wiki/Materialized_view and http://download.oracle.com/docs/cd/B10500_01/server.920/a96520/mv.htm for further details.
~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文