Oracle data versioning / partitioning strategy / best practices

Posted on 2024-08-19 07:05:34 · 361 words · 9 views · 0 comments

Not sure if the subject entirely conveys what I'm trying to achieve, but let me explain:

We are building an application that uses Oracle as its storage backend. Each year, last year's dataset will be "archived", and a new instance created and populated from scratch.
What are the options for doing this within the same schema?

  1. Keep version information on a record level (we presume this will be too slow for our use-case).
  2. Keep version information on a table level, so for each new version, we will re-create all the tables but with a new version prefix. (We like this solution, since we can do it all in code).
  3. ?

Is there not something like partitions/personalities/namespaces available that will allow us to achieve this in Oracle?

My Oracle experience is rather limited, so any assistance will be greatly appreciated!

Comments (4)

无敌元气妹 2024-08-26 07:05:34

The RDBMS conceptual model is not very good at maintaining temporal versions of data. So it is not just Oracle which is lacking in this regard.

I am unclear why you think keeping version information at the record level will be too slow. Too slow in creating a new version? Or too slow where it comes to data retrieval during regular operations?

Here is how you could do it. Given a table CUSTOMERS with a business key of CUSTOMER_REF I might normally build it like this (I am using abbreviated syntax rather than best practice for reasons of space):

create table customers 
( id number not null primary key
  , customer_ref number not null unique
  , name varchar2(30) not null )
/

The versioned equivalent would look like this:

create table customers 
( id number not null primary key
  , customer_ref number not null 
  , version_number number
  , name varchar2(30) not null
  , constraint whatever unique (customer_ref, version_number) )
/

This works by keeping VERSION_NUMBER null for the current version, and only populating it at archival time. Any lookup of current data is going to have to include "and version_number is null". This will be a bit of a pain, and you may need to include the column in any additional indexes you build.
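As a rough sketch of what that means in practice, the statements below show a current-data lookup, an archived-data lookup, and the year-end stamping step against the versioned CUSTOMERS table. The bind variables :customer_ref and :closing_version are illustrative names, not something from the question.

```sql
-- Everyday lookup: only the current (un-archived) row for a business key.
select id, name
from   customers
where  customer_ref = :customer_ref
and    version_number is null;

-- Lookup against one archived version.
select id, name
from   customers
where  customer_ref = :customer_ref
and    version_number = :closing_version;

-- Year-end archival: stamp every current row with the closing version number.
-- After this, no row has a null VERSION_NUMBER until the new data is loaded.
update customers
set    version_number = :closing_version
where  version_number is null;
```

Because Oracle treats two rows with the same CUSTOMER_REF and a null VERSION_NUMBER as duplicates under the composite unique constraint, there can be only one current row per business key.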

Obviously maintaining all versions of the records in the same table will increase the size of your tables, which might affect performance. Oracle's Partitioning option can definitely help here. It would also give you a neat way of creating next year's set of data. However, it is a chargeable extra on top of the Enterprise Edition license, so it is an expensive option.
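One way this could look, purely as a sketch: list-partition the versioned table on VERSION_NUMBER, with a dedicated partition for the null (current) rows. The partition names, the constraint name and the version value 1 are all assumptions made for illustration.

```sql
create table customers
( id number not null primary key
  , customer_ref number not null
  , version_number number
  , name varchar2(30) not null
  , constraint customers_uk unique (customer_ref, version_number) )
partition by list (version_number)
( partition p_current values (null)   -- current rows: VERSION_NUMBER is null
)
enable row movement                   -- archiving changes the partition key
/

-- At archival time: add a partition for the closing version, then stamp the rows.
-- The update moves the rows from P_CURRENT into P_V1.
alter table customers add partition p_v1 values (1);

update customers
set    version_number = 1
where  version_number is null;
```

Without ENABLE ROW MOVEMENT the update would be rejected, because it changes the partition a row belongs to. Queries that filter on VERSION_NUMBER can then be pruned to a single partition.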

The most time-consuming aspect of this will be managing foreign key relationships in the new version of the data. Presuming you choose to use synthetic primary keys, the archival process will have to generate new IDs and then painstakingly cascade them to the dependent records in the new version of each referencing table.

Thinking about this makes discrete tables for each version seem very attractive. For ease of use I would keep the current version un-prefixed, so that archiving becomes simply a matter of

create table customers_n as select * from customers; 
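A slightly fuller sketch of that archival step follows, using the un-versioned CUSTOMERS table from the first example. The suffix 2023 and the constraint names are made up for illustration. The main thing to be aware of is that CREATE TABLE ... AS SELECT copies the columns and their NOT NULL constraints, but not primary keys, unique keys, foreign keys or indexes.

```sql
-- 1. Snapshot the current data under a versioned name.
create table customers_2023 as select * from customers;

-- 2. Re-create the keys the archive copy still needs.
alter table customers_2023
  add constraint customers_2023_pk primary key (id);
alter table customers_2023
  add constraint customers_2023_uk unique (customer_ref);

-- 3. Empty the live table ready for the new year's data.
--    This fails if enabled foreign keys reference CUSTOMERS, so child
--    tables need to be archived and emptied (or their constraints
--    disabled) first.
truncate table customers;
```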

You might want to avoid downtime while creating the versioned tables. In that case you could use materialized views to capture the tables' state during the run-up to the archival switchover. When the clock strikes twelve you can switch off the refresh. (Caveat: this is thinking on the fly. I have never done anything like this, so try before you buy.)

One pertinent advantage of multiple tables (and Partitioning) is that you can move the archived records to a READ ONLY tablespace. This not only preserves them from unwanted change, it also means you can exclude them from subsequent backups.
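A minimal sketch of that move, assuming an archive table called CUSTOMERS_2023 with a primary key index CUSTOMERS_2023_PK, and a tablespace ARCHIVE_2023 that has already been created (all three names are hypothetical):

```sql
-- Relocate the archived table into its own tablespace.
alter table customers_2023 move tablespace archive_2023;

-- Moving a table changes its rowids and leaves its indexes unusable,
-- so rebuild them (here into the same tablespace).
alter index customers_2023_pk rebuild tablespace archive_2023;

-- Freeze the tablespace. Once it has been backed up in this state,
-- it does not need to be included in routine backups.
alter tablespace archive_2023 read only;

-- The change is reversible if the archive ever needs amending.
alter tablespace archive_2023 read write;
```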

edit

I notice you have commented that the archived data can occasionally be amended. In that case moving it to READ ONLY tablespaces is not a goer.

挽梦忆笙歌 2024-08-26 07:05:34

The only thing I will add to what APC said is regarding your asking for "namespaces".

A namespace in Oracle is a schema, whereby you can have the same object name(s) in each schema.

Of course this all depends on how your app must access multiple versions, but I would lean towards a different schema for each year before I would use some sort of naming convention to maintain versions of tables in the same schema. The reason is that, eventually, you will have nightmares. At least with different schemas, all DDL can be the same, all references to objects will be the same, and tools like ER modellers and query tools will work within the context of that schema. Data models change, so at some point you may need to run some compare tools, and if all your tables are named funky with some sort of version postfix, that won't work well.
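To illustrate, here is a sketch that assumes two schemas named APP_2023 and APP_2024 (hypothetical names), each holding an identically named CUSTOMERS table, and a session with privileges on both:

```sql
-- Unqualified names resolve against the session's current schema,
-- so the same application SQL works unchanged for either year.
alter session set current_schema = app_2024;
select count(*) from customers;     -- reads APP_2024.CUSTOMERS

alter session set current_schema = app_2023;
select count(*) from customers;     -- reads APP_2023.CUSTOMERS

-- Cross-year queries simply qualify the table names:
-- customers present this year that were not present last year.
select c24.customer_ref
from   app_2024.customers c24
where  not exists
       ( select null
         from   app_2023.customers c23
         where  c23.customer_ref = c24.customer_ref );
```

Setting CURRENT_SCHEMA only changes name resolution. It does not grant any privileges, so the connecting user still needs access to the objects in each schema.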

Also, a schema can be copied or moved quickly with export/import or Data Pump, using the fromuser/touser (original import) or remap_schema (Data Pump) options, so you won't need much code, except to clean last year's data out of the new version.

I find schemas are very useful as "containers" and most apps I host only have schema level privileges, so I'm guaranteed the app can be easily and quickly moved from instance to instance, or multiple copies of the app can be hosted side-by-side on the same instance.

三人与歌 2024-08-26 07:05:34

Might the schema change between years? For example, in 2010 you have fifteen columns but in 2011 you add a sixteenth.
If so, will the same application work on both 2010 and 2011 data?

If the schema is static, I'd go for a table with a 'YEAR' column and use VPD/RLS/FGAC (Virtual Private Database, also known as row-level security or fine-grained access control) to apply a YEAR = '2010' predicate.
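A sketch of how such a policy might be wired up is shown below. It assumes a schema called APP, a CUSTOMERS table with a YEAR column, and an application context APP_CTX with an attribute DATA_YEAR. All of these names are invented for the example; only DBMS_RLS.ADD_POLICY, DBMS_SESSION.SET_CONTEXT and SYS_CONTEXT are Oracle-supplied.

```sql
-- Application context that holds the year the session is working on.
-- Only the named package is allowed to set its attributes.
create context app_ctx using app.year_ctx_pkg;

create or replace package app.year_ctx_pkg as
  procedure set_year (p_year in varchar2);
end year_ctx_pkg;
/
create or replace package body app.year_ctx_pkg as
  procedure set_year (p_year in varchar2) is
  begin
    dbms_session.set_context('APP_CTX', 'DATA_YEAR', p_year);
  end set_year;
end year_ctx_pkg;
/

-- Policy function: returns the predicate Oracle appends to each statement.
create or replace function app.year_predicate
  ( p_schema in varchar2, p_object in varchar2 )
  return varchar2
is
begin
  return 'year = sys_context(''APP_CTX'', ''DATA_YEAR'')';
end year_predicate;
/

-- Attach the policy to the table.
begin
  dbms_rls.add_policy
    ( object_schema   => 'APP'
    , object_name     => 'CUSTOMERS'
    , policy_name     => 'CUSTOMERS_YEAR_POLICY'
    , function_schema => 'APP'
    , policy_function => 'YEAR_PREDICATE'
    , statement_types => 'SELECT, INSERT, UPDATE, DELETE'
    , update_check    => true );
end;
/
```

A session would then call app.year_ctx_pkg.set_year('2010') once after connecting, and every statement against CUSTOMERS would see only the 2010 rows. If the year is never set, the predicate compares YEAR with null and no rows are visible, which is a reasonably safe default.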

I'd only worry about partitioning if performance was a problem.

好听的两个字的网名 2024-08-26 07:05:34

1) Interval partition each table by year on some date field in the row.

2) If a table has no suitable field, add one at the end of the table and populate it with a sequence and trigger.

3) Then partition by interval year on this column.
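As a sketch of the first step, assuming the Partitioning option is licensed and using a hypothetical CREATED_ON date column and an arbitrary starting boundary:

```sql
create table customers
( id number not null primary key
  , customer_ref number not null
  , name varchar2(30) not null
  , created_on date default sysdate not null )
partition by range (created_on)
interval (numtoyminterval(1, 'YEAR'))
( partition p_initial values less than (date '2024-01-01') )
/

-- Oracle creates a new yearly partition automatically the first time a row
-- arrives with a CREATED_ON beyond the existing partitions.
-- One year's data can be addressed without knowing the generated partition name:
select count(*)
from   customers partition for (date '2025-06-30');
```

Each yearly partition can then be managed on its own, for example moved to a different tablespace once the year is closed.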
