仅使用一组示例数据创建生产的测试数据库副本

发布于 2024-11-02 07:48:53 字数 372 浏览 5 评论 0原文

我们有 2 个 Oracle 数据库。一是我们的生产客户数据库。另一个是测试数据库。测试数据库是生产数据库的副本,没有任何数据......只是空表。我试图想出一种高效且有效的方法来将样本数据集从生产数据库复制到测试。

我需要从客户表中的 PK 示例列表开始,并编写一个流程以使用该示例填充测试数据库客户表。然后,我需要根据示例使用相关数据填充子表。

我们曾经有一个 Java 进程,它使用准备好的语句从生产中选择数据...根据结果集构建一个文件...然后使用更新语句将其放入测试数据库中。这是极其低效的。

我不是 DBA,所以我对这个领域了解不多。我尝试谷歌搜索一些有关数据库链接的信息,但找不到很好的例子。

有谁知道如何做到这一点......或者至少给我一个起点?

We have 2 Oracle databases. One is our Production customer database. The other is a test database. The Test database is a copy of production without any data...just empty tables. I am trying to come up with an efficient and effective way to copy a sample set of data from the Production db to the Test.

I need to start with a sample list of PKs from our Customer table and write a process to populate the test database Customer table with that sample. Then, I need to populate child tables with relevant data based on the sample.

We used to have a Java process that would use a prepared statement to select data from Production...build a file based on the result set...then use an update statement to get it into the Test db. It was horribly inefficient.

I am not a DBA so I don't know much of that realm. I tried to Google some info about Database Links but couldn't find very good examples.

Does anyone know of a way to do this...or, at least give me a starting point?

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(4

梦幻之岛 2024-11-09 07:48:53

如果你有足够的空间,我建议复制整个生产数据来测试。这将更容易管理,它也可能是测试备份(从备份恢复到新实例)的好机会。

从开发人员的角度来看,如果没有一组具有代表性的数据,您将无法可靠地测试应用程序的性能。该数据集应具有与生产数据相同的属性(数据量、物理分布......)。实现这一目标的最简单方法是在测试中使用与生产中相同的数据。

如果您可以承受停机时间,则可以停止生产数据库,将文件复制到测试服务器并安装两个数据库。如果您无法承受停机时间,那么学习一些 DBA 技能可能是个好主意(并最终了解热备份,然后恢复到新实例)。


更新:如果物理复制数据库不可行,您应该考虑使用 expdpimpdp(或旧的 exp/<代码>imp)。您可以复制所有架构或过滤数据导出。在这种情况下,您可以手动选择适当的 WHERE 子句。批量导出和导入将比逐行复制数据快几个数量级。

If you have enough space, I would suggest copying the entire production data to test. It will be a lot easier to administer, it could also be a good oportunity to test your backup (restore from backup to a new instance).

From a developer point of view, you won't be able to test the performance of your application reliably without a representative set of data. This data set should have the same properties as the production data (data volume, physical distribution...). The easiest way to achieve this is to have the same data in test as in production.

If you can afford downtime you could stop the production db, copy the file to the test server and mount both databases. If you can't afford downtime it might be a good idea to pickup some DBA skills (and eventually learn about hot backup then restore to a new instance).


Update: if physically copying your database is not feasible, you should look into bulk copying data with expdp and impdp (or the old exp/imp). You can either copy all schemas or filter the data on export. You would choose appropriate WHERE clause by hand in this case. Exporting and importing in bulk will be orders of magnitude faster than copying data row by row.

黎歌 2024-11-09 07:48:53

我只是想指出行业场景和产品..不是解决方案,而是供参考的输入,正如 @aintnoprophet 提到的那样

“我不是 DBA,所以我不太了解
那个境界。我尝试谷歌一些
有关数据库链接的信息但不能
找到非常好的例子。”

这个用例与测试数据管理系统相关。这个概念涉及这样一个事实:我们需要有一种方法来管理测试数据,该方法涉及到某些特定测试场景的特定测试用例。它将从大量数据中手动创建数据子集非常困难,会显着增加人为错误的机会。因此,多次测试迭代的重复数据子集很快就会变得难以管理,

特别是在企业示例中 -

<一href="http://www.compuware.com/resources/mainframe/Fileaid_clientserver.pdf" rel="nofollow">http://www.compuware.com/resources/mainframe/Fileaid_clientserver.pdf

http://www.erphome.net/wdb/upload/forum1_f_8007_1227768589.pdf

这些是针对庞大的企业数据的,

有一个开源的tdm工具http://sourceforge.net/projects/opentdm/ 但不在java(我不确定它是否好)

如果您正在寻找产品(仅供参考)其中一些是

i just want to point out the industry scenario and products .. not a solution but an input for reference as @aintnoprophet mentioned that

"I am not a DBA so I don't know much of
that realm. I tried to Google some
info about Database Links but couldn't
find very good examples."

This use case is related to test data management systems. This concept deals with the fact that we need to have a way to manage test data, that goes into those specific test cases for some specific test scenarios.It will be very difficult to manually create the subset of data from large amounts of data, you significantly increase the chance of human error.So repeated data sub-setting for multiple test iterations quickly becomes unmanageable.

there are many use cases, especially in enterprise examples -

http://www.compuware.com/resources/mainframe/Fileaid_clientserver.pdf

http://www.erphome.net/wdb/upload/forum1_f_8007_1227768589.pdf

these are for huge enterprise data.

there is an open source tdm tool http://sourceforge.net/projects/opentdm/ but not in java (i m not sure whether its good or not)

if you are looking for products (jut for reference) some among are

淡淡離愁欲言轉身 2024-11-09 07:48:53

作为生产的完整副本的测试数据库比清除错误的子集要好得多。大多数人进行子集化是因为他们没有时间和资源来制作完整副本,但所有这一切都随着虚拟数据(即副本数据管理)而改变。在此类系统中,您可以获取一份生产副本并在几分钟内制作精简克隆副本,几乎无需额外存储。该系统通常还会压缩数据,因此摄取副本的大小是原始副本的 1/3,并且具有在几分钟内快速刷新的优点,这在破坏性测试中非常有用。它们还倾向于与屏蔽集成,因为我们可能不希望敏感数据从生产环境进入开发和测试。该地区有许多供应商。首先查看 Netapp SMO、Actifio、Oracle Snap Clone 和 Delphix。其中大多数需要一些专门的硬件,这可能是一个障碍。 Delphix 是完全软件,甚至有一个名为 Delphix Express 的免费版本可供下载。 Rubrick 和 Cohesity 正在将类似的技术用于 VM 映像。

最好的祝愿
凯尔·海利

A test databases that are full copies of production will work much better than a subset for flushing out bugs. Most people subset because they don't have the time and resources to make full copies but all that has changed with virtual data aka copy data management. In such systems you can ingest one copy of production and make thin clone copies in minutes for almost no extra storage. This systems also generally compress the the data so the ingest copy is 1/3 size of the original and have the advantage of fast refresh in minutes which is great in destructive testing. They also tend to come integrated with masking since we probably don't want sensitive data coming out of the production environment into dev and test. There are a number of vendors in this area. Check out Netapp SMO, Actifio, Oracle Snap Clone and Delphix for starters. Most of these require some specialized hardware which can be an obstacle. Delphix is fully software and even has a free version called Delphix Express that you can download. Similar technology is being use by Rubrick and Cohesity for VM images.

Best Wishes
Kyle Hailey

葮薆情 2024-11-09 07:48:53

看起来您正在寻找的是 Jailer。它从关系数据库中导出一致的、引用完整的行集。

this looks like what you look for is Jailer. It exports consistent, referentially intact row-sets from relational databases.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文