Derby 内存数据库：select into 语句内存不足

发布于 2024-12-21 10:52:58 字数 682 浏览 1 评论 0原文

我必须在项目中使用带有一些巨大桌子的德比。为了获得一些性能，我从三个不同的表中选择文档 ID 到一个巨大的表中以查找文档类型。

基本上是一个非常简单的查询：

"INSERT INTO DOC_TYPES SELECT DOC_ID, 1 FROM TYPE_A_DOCUMENT"
"INSERT INTO DOC_TYPES SELECT DOC_ID, 2 FROM TYPE_B_DOCUMENT"
"INSERT INTO DOC_TYPES SELECT DOC_ID, 3 FROM TYPE_C_DOCUMENT"

当我在“正常”模式下运行这个查询时，德比在我的硬盘上，处理大约需要 1 分钟，最终在数据库中得到大约 650 万行（巨大，我知道...... .）

无论如何，根据我的口味，数据库仍然很慢，所以我试图在内存中运行所有内容。不幸的是，即使堆大小为 4GB（我的硬盘驱动器上的完整数据库从未超过 1GB），操作很快就会出现“java/lang/OutOfMemoryError”错误。

我真的没有看到任何替代方案或解决方法来解决这个问题。 Derby 不支持物化视图，而在 Java 中执行此操作（SELECT、交互结果集、INSERT）将花费数小时...

您也无法“组合”该语句，因为 Derby 似乎不支持 LIMIT。

有什么建议/想法吗？

问候，迈克尔

原文

I have to use a derby with some huge tables in a project. To get some performance, I'm selecting document IDs from three different tables into one huge table for look-up of the doc type.

Basically a very simple query:

"INSERT INTO DOC_TYPES SELECT DOC_ID, 1 FROM TYPE_A_DOCUMENT"
"INSERT INTO DOC_TYPES SELECT DOC_ID, 2 FROM TYPE_B_DOCUMENT"
"INSERT INTO DOC_TYPES SELECT DOC_ID, 3 FROM TYPE_C_DOCUMENT"

When I run this in "normal" mode, with the derby on my hard drive, it takes about 1 Minute to process and I end up with about 6.5 Million rows in the DB (huge, I know...)

Anyway, the database is still way to slow for my taste, so I was trying to run everything in-memory. Unfortunately, even with 4GB heap size (the full DB on my hard drive never exceeds 1GB), the operation results pretty fast in a "java/lang/OutOfMemoryError".

I don't really see any alternatives or workarounds to solve this problem. Derby doesn't support materialized views and doing this in Java (SELECT, interate resultset, INSERT) would take hours...

You also can't "chunk up" the statement, since Derby doesn't seem to support LIMIT.

Any suggestions/ideas?

Regards,
Michael

分享到QQ

分享到微博

如果你对这篇内容有疑问，欢迎到本站社区发帖提问参与讨论，获取更多帮助，或者扫码二维码加入 Web 技术交流群。

发布评论

需要登录才能够评论，你可以免费注册一个本站的账号。

不可一世的女人 2024-12-28 10:52:58

您可以尝试通过模 id 创建块，将模运算的第二个值设置为您需要的块数 (n)。并从 0..(n-1) 重复比较

INSERT INTO DOC_TYPES SELECT DOC_ID, 1 FROM TYPE_A_DOCUMENT WHERE DOC_ID % 2 = 0
INSERT INTO DOC_TYPES SELECT DOC_ID, 1 FROM TYPE_A_DOCUMENT WHERE DOC_ID % 2 = 1

并为每个插入提交事务。

You could try to create chunks by modulo id, set the second value of the modulo operation to the number of chunks (n) you need. and repeat the comparision from 0..(n-1)

INSERT INTO DOC_TYPES SELECT DOC_ID, 1 FROM TYPE_A_DOCUMENT WHERE DOC_ID % 2 = 0
INSERT INTO DOC_TYPES SELECT DOC_ID, 1 FROM TYPE_A_DOCUMENT WHERE DOC_ID % 2 = 1

and commit the transaction for each insert.

回复收藏 0 原文

~没有更多了~