查询性能,多个“OR”与单独的声明
假设我有一个 40MM 9 位邮政编码的 Oracle 数据库表,其中包括每个邮政编码的 4 位邮政编码。我有 800 个 5 位邮政编码的列表,我需要找到所有相关的 9 位邮政编码。总共有 40K 个 5 位邮政编码。假设我们已经使用 B*Tree(传统)索引对 5 位邮政编码字段进行了索引。考虑到数据库性能和可扩展性,最好的方法是:
在单个语句中查询所有记录?像这样的事情运行一次:
<块引用>从邮政编码中选择 *,其中 ZIP5 IN (:1, :2,...:800)
对每个 5 位邮政编码查询一次?像这样的东西运行了 800 次:
<块引用>从邮政编码中选择 *,其中 ZIP5 = :1
某些批量大小介于两者之间?
你觉得怎么样,为什么?如果您处于思考状态,一些[可选]后续想法:
- 我们可以运行哪些测试来验证您的想法?
- 最佳方法是否会随着数量的变化而变化(例如,如果我们有 100 个 9 位邮政编码映射到 10 个 5 位邮政编码,则查找这 10 个中的 3 个)?
- 对于数据库来说最好的事情与对于吸收查询结果的应用程序服务器来说最好的事情是否不同?
Let's say I have an Oracle database table of 40MM 9-digit ZIP codes which includes the 4-digit ZIP code for each. I have a list of 800 5-digit ZIP codes and I need to find all the associated 9-digit ZIP codes. There are 40K 5-digit ZIP codes altogether. Assume we have indexed the 5-digit ZIP code field with a B*Tree (conventional) index. With database performance and scalability in mind, is the best method to:
Query for all the records in a single statement? Something like this run once:
SELECT * FROM ZIPCODE WHERE ZIP5 IN (:1, :2,... :800)
Query once for each of the 5-digit ZIP codes? Something like this run 800 times:
SELECT * FROM ZIPCODE WHERE ZIP5 = :1
Some batch size in between the two?
What do you think, and why? Some [optional] follow-up thoughts if you're in a thinking mood:
- What tests can we run to validate your thinking?
- Does the best method change as the volumes change (e.g., if we have 100 9-digit ZIP codes mapped to 10 5-digit ZIP codes, looking for 3 of those 10)?
- Is the best thing for the database different than the best thing for the app servers which are absorbing the query results?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(2)
让数据库服务器决定如何处理。即使它在内部执行了 800 次查询,它仍然会更快。它只需解析查询一次,并且只需发送结果一次。
因此,请使用 SELECT * FROM ZIPCODE WHERE ZIP5 IN (:1, :2,... :800)
Let the database server decide how to handle it. Even if it internally does a query 800 times, it will still be faster. It has to parse the query only once, and has to send results only once.
Thus, use
SELECT * FROM ZIPCODE WHERE ZIP5 IN (:1, :2,... :800)
您可以选择创建外部表吗?即,
将 zip5 文本文件放入 Oracle 目录对象指定的操作系统目录中,
然后执行:
这是比每次 ZIP5 列表更改时重建查询更通用的解决方案。
Do you have the option to create an external table? I.e.,
Place your zip5 text file in the OS directory specified by your Oracle directory object,
then execute:
This is a more generic solution than rebuilding your query each time your ZIP5 list changes.