Informix DB 上索引(索引?)的使用

发布于 2024-10-10 19:29:48 字数 938 浏览 5 评论 0原文

我的 Informix 数据库中有一个包含超过 500 万条记录的表。

这不是实际的表格,但会显示我遇到的问题。

表:销售额
列:sale_id、sale_confirmed、vendor_id、purchase_id
索引: idx1(sale_id)、idx2(sale_confirmed)、idx3(vendor_id)、idx4(purchaser_id)

如果我执行这样的查询:

select *  
from sales  
where sale_confirmed IS NULL  
or sale_confirmed = '' 

那么查询将在大约 4 或 5 秒内完成。

如果我执行这样的查询:

select *  
from sales  
where vendor_id = 12345 
or purchaser_id = 12345

那么查询将在大约 4 或 5 秒内完成。

但是,如果我运行此查询(前 2 个查询的组合):

select *  
from sales  
where (sale_confirmed IS NULL  
        or sale_confirmed = '' )
and (vendor_id = 12345 
       or purchaser_id = 12345)

则该查询运行了 15 分钟,然后我取消了它。

数据库似乎不够智能,无法串联使用不同的索引,即它似乎无法使用 idx2 查找 X 行并在该 X 行中使用 idx3 和 idx4 - 是这样吗,我本以为这样做会足够聪明吗?

有没有办法在处理 WHERE 子句的第二部分时强制数据库使用 idx3 和 idx4?

除了创建新索引之外还有其他解决方案吗?

谢谢。

I have a table with 5 million+ records on an Informix database.

This isn't the actual table but will show the problem I'm having.

Table: sales
Columns: sale_id, sale_confirmed, vendor_id, purchaser_id
Indexes: idx1(sale_id), idx2(sale_confirmed), idx3(vendor_id), idx4(purchaser_id)

If I do a query like this:

select *  
from sales  
where sale_confirmed IS NULL  
or sale_confirmed = '' 

then the query runs to completion in about 4 or 5 seconds.

If I do a query like this:

select *  
from sales  
where vendor_id = 12345 
or purchaser_id = 12345

then the query runs to completion in about 4 or 5 seconds.

However, if I run this query (a combination of the 2 previous queries):

select *  
from sales  
where (sale_confirmed IS NULL  
        or sale_confirmed = '' )
and (vendor_id = 12345 
       or purchaser_id = 12345)

then the query ran for 15 mins before I cancelled it.

The database doesn't seem to be smart enough to use the different indexes in tandem i.e. it doesn't seem to be able to use idx2 to find X number of rows AND use idx3 and idx4 within that X number of rows - is that right, I would have thought it would be smart enough to do this?

Is there a way of forcing the database to use idx3 and idx4 when processing the second part of the WHERE clause?

Any other solutions short of creating new indexes?

Thanks.

如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

扫码二维码加入Web技术交流群

发布评论

需要 登录 才能够评论, 你可以免费 注册 一个本站的账号。

评论(2

帅冕 2024-10-17 19:29:49

尝试使用 UNION,其中每个部分只需选择两个索引:

select *  
from sales  
where (sale_confirmed IS NULL  
        or sale_confirmed = '' )
and vendor_id = 12345


UNION

select *  
from sales  
where (sale_confirmed IS NULL  
        or sale_confirmed = '' )
and purchaser_id = 12345

如果 Informix 支持内联视图,则根据供应商/购买者获取行集,然后从该集中排除未经确认的销售。

select inlineview.*  from
(
select *  from sales  
where vendor_id = 12345 or purchaser_id = 12345
) as inlineview
where (sale_confirmed IS NULL or sale_confirmed = '')

最后,我认为您可能希望删除 sale_confirmed 上的低基数索引。

PS 我的数据库中通常不会有允许空字符串、NULL 和其他值的列。如果你有的话,我会将该字段限制为 BIT 类型,即 1 和 0,默认值为 0。

Try it with a UNION where only two indices have to be chosen from in each part:

select *  
from sales  
where (sale_confirmed IS NULL  
        or sale_confirmed = '' )
and vendor_id = 12345


UNION

select *  
from sales  
where (sale_confirmed IS NULL  
        or sale_confirmed = '' )
and purchaser_id = 12345

And if Informix supports inline views, get the set of rows based on vendor/purchaser and then from that set exclude the unconfirmed sales.

select inlineview.*  from
(
select *  from sales  
where vendor_id = 12345 or purchaser_id = 12345
) as inlineview
where (sale_confirmed IS NULL or sale_confirmed = '')

Finally, I think you might want to drop the low-cardinality index on sale_confirmed.

P.S. I would normally not have a column in my database that allowed the empty string, NULL, and other values. I'd constrain the field probably to a BIT type if you have it, with 1 and 0, with 0 default.

初雪 2024-10-17 19:29:49

“sale_confirmed”上的索引不太可能有用,因为“sale_confirmed”的基数似乎很低(NULL,是,否?)。更好的模式设计会在“sale_confirmed”上强制执行 NOT NULL,并且 CHECK 约束会强制执行“Y”或“N”,默认情况下可能会给您“N”,除非您另外指定。这将避免必须对“sale_confirmed”进行 OR 运算,这很混乱。

Tim 建议的 UNION 技术可能是一个不错的解决方法。

The index on 'sale_confirmed' is unlikely to be useful because the cardinality of 'sale_confirmed' seems to be low (NULL, yes, no?). A better schema design would enforce NOT NULL on 'sale_confirmed' and a CHECK constraint would enforce 'Y' or 'N' and a default could give you 'N' unless you specified otherwise. That would avoid having to do OR operations on 'sale_confirmed', which are messy.

The UNION technique suggested by Tim is likely to be a decent workaround.

~没有更多了~
我们使用 Cookies 和其他技术来定制您的体验包括您的登录状态等。通过阅读我们的 隐私政策 了解更多相关信息。 单击 接受 或继续使用网站,即表示您同意使用 Cookies 和您的相关数据。
原文