全文索引与pattern_ops索引
我正在使用 django,所有查询都是由 django 创建的,所以我没有手写查询...
我有一个 BillRecords
表,其中有一个字段 subscriberno
。在我的 django 过滤器中,我使用如下过滤查询:
BillRecords.objects.filter(subscriberno__icontains='123456')
由于客户所说的 subscriberno
可能是真实数字的相当缩短的版本...
该过滤器输出如下查询:
SELECT "subscriberno" FROM "BillRecords" WHERE UPPER("subscriberno"::text) LIKE UPPER(E'%123456%');
subscriberno
code> 是一个字符字段,因为某些数字包含字母和一些特殊字符。
在我的数据库中,我的同事为该列创建了两个索引。
"BillRecords_subscriberno" btree (subscriberno)
"BillRecords_fsubscriberno_like" btree (subscriberno varchar_pattern_ops)
我想知道对于这样的查询使用两个索引是否合乎逻辑。由于我们所有的 django 过滤器都使用 icontains
并且应该像我上面写的那样创建查询。
Postgres对查询的分析如下:
Seq Scan on BillRecords (cost=0.00..159782.40 rows=370 width=15) (actual time=579.637..3705.079 rows=10 loops=1)
Filter: (upper((subscriberno)::text) ~~ '%123456%'::text)
Total runtime: 3705.106 ms
(3 rows)
所以,据我所知,没有使用索引。由于索引 usega 在数据插入和更新方面有成本,因此拥有两个没有使用的索引(据我从本分析中可以看出)似乎不合逻辑。
django 是否有可能为类似的 icontanis
过滤器输出不同的查询?或者我的索引完全没用?
I am using django, and all of my queries are created by django, so i have no handwritten queries...
I have a table of BillRecords
, which has a field subscriberno
. In my django filters, i use a filtering query like:
BillRecords.objects.filter(subscriberno__icontains='123456')
Since the subscriberno
the customer said might be quite shortened version of the real number...
That filter outputs a query like:
SELECT "subscriberno" FROM "BillRecords" WHERE UPPER("subscriberno"::text) LIKE UPPER(E'%123456%');
subscriberno
is a char field because some numbers contains alphas and some special chars.
On my database, i have two indexes for that column, created by my colleagues.
"BillRecords_subscriberno" btree (subscriberno)
"BillRecords_fsubscriberno_like" btree (subscriberno varchar_pattern_ops)
I am wondering using two indexes for a such query is logical. Since all of our django filter uses icontains
and that supposed to be create queries like i write above.
Postgres analyse of the query is as follows:
Seq Scan on BillRecords (cost=0.00..159782.40 rows=370 width=15) (actual time=579.637..3705.079 rows=10 loops=1)
Filter: (upper((subscriberno)::text) ~~ '%123456%'::text)
Total runtime: 3705.106 ms
(3 rows)
So, as far as i see, no index is used. Since index usega have costs in data insertion and update, having two indexes with no usage (as far as i can see from this analyse) seemed me not logical.
Is there any channce for django to output different queries for a similar icontanis
filter? Or my indexes are totally useless?
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
您不能在未锚定的 Like 语句上使用索引。
但是,如果您想减少使用,您可能会发现 trigram contrib搜索窗口。
You cannot use an index on an unanchored like statement.
But you might find the trigram contrib of use, if you want to reduce the search window.
包含(子字符串)查询无权访问索引(除非运算符链接到全文模块)。另一方面,开头查询可以从索引中受益。如果基数不太低并且插入通常不是大批量进行而是在 OLTP 场景中进行,则索引开销可以忽略不计。
我是否正确读取统计数据:扫描 370 行几乎需要 4 秒?
PS 您可能会考虑另一种方法:使用基于函数的索引,也许在连接到订阅者名称的前三个字符的最后四个字符上,并使用开头或等于而不是带有搜索的 LIKE -术语以通配符结尾。
Contains (substring) queries do not have access to indexes (unless the operator is linked to a full-text module). Starts-with queries on the other hand can benefit from indexes. Indexing overhead is negligible if the cardinality is not too low and inserts typically are not made in large batches but in an OLTP scenario.
Do I read the stats correctly: almost 4 seconds to scan 370 rows?
P.S. You might consider an alternative approach: using a function-based index, perhaps on the last four characters of subscriberno concatenated to, say, the first three characters of the subscribername, and using starts-with or equals instead of LIKE with the search-term bookended with wildcards.
检查索引是否被使用的一种简单方法是查看
如果所有查询都像您显示的那样,那么它们肯定不会被使用,因为该模式没有锚定。如果你想解决这个问题,你将不得不使用全文搜索、三元组或类似的东西来重新设计你的搜索。
A simple way to check if your indexes are used at all is to look at
If all your queries are like the one you show, then they certainly won't be used, because the pattern is not anchored. If you want to address that, you will have to re-engineer your search a bit by using full-text search, trigrams, or something like that.