PostgreSQL 中字符串列类型的索引数组
是否可以在文本数组类型的列上创建索引。尝试使用GIN索引,但查询似乎没有使用这些索引。
-- Example:
CREATE TABLE users (
name VARCHAR(100),
groups TEXT[],
);
-- Query:
SELECT name FROM users WHERE ANY(groups) = 'Engineering';
另外,在groups
列上有效执行GROUP BY
的最佳方法是什么,以便它可以提供groups
和计数。
Is it possible to create an index on a column with type of text array. Tried using GIN indexes, but queries do not seem to be using those indexes.
-- Example:
CREATE TABLE users (
name VARCHAR(100),
groups TEXT[],
);
-- Query:
SELECT name FROM users WHERE ANY(groups) = 'Engineering';
Also what is the best way to perform GROUP BY
on groups
column efficiently so that it can give groups
and count.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
可以使用 gin 索引:
结果:
在数组上使用聚合函数,这将是另一个问题。函数 unnest() 可能会有所帮助。
为什么不规范化你的数据?这将解决所有问题,包括许多您尚未遇到的问题。
A gin index can be used:
Result:
Using aggregate functions on an array, that will be another problem. The function unnest() might help.
Why don't you normalize your data? That will fix all problems, including many problems you didn't encouter yet.
我认为处理这个问题的最佳方法是标准化你的模型。由于我没有尝试,以下内容可能会包含错误,但想法应该很清楚:
生成的执行计划应该已经相当高效了;您仍然可以通过索引 ON user_group(group) 进行优化,这允许使用 index_scan 而不是顺序扫描来查找特定组的成员。
I think the best way to handle this would be to normalize your model. The following will probably contain errors as I didn't try it, but the idea should be clear:
The resulting execution plan should be fairly efficient already; you can optimize still by indexing ON user_group(group), which allows an index_scan rather than a sequential_scan to find the members of a particular group.
从 PostgreSQL 9.6 开始,现在有 array_to_tsvector() 函数。
https://pgpedia.info/a/array_to_tsvector.html
所以也许非常有效的是创建tsvectors 并对其使用 gin 索引。
There is array_to_tsvector() function as of PostgreSQL 9.6 now.
https://pgpedia.info/a/array_to_tsvector.html
So maybe quite efficient is to create tsvectors and use gin index on them.