Match a phrase starting with a prefix with full text search
I'm looking for a way to emulate something like SELECT * FROM table WHERE attr LIKE '%text%' using a tsvector in PostgreSQL.
I've created a tsvector attribute without using a dictionary. Now, a query like ...
SELECT title
FROM table
WHERE title_tsv @@ to_tsquery('ph:*');
... would return all titles like 'Physics', 'PHP', etc. But how can I create a query that returns all records where the title starts with 'Zend Fram' (which should return, for instance, 'Zend Framework')?
Of course, I could use something like:
SELECT title
FROM table
WHERE title_tsv @@ to_tsquery('zend')
AND title_tsv @@ to_tsquery('fram:*');
However, this seems a little awkward.
So, the question is: is there a way to formulate the query given above using something like:
SELECT title
FROM table
WHERE title_tsv @@ to_tsquery('zend fram:*');
Comments (4)
The query with two conditions from the question is equivalent to a single match against to_tsquery('zend & fram:*'), but of course that finds "Zend has no framework" as well.
You could of course add a regular expression match against title after the tsquery match, but you would have to use EXPLAIN ANALYZE to make sure it is executed after the tsquery match instead of before.
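A minimal sketch of both points, assuming Postgres 9.1 or later. The temporary table demo_and, the 'simple' text search configuration, and the sample rows are illustrative and not taken from the question. The first query is the combined form; the second adds a case-insensitive regular expression on title as a post-filter and wraps it in EXPLAIN ANALYZE so the chosen plan can be inspected.

```sql
-- Hypothetical demo table; names and rows are assumptions for illustration.
CREATE TEMP TABLE demo_and (title text, title_tsv tsvector);
INSERT INTO demo_and (title)
VALUES ('Zend Framework'), ('Zend has no framework');
UPDATE demo_and SET title_tsv = to_tsvector('simple', title);

-- Combined form: 'zend' AND any lexeme starting with 'fram'.
-- Word order and adjacency are not checked, so both rows qualify.
SELECT title
FROM   demo_and
WHERE  title_tsv @@ to_tsquery('simple', 'zend & fram:*');

-- Same match, then a regular expression on the raw title to keep
-- only titles that literally start with 'zend fram' (case-insensitive).
EXPLAIN ANALYZE
SELECT title
FROM   demo_and
WHERE  title_tsv @@ to_tsquery('simple', 'zend & fram:*')
AND    title ~* '^zend fram';
```

On a table this small the planner will most likely use a sequential scan, so the plan is only informative on realistic data with an index in place.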
Postgres 9.6 introduces phrase search capabilities for full text search, so a query string such as 'zend <-> fram:*' works now, with <-> being the FOLLOWED BY operator. It finds 'foo Zend framework bar' or 'Zend frames', but not 'foo Zend has no framework bar'.
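A minimal sketch of such a phrase query, assuming Postgres 9.6 or later. The temporary table demo_phrase, the 'simple' configuration, and the sample rows are illustrative assumptions.

```sql
-- Hypothetical demo table; names and rows are assumptions for illustration.
CREATE TEMP TABLE demo_phrase (title text, title_tsv tsvector);
INSERT INTO demo_phrase (title)
VALUES ('foo Zend framework bar'),
       ('Zend frames'),
       ('foo Zend has no framework bar');
UPDATE demo_phrase SET title_tsv = to_tsvector('simple', title);

-- 'zend' immediately followed by a lexeme that starts with 'fram'.
SELECT title
FROM   demo_phrase
WHERE  title_tsv @@ to_tsquery('simple', 'zend <-> fram:*');
```

Note that the phrase may occur anywhere in the title, not only at its beginning.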
Quoting the release notes for Postgres 9.6:
For best performance support the query with a GIN index:
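A sketch of such an index, assuming a hypothetical table demo_gin with a stored tsvector column title_tsv (both names are illustrative):

```sql
-- Hypothetical table; in practice this would be the existing table.
CREATE TEMP TABLE demo_gin (title text, title_tsv tsvector);

-- GIN index on the stored tsvector column, usable by the @@ operator.
CREATE INDEX demo_gin_title_tsv_idx ON demo_gin USING GIN (title_tsv);
```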
Or don't store title_tsv in the table at all (bloating it and complicating writes). You can use an expression index instead. You need to specify the text search configuration (often language-specific) to make the expression immutable, and adapt the query accordingly.
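A sketch of the expression-index variant, assuming Postgres 9.6 or later, a hypothetical table demo_expr, and the 'english' configuration (pick whichever configuration fits the data). The expression in the query has to match the indexed expression for the index to be considered.

```sql
-- Hypothetical table without a stored tsvector column.
CREATE TEMP TABLE demo_expr (title text);
INSERT INTO demo_expr (title)
VALUES ('Zend Framework'), ('Zend has no framework');

-- Expression index: the two-argument form of to_tsvector is immutable,
-- which is what makes it usable in an index definition.
CREATE INDEX demo_expr_title_fts_idx
ON demo_expr USING GIN (to_tsvector('english', title));

-- The query repeats the exact indexed expression.
SELECT title
FROM   demo_expr
WHERE  to_tsvector('english', title)
       @@ to_tsquery('english', 'zend <-> fram:*');
```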
Not a pretty solution, but it should do the job:
It can be used like:
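How this works:
1. Cast the plain-text query to text: cast(plainto_tsquery('Zend Fram') as text)
2. Use a regular expression to append the :* prefix matcher to each search term: regexp_replace(..., E'(\'\\w+\')', E'\\1:*', 'g')
3. Convert the result back to a tsquery: to_tsquery(...)
4. Use it in the search expression: SELECT title FROM table WHERE title_tsv(title) @@ ...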
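The steps above can be sketched as follows. This is an assumption-laden illustration: it uses the 'simple' configuration, inline sample rows instead of a real table, and to_tsvector('simple', title) in place of the title_tsv(title) call shown in the answer.

```sql
-- Steps 1 and 2: turn the plain query into text and append :* to
-- every quoted lexeme, giving a prefix query string.
SELECT regexp_replace(
         cast(plainto_tsquery('simple', 'Zend Fram') AS text),
         E'(\'\\w+\')', E'\\1:*', 'g') AS prefix_query;

-- Steps 3 and 4: parse that string with to_tsquery and match against it.
SELECT title
FROM  (VALUES ('Zend Framework'), ('Physics')) AS t(title)
WHERE  to_tsvector('simple', title)
       @@ to_tsquery('simple',
            regexp_replace(
              cast(plainto_tsquery('simple', 'Zend Fram') AS text),
              E'(\'\\w+\')', E'\\1:*', 'g'));
```

Because plainto_tsquery joins terms with &, the result is an AND of prefix terms, so word order and adjacency are not enforced.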
There's a way to do it in Postgres using trigrams and GIN/GiST indexes. There's a simple example, with some rough edges, in Kristo Kaiv's article "Substring Search".
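A minimal sketch of the trigram approach, which is not necessarily what the linked article does. It assumes Postgres 9.1 or later, permission to install the pg_trgm extension, and a hypothetical table demo_trgm.

```sql
-- pg_trgm ships with the standard contrib modules.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical demo table; names and rows are assumptions.
CREATE TEMP TABLE demo_trgm (title text);
INSERT INTO demo_trgm (title)
VALUES ('Zend Framework'), ('Physics');

-- Trigram GIN index on the raw text column.
CREATE INDEX demo_trgm_title_idx
ON demo_trgm USING GIN (title gin_trgm_ops);

-- Substring search; with enough rows the planner can use the index
-- for LIKE and ILIKE patterns, including ones with a leading wildcard.
SELECT title
FROM   demo_trgm
WHERE  title ILIKE '%end fram%';
```

Unlike the tsvector approaches, this matches arbitrary substrings, which is closer to the LIKE '%text%' behaviour the question starts from.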