Parsing free-text (natural-language) queries with Solr
I'm trying to build a query parsing algorithm for a local search site that can classify a free-text search query (a single input text box) into the various types of searches possible on the site.
For example, the user could type "chinese restaurants near xyz". How should I go about breaking that down into Cuisine:"chinese", locality:"xyz", given that
- there could be spelling mistakes
- keywords may match in different columns, e.g. a restaurant may have "chinese" in its name
This is not really a natural-language-parsing problem, since we're searching within a very limited set of possibilities.
My initial thought is to dump all values of a particular type from the database into a field, and match the user's query against all those fields. Then, based on the score (and a predefined confidence level), divide the query into the 3-4 search fields like name/cuisine/locality.
Is there a better/standard way of doing this?
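A minimal, Solr-independent sketch of that match-and-score idea (the per-field vocabularies and the 0.8 confidence threshold here are made up for illustration; in practice the vocabularies would be dumped from the database tables):

```python
from difflib import SequenceMatcher

# Hypothetical per-field vocabularies, as dumped from the database.
FIELD_VALUES = {
    "cuisine": ["chinese", "italian", "thai"],
    "locality": ["xyz", "downtown"],
    "name": ["china garden", "pasta house"],
}

CONFIDENCE = 0.8  # predefined confidence threshold

def classify(query):
    """Assign each query token to the field whose vocabulary it matches best."""
    fields = {}
    for token in query.lower().split():
        best_field, best_score = None, 0.0
        for field, values in FIELD_VALUES.items():
            for value in values:
                # Fuzzy similarity, so minor misspellings still match.
                score = SequenceMatcher(None, token, value).ratio()
                if score > best_score:
                    best_field, best_score = field, score
        if best_score >= CONFIDENCE:
            fields.setdefault(best_field, []).append(token)
    return fields
```

Tokens below the confidence threshold (like "restaurants near" here) are simply dropped; a real implementation might instead fall back to searching them across all fields.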
1 Answer
About spelling mistakes, you have to work with a dictionary/thesaurus. This can be part of your pre-processing and normalization.
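Solr also ships a SpellCheckComponent that can suggest corrections from an index-backed dictionary; as a plain-Python sketch of the pre-processing idea (the term list here is hypothetical):

```python
from difflib import get_close_matches

# Hypothetical dictionary of known terms (cuisines, localities, etc.).
DICTIONARY = ["chinese", "italian", "restaurant", "xyz"]

def normalize(token):
    """Map a possibly misspelled token to its closest dictionary entry."""
    matches = get_close_matches(token.lower(), DICTIONARY, n=1, cutoff=0.8)
    return matches[0] if matches else token
```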
About querying across multiple columns, you can do: cuisine:chinese OR restaurant_name:chinese
You can boost one of the two: cuisine:chinese^0.8 OR restaurant_name:chinese
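That boosted multi-field expansion can be generated per token; a small sketch (the field names and boost values are just examples):

```python
def build_query(token, field_boosts):
    """Expand one token into a boosted multi-field Solr query clause."""
    clauses = []
    for field, boost in field_boosts.items():
        clause = f"{field}:{token}"
        if boost != 1.0:
            clause += f"^{boost}"  # Solr boost syntax: field:term^boost
        clauses.append(clause)
    return " OR ".join(clauses)
```

With the dismax/edismax query parsers, Solr can also do this expansion natively via the `qf` parameter, e.g. `qf=cuisine^0.8 restaurant_name`.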