How can I make keyword order more relevant in my search?

Posted 2024-09-03 07:02:14


In my database, I have a keywords field that stores a comma-delimited list of keywords.

For example, a Shrek doll might have the following keywords:

ogre, green, plush, hero, boys' toys

A "Beanie Baby" doll (that happens to be an ogre) might have:

beanie baby, kids toys, beanbag toys, soft, infant, ogre

(That's a completely contrived example.)

If a consumer searches for "ogre", I'd like the "Shrek" doll to come up higher in the search results.

My content administrator feels that if the keyword is earlier in the list, it should get a higher ranking. (This makes sense to me, and it makes it easy for me to let them control the search result relevance.)

Here's a simplified query:

SELECT
p.ProductID         AS ContentID
, p.ProductName     AS Title
, p.ProductCode     AS Subtitle
, 100               AS Rank
, p.ProductKeywords AS Keywords
FROM Products AS p
WHERE FREETEXT( p.ProductKeywords, @SearchPredicate )

I'm thinking something along the lines of replacing the RANK with:

, 200 - INDEXOF(@SearchTerm)            AS Rank

This "should" rank the keyword results by their relevance.

I know INDEXOF isn't a SQL command... but it's something LIKE that I would like to accomplish.

Am I approaching this the right way?

Is it possible to do something like this?

Does this make sense?


独﹏钓一江月 2024-09-10 07:02:14

Given what you have and your need not to modify existing structures, this would work, though it mostly shows how lame SQL Server is at string manipulation. Walking through the logic:

DECLARE
  @ProductKeywords varchar(100)
 ,@SearchPredicate varchar(10)

SET @ProductKeywords = 'The,quick,brown,fox,jumps,over'
SET @SearchPredicate= 'fox'

--  Where in the string your search value is
print charindex(@SearchPredicate, @ProductKeywords)

--  The string up to and including the first character of your search string
print left(@ProductKeywords, charindex(@SearchPredicate, @ProductKeywords))

--  Remove the commas (your delimiter) from the above
print replace(left(@ProductKeywords, charindex(@SearchPredicate, @ProductKeywords)), ',', '')

--  This is how many characters are left
print len(replace(left(@ProductKeywords, charindex(@SearchPredicate, @ProductKeywords)), ',', ''))

--  This is how many delimiters you removed,
--  = how many keywords come before the found word in the list,
--  = a weighting factor you can use
print charindex(@SearchPredicate, @ProductKeywords) - len(replace(left(@ProductKeywords, charindex(@SearchPredicate, @ProductKeywords)), ',', ''))

Replace @ProductKeywords with p.ProductKeywords and it should do the trick. (Please note that I have zero experience with the full-text query engine. It may or may not have an impact on this code.)
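For what it's worth, here is a rough sketch of how that expression might slot into the query from the question. It is not a definitive implementation: the table variable and its two rows are just the question's example data, @SearchTerm is assumed to hold a single literal keyword, and a plain CHARINDEX test stands in for the FREETEXT predicate so the batch runs without a full-text index. The CASE guard is there because CHARINDEX returns 0 when the literal term is not found, which would otherwise produce the top rank of 200.

```sql
-- Sketch only: sample data from the question in a table variable.
DECLARE @Products TABLE (
    ProductID       int,
    ProductName     varchar(50),
    ProductKeywords varchar(200)
);

INSERT INTO @Products (ProductID, ProductName, ProductKeywords)
VALUES (1, 'Shrek doll',  'ogre, green, plush, hero, boys'' toys'),
       (2, 'Beanie Baby', 'beanie baby, kids toys, beanbag toys, soft, infant, ogre');

DECLARE @SearchTerm varchar(50);
SET @SearchTerm = 'ogre';

SELECT
      p.ProductID       AS ContentID
    , p.ProductName     AS Title
      -- FoundAt minus the comma-free length = number of commas before the term
    , CASE WHEN f.FoundAt = 0 THEN 0
           ELSE 200 - (f.FoundAt - LEN(REPLACE(LEFT(p.ProductKeywords, f.FoundAt), ',', '')))
      END               AS [Rank]
    , p.ProductKeywords AS Keywords
FROM @Products AS p
CROSS APPLY (SELECT CHARINDEX(@SearchTerm, p.ProductKeywords) AS FoundAt) AS f
WHERE f.FoundAt > 0     -- stand-in for FREETEXT( p.ProductKeywords, @SearchPredicate )
ORDER BY [Rank] DESC;
```

With this sample data the Shrek doll gets a Rank of 200 (no commas before "ogre") and the Beanie Baby gets 195 (five commas before it). Keep in mind that CHARINDEX matches substrings, so a term can also be found inside a longer keyword.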


凉薄对峙 2024-09-10 07:02:14

Can I suggest another way?

What about having a linking table, ProductKeywords:

ID_ProductKeyword(pk)
ProductID(int)
KeywordID(int)
Weight(int)

This expresses the relationship between keyword and product, as well as how important the particular keyword is for that particular product (a higher weight corresponds to a higher index).

A side benefit is that you can update the weights dynamically based on how frequently the keyword gives the right product for the users. Or, you can add keyword associations easily if you find that you missed one (did they search for Myers before remembering that Shrek is an ogre?)

My two cents.
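To make that a bit more concrete, here is a minimal sketch of what a lookup against such a table could look like. The ProductKeywords columns are the ones listed above; the separate Keywords table, the sample rows, and every weight value are assumptions invented purely for illustration, and table variables are used only so the batch runs on its own.

```sql
-- Sketch only: table variables stand in for real tables.
DECLARE @Products TABLE (
    ProductID   int PRIMARY KEY,
    ProductName varchar(50)
);
DECLARE @Keywords TABLE (          -- hypothetical lookup table for keyword text
    KeywordID int PRIMARY KEY,
    Keyword   varchar(50)
);
DECLARE @ProductKeywords TABLE (
    ID_ProductKeyword int IDENTITY(1, 1) PRIMARY KEY,
    ProductID         int,
    KeywordID         int,
    Weight            int
);

INSERT INTO @Products (ProductID, ProductName)
VALUES (1, 'Shrek doll'), (2, 'Beanie Baby');

INSERT INTO @Keywords (KeywordID, Keyword)
VALUES (1, 'ogre'), (2, 'plush'), (3, 'beanie baby');

-- Made-up weights: 'ogre' matters a lot for Shrek, little for the Beanie Baby.
INSERT INTO @ProductKeywords (ProductID, KeywordID, Weight)
VALUES (1, 1, 100), (1, 2, 60), (2, 3, 100), (2, 1, 20);

DECLARE @SearchTerm varchar(50);
SET @SearchTerm = 'ogre';

SELECT
      p.ProductID   AS ContentID
    , p.ProductName AS Title
    , pk.Weight     AS [Rank]
FROM @Keywords AS k
JOIN @ProductKeywords AS pk ON pk.KeywordID = k.KeywordID
JOIN @Products        AS p  ON p.ProductID  = pk.ProductID
WHERE k.Keyword = @SearchTerm
ORDER BY pk.Weight DESC;
```

With these invented weights, a search for "ogre" returns the Shrek doll ahead of the Beanie Baby, and the content administrator controls relevance by editing Weight rather than by reordering a comma-delimited string.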
