Building unique tag clouds with Ferret

Posted 2024-08-14 09:06:20

I've been using Ferret as my full-text search engine in a small project I'm working on.

Using the documentation and a few examples online, I've been able to put together a tag cloud generator that reads terms from the full-text index with the IndexReader.terms method.

It has worked quite well until now, but I now want to get term data based on a search result.

For example, if the user searches for "cake", I want to show them a tag cloud of terms used in association with the term "cake".

I've been looking for examples where the terms method is used together with a search result set or something similar. Is there a way to do this?

Currently I'm using the following method to generate my list of tags:

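require 'ferret'

# Open a reader on the most recent index version.
reader = Ferret::Index::IndexReader.new(Scrape.find_last_index_version)
terms = []
# Collect every term in the :all_quotes field with its document frequency.
reader.terms(:all_quotes).each do |term, doc_freq|
  terms << [term, doc_freq]
end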

Cheers.

Comments (1)

攀登最高峰 2024-08-21 09:06:20

This sounds more like a term frequency chart (like a Wordle) than a tag cloud. Or are these terms in a tag field? Either way, the index doesn't keep track of term frequencies within each possible subset of documents (such as the results of a search), so such a method wouldn't be fast even if it existed. For a single document, you can get the TermFreqVector and use it to suggest documents that are good matches for other frequent terms in that document. So you could take some of the top results, grab the term vector from each one, and add them up yourself. Those aggregate functions don't exist natively (they generally try not to put slow operations in there).
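As a rough sketch of the "add them up" step, the aggregation itself can be done in plain Ruby. The helper name aggregate_term_freqs and the sample data below are made up for illustration. The commented-out part shows how the per-document pairs might be collected with Ferret; it is untested and assumes an index object that supports search_each, a reader.term_vector(doc_id, field) call, and that the :all_quotes field was indexed with term vectors stored.

```ruby
# Sum per-document term frequencies into one tally for a set of results.
# doc_term_freqs holds one entry per document, each an array of [term, freq] pairs.
# Terms listed in exclude (for example the search term itself) are skipped.
def aggregate_term_freqs(doc_term_freqs, exclude: [])
  totals = Hash.new(0)
  doc_term_freqs.each do |pairs|
    pairs.each do |term, freq|
      next if exclude.include?(term)
      totals[term] += freq
    end
  end
  # Highest total first; ties are broken alphabetically so the order is stable.
  totals.sort_by { |term, freq| [-freq, term] }
end

# Hypothetical stand-in for the term vectors of the top search results.
# With Ferret they might be collected roughly like this (assumption, untested):
#   top_results = []
#   index.search_each("cake", :limit => 10) do |doc_id, _score|
#     tv = reader.term_vector(doc_id, :all_quotes)
#     top_results << tv.terms.map { |t| [t.text, t.freq] }
#   end
top_results = [
  [["cake", 3], ["chocolate", 2], ["icing", 1]],
  [["cake", 1], ["chocolate", 1], ["sponge", 4]]
]

tags = aggregate_term_freqs(top_results, exclude: ["cake"])
tags.each { |term, freq| puts "#{term}: #{freq}" }
```

Limiting this to the top handful of results keeps the cost bounded, since every extra document means another term vector to read and merge.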
