Where can I learn more about Google Search's "Did you mean" algorithm?

Posted on 2024-09-24 08:25:08

Possible Duplicate:
How do you implement a “Did you mean”?

I am writing an application where I require functionality similar to Google's "did you mean?" feature used by their search engine:

[screenshot of Google's "Did you mean:" suggestion]

Is there source code available for such a thing or where can I find articles that would help me to build my own?


Comments (11)

折戟 2024-10-01 08:25:08

You should check out Peter Norvig's article about implementing a spell checker in a few lines of Python: How to Write a Spelling Corrector. It also has links to implementations in other languages (e.g. C#).
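
For a sense of what the article describes, here is a compressed sketch of that approach (a paraphrase, not Norvig's exact code): count word frequencies from a corpus, generate candidates within one or two edits of the input, and return the most frequent known candidate. The corpus file name big.txt is an assumption; any large plain-text file will do.

    import re
    from collections import Counter

    def words(text):
        return re.findall(r'[a-z]+', text.lower())

    # Word frequencies from a large corpus; "big.txt" is a placeholder name.
    WORDS = Counter(words(open('big.txt').read()))

    def edits1(word):
        # All strings one edit (delete, transpose, replace, insert) away from word.
        letters = 'abcdefghijklmnopqrstuvwxyz'
        splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes = [L + R[1:] for L, R in splits if R]
        transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
        replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
        inserts = [L + c + R for L, R in splits for c in letters]
        return set(deletes + transposes + replaces + inserts)

    def known(candidates):
        # Keep only candidates that actually occur in the corpus.
        return {w for w in candidates if w in WORDS}

    def correction(word):
        # Prefer the word itself, then distance-1 edits, then distance-2 edits.
        candidates = (known([word]) or known(edits1(word)) or
                      known(e2 for e1 in edits1(word) for e2 in edits1(e1)) or
                      [word])
        return max(candidates, key=WORDS.get)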

晨曦慕雪 2024-10-01 08:25:08

I attended a seminar by a Google engineer a year and a half ago, where they talked about their approach to this. The presenter was saying that (at least part of) their algorithm has little intelligence at all; but rather, utilises the huge amounts of data they have access to. They determined that if someone searches for "Brittany Speares", clicks on nothing, and then does another search for "Britney Spears", and clicks on something, we can have a fair guess about what they were searching for, and can suggest that in future.

Disclaimer: This may have just been part of their algorithm
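
To make that description concrete, here is a rough, hypothetical sketch of the idea (not Google's actual pipeline): when a query that got no clicks is immediately followed in the same session by a query that did get a click, treat the second query as a candidate correction for the first and count how often each pair occurs.

    from collections import Counter, defaultdict

    def mine_reformulations(sessions):
        """sessions: ordered lists of (query, got_click) events, one list per user session."""
        pair_counts = defaultdict(Counter)
        for events in sessions:
            for (q1, clicked1), (q2, clicked2) in zip(events, events[1:]):
                # A no-click query followed by a clicked query looks like a correction.
                if not clicked1 and clicked2 and q1 != q2:
                    pair_counts[q1][q2] += 1
        return pair_counts

    def did_you_mean(query, pair_counts, min_support=2):
        suggestions = pair_counts.get(query)
        if suggestions:
            best, count = suggestions.most_common(1)[0]
            if count >= min_support:
                return best
        return None

    # Toy example with made-up log data:
    log = [
        [("brittany spears", False), ("britney spears", True)],
        [("brittany spears", False), ("britney spears", True)],
        [("mac ftp", True)],
    ]
    counts = mine_reformulations(log)
    print(did_you_mean("brittany spears", counts))  # -> "britney spears"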

苦笑流年记忆 2024-10-01 08:25:08

Python has a module called difflib. It provides a function called get_close_matches. From the Python documentation:

get_close_matches(word, possibilities[, n][, cutoff])

Return a list of the best "good enough" matches. word is a sequence for which close matches are desired (typically a string), and possibilities is a list of sequences against which to match word (typically a list of strings).

Optional argument n (default 3) is the maximum number of close matches to return; n must be greater than 0.

Optional argument cutoff (default 0.6) is a float in the range [0, 1]. Possibilities that don't score at least that similar to word are ignored.

The best (no more than n) matches among the possibilities are returned in a list, sorted by similarity score, most similar first.

  >>> get_close_matches('appel', ['ape', 'apple', 'peach', 'puppy'])
  ['apple', 'ape']
  >>> import keyword
  >>> get_close_matches('wheel', keyword.kwlist)
  ['while']
  >>> get_close_matches('apple', keyword.kwlist)
  []
  >>> get_close_matches('accept', keyword.kwlist)
  ['except']

Could this library help you?
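
As a quick illustration of how get_close_matches could back a "did you mean" prompt, here is a minimal sketch; the list of known queries is a made-up stand-in for whatever vocabulary or past-query log your application has.

    from difflib import get_close_matches

    KNOWN_QUERIES = ["britney spears", "configure ftp", "python difflib"]

    def did_you_mean(query, known=KNOWN_QUERIES, cutoff=0.8):
        # Only suggest an alternative when the query itself is not already known.
        if query in known:
            return None
        matches = get_close_matches(query, known, n=1, cutoff=cutoff)
        return matches[0] if matches else None

    print(did_you_mean("brittany spears"))  # -> "britney spears"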

唔猫 2024-10-01 08:25:08

You can use http://developer.yahoo.com/search/web/V1/spellingSuggestion.html, which provides similar functionality.

海的爱人是光 2024-10-01 08:25:08

You can check out the source code for Xapian which provides this functionality, as do a lot of other search libraries. http://xapian.org/

流绪微梦 2024-10-01 08:25:08

I am not sure if it serves your purpose, but a string edit distance algorithm with a dictionary might suffice for a small application.
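
For reference, here is a minimal sketch of that idea: the classic Levenshtein (edit) distance computed with dynamic programming, used to pick the closest dictionary word. The dictionary and distance threshold are made-up examples.

    def levenshtein(a, b):
        # previous/current hold edit distances between prefixes of a and b.
        previous = list(range(len(b) + 1))
        for i, ca in enumerate(a, start=1):
            current = [i]
            for j, cb in enumerate(b, start=1):
                insert_cost = current[j - 1] + 1
                delete_cost = previous[j] + 1
                replace_cost = previous[j - 1] + (ca != cb)
                current.append(min(insert_cost, delete_cost, replace_cost))
            previous = current
        return previous[-1]

    def suggest(word, dictionary, max_distance=2):
        # Return the nearest dictionary word, but only if it is close enough.
        best = min(dictionary, key=lambda w: levenshtein(word, w))
        return best if levenshtein(word, best) <= max_distance else None

    print(suggest("speling", ["spelling", "spearing", "sparing"]))  # -> "spelling"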

救赎№ 2024-10-01 08:25:08

I'd take a look at this article on Google bombing. It shows that the feature simply suggests answers based on previously entered results.

唐婉 2024-10-01 08:25:08

AFAIK the "did you mean ?" feature doesn't check the spelling. It only gives you another query based on the content parsed by google.

゛清羽墨安 2024-10-01 08:25:08

A great chapter on this topic can be found in the openly available Introduction to Information Retrieval.

骄傲 2024-10-01 08:25:08

You could use n-grams for the comparison: http://en.wikipedia.org/wiki/N-gram

Using the Python ngram module: http://packages.python.org/ngram/index.html

    import ngram

    # Index the known strings so they can be searched by n-gram similarity.
    G2 = ngram.NGram(["iis7 configure ftp 7.5",
                      "ubunto configre 8.5",
                      "mac configure ftp"])

    print("String", "\t", "Similarity")
    # search() returns (string, similarity) pairs above the threshold,
    # sorted with the most similar first.
    for candidate, similarity in G2.search("iis7 configurftp 7.5", threshold=0.1):
        print(candidate, "\t", similarity)

You get:

    String  Similarity
    "iis7 configure ftp 7.5"    0.76
    "mac configure ftp"     0.24
    "ubunto configre 8.5"   0.19