自动完成/前缀匹配,如 google:: Trie/Dwag/sphinx/lucene
您好,听说 Trie 最适合自动建议/自动完成。 但 dwag 使用的空间更少,所以我想 dwag 应该更好。
另外,如果 Sphinx/Lucene 可以进行前缀匹配,那么为什么我们不应该使用它呢?
Trie/dwag 也适合小桌子?
Hi have heard Trie is best for auto suggester/auto complete.
But dwag uses less space so I guess dwag should be better.
Also if Sphinx/Lucene can do prefix matching so why shouldn't we use it.
Also Trie/dwag is good for small tables ??
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。
绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(1)
我在 DAWG 中看到的问题; DAWG 相对复杂。您可以通过所有单词获得唯一的路径(以便将某些数据与您通常想要的键相关联),但这比使用仅前缀压缩且具有唯一终端节点的 trie 困难得多。仅当您的内存非常有限(例如在嵌入式设备上)时,通过使用 DAWG 获得的后缀压缩才值得,但自从您提到 Lucene 以来,您可能就不是这样了。
为此任务设计了前缀树。太完美了。如果您需要自己编写,那就应该去那里。当然,已经有很多可用的库,如果您没有特殊需求,那么它们非常适合。
The problem I see with DAWGs; DAWGs are relatively complex. You can get a unique path through all words (in order to associate some data with the key, which you usually want), but it's a lot harder than using a trie which is only prefix-compressed and will have unique terminal nodes. The suffix compression you get by using a DAWG is only worth it if you're very memory limited (like on an embedded device), which you probably aren't since you mention Lucene.
A prefix tree is designed for this task. It's perfect. If you need to write your own, that's where you'd go. Of course, there are also many libraries available for this already, perfectly suitable if you don't have special needs.