How do I do a free text search in Azure Table Storage?
I have a solution that uses Azure Table Storage, with up to a few thousand "rows" per customer (partition key).
How do I best do a lightning-fast free text search?
Because of the nature of the data, I can't restrict this to whole-word search (e.g. a search for "zur" should match "Azure").
Comments (3)
Just spotted this, which may help you: Azure Library for Lucene
We are using the following in production for our sites: we run hosted Solr (based on Lucene) instances on http://websolr.com and cache the results using the new Azure distributed cache feature, which is currently in beta. That gives us a worst-case latency of 200 ms for an initial search request between the Amazon datacenter where websolr.com runs and the Azure datacenter, and an average of 6 - 10 ms for all cached searches. We also record common search text fragments and try to keep them fresh in the cache.
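To make the caching idea above concrete, here is a minimal sketch in Python. It is not the poster's code: the `CachedSearch` class, the `backend` callable (standing in for the remote Solr request), the TTL value, and the in-process dictionary (standing in for the Azure distributed cache) are all assumptions made for illustration.

```python
import time
from collections import Counter
from typing import Callable, Dict, List, Tuple


class CachedSearch:
    """Cache-aside wrapper around a slow remote search call (illustrative only)."""

    def __init__(self, backend: Callable[[str], List[str]],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._backend = backend   # hypothetical: performs the remote search request
        self._ttl = ttl_seconds   # assumed freshness window
        self._clock = clock
        # A plain dict stands in for the distributed cache in this sketch.
        self._cache: Dict[str, Tuple[float, List[str]]] = {}
        self._hits: Counter = Counter()  # how often each fragment was searched

    @staticmethod
    def _key(fragment: str) -> str:
        # Normalize so "Zur " and "zur" share one cache entry.
        return fragment.strip().lower()

    def search(self, fragment: str) -> List[str]:
        key = self._key(fragment)
        self._hits[key] += 1
        entry = self._cache.get(key)
        if entry is not None and self._clock() - entry[0] < self._ttl:
            return entry[1]       # fast path: served from the cache
        return self._refresh(key)  # slow path: go to the remote service

    def _refresh(self, key: str) -> List[str]:
        results = self._backend(key)
        self._cache[key] = (self._clock(), results)
        return results

    def refresh_popular(self, top_n: int = 10) -> None:
        """Re-query the most common fragments so they stay fresh in the cache."""
        for key, _count in self._hits.most_common(top_n):
            self._refresh(key)
```

A background job could call `refresh_popular()` on a timer so that the most frequent search fragments rarely take the slow path.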
At the moment there is no out-of-the-box solution for this. Perhaps a full text search feature will be announced at PDC10.
So for now you will need to roll your own text indexing solution. The way I have done this is by building a Lucene.net index on a worker role. I then open a TCP port on that worker role and expose a search service over it using WCF. Any web role can then consume that service. This works really well and provides a very fast search service.
There is a PDC09 video by Steve Marx that gives more information: http://www.microsoftpdc.com/2009/SVC16
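As a rough illustration of what "rolling your own" index for substring matches could look like, here is a small in-memory trigram index in Python. This is a swapped-in technique, not the Lucene.net setup described above; the `TrigramIndex` class and its method names are hypothetical, and it assumes the few thousand rows of one partition fit comfortably in memory.

```python
from collections import defaultdict
from typing import Dict, List, Set


class TrigramIndex:
    """Case-insensitive substring search over a small set of rows (sketch)."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}  # row key -> lowercased text
        self._postings: Dict[str, Set[str]] = defaultdict(set)  # trigram -> row keys

    @staticmethod
    def _grams(text: str) -> Set[str]:
        # All overlapping 3-character slices; empty for texts shorter than 3.
        return {text[i:i + 3] for i in range(len(text) - 2)}

    def add(self, row_key: str, text: str) -> None:
        lowered = text.lower()
        self._docs[row_key] = lowered
        for gram in self._grams(lowered):
            self._postings[gram].add(row_key)

    def search(self, query: str) -> List[str]:
        q = query.lower()
        grams = self._grams(q)
        if grams:
            # A matching row must contain every trigram of the query.
            candidates = set.intersection(
                *(self._postings.get(g, set()) for g in grams)
            )
        else:
            # Queries shorter than 3 characters fall back to a full scan.
            candidates = set(self._docs)
        # Verify candidates, since sharing trigrams does not guarantee a match.
        return sorted(k for k in candidates if q in self._docs[k])
```

With rows such as `idx.add("1", "Windows Azure table storage")`, a call to `idx.search("zur")` returns the keys of the rows that contain "Azure". In the architecture described above, an index like this would live on the worker role behind the search service.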