Lucene.NET + Azure + AzureDirectory, or something else?
Good morning.
I am currently working on a project that was originally going to be hosted on a physical server with SQL Server 2008 R2, but it looks like we are moving towards the cloud and Azure. Since SQL Azure does not currently support Full Text Indexing, I have been looking at Lucene.NET with the AzureDirectory project for back-end storage. The way this will work is that updates will come in and be queued. Once processed, they will be placed in a ToIndex queue, which will kick off Lucene.NET indexing. I am just wondering whether there is a better way of doing this. We don't need to use Azure for this project, so if there is a better solution somewhere, please tell us. The main requirement for hosting is that it is in Europe (the Azure and Amazon data centers in Dublin are handy; RackSpace in the US is not so handy).
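To make the planned flow concrete, here is a minimal sketch of the two-stage queue pipeline described above. It is only an illustration under stated assumptions: the in-process `Queue` objects stand in for cloud queues, a plain dict stands in for the Lucene.NET index, and `process_update` and `drain` are hypothetical names, not part of Lucene.NET or AzureDirectory.

```python
from queue import Queue


def process_update(update):
    # Hypothetical processing step: tidy the incoming record before indexing.
    return {"id": update["id"], "text": update["text"].strip()}


def drain(updates, to_index, index):
    # Stage 1: process raw updates and place them on the ToIndex queue.
    while not updates.empty():
        to_index.put(process_update(updates.get()))
    # Stage 2: the indexer consumes the ToIndex queue.
    # A real indexer would write to Lucene.NET here; a dict stands in for it.
    while not to_index.empty():
        doc = to_index.get()
        index[doc["id"]] = doc["text"]
    return index


updates, to_index = Queue(), Queue()
updates.put({"id": 1, "text": "  hello azure  "})
updates.put({"id": 2, "text": "lucene index "})
index = drain(updates, to_index, {})
```

Keeping the two stages on separate queues means the processing step and the indexing step can be scaled or restarted independently.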
Thanks.
Comments (3)
I haven't used that project, but it looks promising. From what I understand, the basic issue is that Lucene requires a file system. I see two other possible solutions (basically just doing what the library does):
http://go.microsoft.com/?linkid=9710117
SQLite also has full-text search available, but it has the same basic issue: it requires a file system.
http://www.sqlite.org/fts3.html
I have another solution for you, but it's a bit more radical, and a bit more of a conceptual one.
You could create your own indexes using Azure Table storage. Create a partition for each word in your documents. Since all tables are indexed on the PartitionKey, a single-word search should be fast, and multi-word searches can be handled by joining the per-word results in memory.
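As a rough sketch of that idea, the snippet below keeps one "partition" per word and intersects the per-word results in memory for multi-word queries. It is conceptual only: a Python dict stands in for Table storage, and the `WordIndex` class and its method names are hypothetical, not an Azure SDK API.

```python
import re
from collections import defaultdict


class WordIndex:
    """In-memory stand-in for a table partitioned by word."""

    def __init__(self):
        # Partition key (word) -> set of row keys (document ids).
        self.partitions = defaultdict(set)

    def add(self, doc_id, text):
        # One entry per distinct word in the document.
        for word in set(re.findall(r"\w+", text.lower())):
            self.partitions[word].add(doc_id)

    def lookup(self, word):
        # Single-partition lookup; returns a copy so callers can mutate it.
        return set(self.partitions.get(word.lower(), set()))

    def search(self, query):
        # Multi-word search: intersect the per-word results in memory.
        words = re.findall(r"\w+", query.lower())
        if not words:
            return set()
        result = self.lookup(words[0])
        for word in words[1:]:
            result &= self.lookup(word)
        return result


idx = WordIndex()
idx.add("a", "Lucene on Azure")
idx.add("b", "Azure table storage")
```

With these two documents, `idx.search("azure")` matches both, while `idx.search("azure lucene")` narrows to the first. This sketch does no stemming, ranking, or phrase matching, which a library like Lucene would provide.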
You could host it as an Azure Website as long as your Lucene index is less than 1GB.
I did this recently when I rewrote Ask Jon Skeet to be hosted as a self-contained Azure Website. It uses WebBackgrounder to poll the Stack Overflow API for changes before updating the Lucene index.