Best way to do keyword searches on Amazon SimpleDB using EC2 and Asp.Net?
I am wondering if anyone has any thoughts on the best way to perform keyword searches on Amazon SimpleDB from an EC2 Asp.Net application.
A couple of options I am considering are:
1) Add keywords to a multi-value attribute and search with a query like:
select id from keywordTable where keyword ='firstword' intersection keyword='secondword' intersection keyword = 'thirdword'
2) Create a webservice frontend to Katta:
3) A queued Lucene.Net update service that periodically pushes the Lucene index to the cloud. (to get around the 'locking' issue)
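As a rough illustration of option 1, the sketch below builds the `intersection` select expression from a list of search terms. It is written in Python only to keep it short; the same string-building logic carries over to C#. The function name `build_keyword_query` is hypothetical, the domain and attribute names are taken from the query above, and sending the expression to SimpleDB through a client library is left out.

```python
def build_keyword_query(keywords, domain="keywordTable", attribute="keyword"):
    """Return a SimpleDB select expression matching items that carry every keyword.

    Hypothetical helper: only the expression string is built here; executing it
    against SimpleDB is out of scope for this sketch.
    """
    if not keywords:
        raise ValueError("at least one keyword is required")

    def quote(value):
        # SimpleDB escapes a single quote inside a string literal by doubling it.
        return "'" + value.replace("'", "''") + "'"

    # One equality predicate per keyword, combined with the intersection operator
    # so that a single multi-valued attribute must contain all of the words.
    predicates = [f"{attribute} = {quote(word)}" for word in keywords]
    return f"select id from {domain} where " + " intersection ".join(predicates)


if __name__ == "__main__":
    print(build_keyword_query(["firstword", "secondword", "thirdword"]))
```

Escaping the quotes matters because the search terms usually come straight from user input.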
Comments (3)
If you are looking for a strictly SimpleDB solution (as per the question as stated), Katta and Lucene won't help you. If you are merely looking for an 'Amazon infrastructure' based solution, then any of the choices will work.
All three options differ in terms of how much setup and management you'll have to do and deciding which is best depends on your actual requirements.
SimpleDB with a multi-valued attribute named Keyword is your best choice if you need simplicity and minimal administration, and you don't need to sort by relevance. There is nothing to set up or administer, and you'll only be charged for your actual CPU and bandwidth usage.
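To make the multi-valued Keyword approach concrete, here is a minimal sketch of turning a piece of text into the list of values you might store on an item. It assumes lowercase normalization (so that exact-match queries behave predictably) and respects two SimpleDB limits: 256 attribute name-value pairs per item and 1024 bytes per value. The function name `extract_keywords` and the `reserved_pairs` parameter are hypothetical.

```python
import re

MAX_PAIRS_PER_ITEM = 256   # SimpleDB limit on attribute name-value pairs per item
MAX_VALUE_BYTES = 1024     # SimpleDB limit on the size of a single attribute value


def extract_keywords(text, reserved_pairs=0):
    """Return unique lowercase keywords from text, in first-seen order.

    reserved_pairs is the number of name-value pairs the item already uses for
    its other attributes (an assumption of this sketch, not a SimpleDB setting).
    """
    budget = MAX_PAIRS_PER_ITEM - reserved_pairs
    keywords = {}  # dict preserves insertion order and removes duplicates
    for word in re.findall(r"\w+", text.lower()):
        if len(keywords) >= budget:
            break  # the item has no room for more Keyword values
        if len(word.encode("utf-8")) <= MAX_VALUE_BYTES:
            keywords.setdefault(word, None)
    return list(keywords)


if __name__ == "__main__":
    print(extract_keywords("Amazon SimpleDB and amazon EC2"))
```

Text with more unique words than the budget allows is silently truncated here; a real application would have to decide whether to drop, rank, or spread the extra keywords over several items.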
Lucene is a great choice if you need more than keyword searching, but you'll have to manage updates to the index yourself. You'll also have to manage the load balancing, backups and failover that you would have gotten with SimpleDB. If you don't care about failover and can tolerate downtime while you do a restore in the event of an EC2 crash, then that's one less thing to worry about and one less reason to prefer SimpleDB.
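Managing index updates yourself usually means funnelling all writes through a single writer, since Lucene allows only one writer per index at a time (presumably the 'locking' issue mentioned in the question). The sketch below shows that queue-and-flush pattern in isolation. It makes no Lucene calls: a plain dictionary stands in for the index, and `QueuedIndexUpdater` and the `publish` callback are hypothetical names.

```python
import queue


class QueuedIndexUpdater:
    """Collects updates from many producers and applies them from one writer."""

    def __init__(self, publish):
        self._pending = queue.Queue()   # thread-safe hand-off from request threads
        self._index = {}                # stand-in for a Lucene index: doc id -> text
        self._publish = publish         # hypothetical callback that ships a snapshot

    def enqueue(self, doc_id, text):
        """Called by any request thread; text=None means 'delete this document'."""
        self._pending.put((doc_id, text))

    def flush(self):
        """Called periodically by the single writer; returns the number of updates applied."""
        applied = 0
        while True:
            try:
                doc_id, text = self._pending.get_nowait()
            except queue.Empty:
                break
            if text is None:
                self._index.pop(doc_id, None)
            else:
                self._index[doc_id] = text
            applied += 1
        if applied:
            # Push a copy so readers never see a half-updated index.
            self._publish(dict(self._index))
        return applied


if __name__ == "__main__":
    updater = QueuedIndexUpdater(publish=print)
    updater.enqueue("1", "amazon simpledb")
    updater.enqueue("2", "lucene index")
    updater.flush()
```

In a real service, `flush` would run on a timer and `publish` would copy the committed index files to shared storage; both are left as assumptions here.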
With Katta on EC2 you'd be managing everything yourself. You'd have the most flexibility and the most work to do.
Just to tidy up this question... We wound up using Lightspeed's SimpleDB provider, Solr and SolrNet by writing a custom search provider for Lightspeed.
Info on implementing the ISearchEngine interface for Lightspeed:
http://www.mindscape.co.nz/blog/index.php/2009/02/25/lightspeed-writing-a-custom-search-engine/
And this is the Solr Library we are using:
http://code.google.com/p/solrnet/
Since Solr can be easily scaled using EC2 machines, this made the most sense to us.
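For readers unfamiliar with Solr, a keyword search against it boils down to an HTTP request to its select handler. The sketch below only builds such a request URL; it is not the Lightspeed or SolrNet integration described above. The base URL, the `keyword` field name, and the function name `build_solr_select_url` are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_solr_select_url(keywords, base_url="http://localhost:8983/solr",
                          field="keyword", rows=10):
    """Return a Solr select URL that requires every keyword in the given field.

    base_url and field are placeholders; adjust them to the actual Solr
    instance and schema.
    """
    if not keywords:
        raise ValueError("at least one keyword is required")

    def phrase(word):
        # Quote each term so query-syntax characters inside it are treated literally.
        return '"' + word.replace("\\", "\\\\").replace('"', '\\"') + '"'

    query = f"{field}:(" + " AND ".join(phrase(w) for w in keywords) + ")"
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(build_solr_select_url(["ec2", "simpledb"]))
```

Unlike the SimpleDB `intersection` query, results from Solr come back ranked by relevance.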
Simple Savant is an open-source .NET persistence library for SimpleDB which includes integrated support for full-text search using Lucene.NET (I'm the Simple Savant creator).
The full-text indexing approach is described here.