Apache Cassandra querying / full-text search
I've been playing around with Apache's Cassandra project. I've done a fair bit of reading and have worked through some fairly complex examples, including inserting single and batch sets of data and retrieving single and multiple data sets by key.
Some of the articles I've looked at include:
http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example
http://github.com/digg/lazyboy
http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model
http://www.sodeso.nl/?p=80
I've got a fairly good grasp of the concepts explained and have even implemented a simple app.
None of the articles describe how one would go about performing a query where, for example, the query is a search term a user has typed in.
Does anyone know how, or can anyone suggest how, I'd go about performing such a query?
Or perhaps a way to create a searchable index, full-text search, or anything even remotely close?
Comments (1)
You will probably split the text into words, and then use these words as keys to your "index". Each word maps to a timestamp-ordered column family holding the list of IDs of your articles, messages, etc. This means you can only perform simple searches over keys (words).
When searching for more than one word, intersect these column families.
This is a very simple approach; if you need more complex queries, look at Lucandra (http://github.com/tjake/Lucandra), a full-text search engine with Cassandra as backend storage.
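To make the word-as-key idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions: a plain in-memory dict stands in for the Cassandra column family, and the names `index`, `tokenize`, `index_document` and `search` are made up for this example, not part of any Cassandra client API. In a real setup you would replace the dict reads and writes with calls to whatever client library you use.

```python
import re
from collections import defaultdict

# In-memory stand-in for the "index" column family (an assumption, not a
# Cassandra API): row key = word, column name = (timestamp, doc_id).
# Sorting the column names yields the timestamp ordering described above.
index = defaultdict(dict)


def tokenize(text):
    """Split text into a set of lowercase words (very naive tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def index_document(doc_id, text, timestamp):
    """Add one column per word, pointing back at the document ID."""
    for word in tokenize(text):
        index[word][(timestamp, doc_id)] = ""


def search(query):
    """Return sorted IDs of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    # One set of document IDs per query word; a missing word gives an empty set.
    id_sets = [{doc_id for (_, doc_id) in index.get(word, {})} for word in words]
    # Multi-word search is the intersection of the per-word ID sets.
    return sorted(set.intersection(*id_sets))
```

For example, after `index_document("a1", "Cassandra data model", 1)` and `index_document("a2", "Cassandra full text search", 2)`, calling `search("cassandra search")` intersects the ID sets for the two words. Note that this only does exact word matches, with no stemming, ranking or phrase queries, which is where something like Lucandra comes in.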