Where do I start learning Lucene.NET, Solr, Hadoop and MapReduce?

Posted on 2024-09-14 13:52:22

I'm a .NET developer and I need to learn Lucene so we can run a very large-scale search service that filters out entries the end user doesn't have access to (i.e. a user can search all documents with clearance level 3 or higher, but not documents with clearance level 2 or 1).

Where do I start learning, and which products should I consider? To be honest, I'm a little overwhelmed, but I'm determined to figure it all out... eventually.

Comments (2)

永不分离 2024-09-21 13:52:22

If you want a book that covers all the basics of Lucene, consider "Lucene in Action". Even though the code samples are in Java, you can easily port them to .NET. There are also plenty of resources on the web, such as SO and the Lucene mailing lists, which should help you along.

For the project you describe, you should look at Solr, since it abstracts away many of the scalability issues and integrates easily into a .NET app via SolrNet. To restrict access by level, your indexed documents should contain a field called "Level" (say), and behind the scenes you append a "Level:Level-1" clause to the user's query using a boolean query construct.
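To make the level restriction concrete, here is a minimal sketch of how the extra clause might be attached to a request. It is not the answer's exact method: it assumes "Level" is indexed as a numeric field, and it uses Solr's standard `fq` (filter query) parameter with a range query instead of appending a boolean clause to `q`. The base URL, the core name `docs`, and the helper name `build_select_url` are hypothetical.

```python
from urllib.parse import urlencode


def build_select_url(base_url, user_query, min_level):
    """Build a Solr /select URL limited to documents with Level >= min_level.

    Assumptions (not from the original answer): Level is a numeric field,
    and min_level comes from the server-side session, never from user input.
    """
    if not isinstance(min_level, int):
        raise TypeError("min_level must be an int")
    params = [
        ("q", user_query),                    # what the user typed
        ("fq", f"Level:[{min_level} TO *]"),  # access filter, inclusive range
        ("wt", "json"),                       # ask for a JSON response
    ]
    # urlencode percent-escapes the query syntax so it survives the URL
    return f"{base_url}/select?{urlencode(params)}"


url = build_select_url("http://localhost:8983/solr/docs", "title:budget", 3)
```

Keeping the restriction in `fq` means it is intersected with whatever the user searches for, so nothing typed into `q` can widen the result set. The same shape should carry over to SolrNet or to a hand-built Lucene boolean query.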

At this stage, my recommendation would be to stay away from Hadoop (the Apache MapReduce implementation) for your project and stick with Solr. If you are keen to learn about it anyway, it too has a very useful book: you guessed it, "Hadoop in Action" (also from Manning Publications).

云仙小弟 2024-09-21 13:52:22

You seem to be confused about what exactly each project (Lucene, Solr, Hadoop, etc.) does. So the first thing to do is to understand the purpose of each project. Read the docs and blogs about them. If possible, buy and read books about them.

For example, MapReduce and Hadoop have nothing to do with your security requirements. Hadoop is a platform for distributed, scalable computing. But Solr is scalable on its own. You might want to use Hadoop to distribute a crawler though (e.g. Nutch).
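To illustrate why MapReduce is a batch-computation model and not a search or access-control feature, here is a toy, single-process sketch of the map, shuffle and reduce steps. It does not use Hadoop at all. The function names and the sample task (counting documents per clearance level) are made up for illustration.

```python
from collections import defaultdict


def map_phase(doc):
    """Map step: emit one (key, value) pair per document."""
    yield (doc["level"], 1)


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single result."""
    return key, sum(values)


def count_by_level(docs):
    """Run the three steps in-process; Hadoop would spread them over many machines."""
    pairs = (pair for doc in docs for pair in map_phase(doc))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


counts = count_by_level([{"level": 1}, {"level": 3}, {"level": 3}])
```

A job like this processes a whole data set offline, which is a different problem from answering one user's query with a permission filter.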
