Lucene.Net, SQL Server, NHibernate, ASP.NET MVC

Posted on 2024-09-15 09:37:54

I am using these technologies: SQL Server 2005, ASP.NET MVC, and NHibernate/Sharp Architecture, and I would like to mine some text with the final aim of presenting some web-based stats. I have several million keywords and several million documents, and I would like to run queries against these documents, indexed by the keywords. I have played a bit with SQL Server's full-text indexing, but I am not too impressed. So I am wondering whether Lucene.Net might be an alternative.

I have never used Lucene.Net, but I understand that it is a 1:1 port of the Java version. So my first question is: provided that Lucene is the right technology, is it worth studying the book "Lucene in Action"?

Thanks.

Best wishes,

Christian

Comments (1)

迷荒 2024-09-22 09:37:54

Well,

First, upgrade SQL Server. You are using a version that is two generations out of date; its full-text search was an early implementation with many shortcomings that are known and have since been fixed.

Second, Lucene may well be better suited. SQL Server is primarily a database server; its full-text search does a lot of things, but it also has a lot of limitations.

But adopting Lucene DOES add a significant complication: distributed transactions and backup handling become a lot more complicated, because you now have two systems to keep consistent. SQL Server 2008 R2 does a much better job here (the full-text index is stored in the database file).

That said, also be careful with performance. You may need quite a high-end server if you want to run a lot of queries in parallel (which can easily happen with a web application). This may require multiple database servers running read-only replicas, something SQL Server makes much easier than Lucene (as in: out of the box).

I suggest you just get Lucene and play with it ;) Not a lot more is needed.
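
Before playing with it, it may help to see the basic idea that keyword search over documents rests on: an inverted index that maps each term to the documents containing it. The following is only a toy sketch in plain Python, not the Lucene or Lucene.Net API; the function names (`tokenize`, `build_index`, `search_all`, `document_frequency`) and the sample documents are made up for illustration, and a real engine adds analysis, scoring, and on-disk storage on top of this.

```python
import re
from collections import defaultdict

# Hypothetical tokenizer: lowercase alphanumeric runs only.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


def build_index(documents):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search_all(index, keywords):
    """Return ids of documents that contain every keyword (boolean AND)."""
    postings = [index.get(k.lower(), set()) for k in keywords]
    if not postings:
        return set()
    return set.intersection(*postings)


def document_frequency(index, keywords):
    """Count, per keyword, how many documents contain it (a simple stat)."""
    return {k: len(index.get(k.lower(), set())) for k in keywords}


# Made-up sample data, standing in for the real document table.
docs = {
    1: "Lucene is a full text search library",
    2: "SQL Server offers full text indexing",
    3: "NHibernate maps objects to SQL tables",
}
index = build_index(docs)

print(search_all(index, ["full", "text"]))
print(document_frequency(index, ["sql", "lucene", "mvc"]))
```

The per-keyword document counts are the kind of number a stats page could be built from; with millions of keywords and documents, the same lookups would go through a real index rather than in-memory sets.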
