Extend JackRabbit or build from Lucene?

Posted 2024-08-18 19:11:06

I've been working on a site idea. The general concept is a full-text search of documents that also allows user ratings; based on these ratings, I want to boost an item's value in the Lucene index. But I'm trying to decide whether I should extend JackRabbit or just build from the Lucene base. Is there any good way to extend JackRabbit in this way and affect the index, or would it be best to work directly off Lucene?

Either way, I am strongly leaning toward using Groovy on Grails, with either the Searchable plugin or JackRabbit directly. Are there any major reasons I should just stick to Java?

Clarification:

I would like to boost an item based on its average user rating. Is JackRabbit open or extensible enough that I can capture user ratings and then have them affect the index within JackRabbit, or is this so far outside the core of JackRabbit that I should just build up from Lucene?

Comments (4)

风筝在阴天搁浅。 2024-08-25 19:11:06

I recommend using JCR, with Jackrabbit as the implementation behind it. JCR lets you separate what you store from how you store it.

By staying within a JCR framework, you should be able to easily switch among JCR implementations. (There are several, not just Apache's.) Even within Jackrabbit there are many persistence managers to choose from for storage, while Lucene backs the search index. This flexibility is useful when you want to trade off between storage space and performance.

JCR already includes full text searches and the ability to maintain user ratings. It should be a good fit for your project.

肤浅与狂妄 2024-08-25 19:11:06

is there any major reasons I should just stick to Java?

Not really. As you probably already know, you can use any Java library with Groovy/Grails, so there's nothing you can do in Java that you can't do in Groovy. Although the converse is also true, in my experience it takes a lot more (boilerplate) code to get things done in Java.

Although Java is considerably faster than Groovy, this doesn't necessarily mean your app will be faster if written in Java, as the bottleneck could well be the database rather than code execution.

As for whether you should use Lucene/Searchable or JackRabbit, it's very difficult to say without knowing more about what you want to achieve. All you've told us so far is that you want to index documents and boost certain items in the index. You can certainly do both of those with Lucene.
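
To make the boosting idea concrete, here is a minimal sketch of the arithmetic only, using plain Java and no Lucene or JackRabbit calls. The class name `RatingBoost`, the 1-to-5 star scale, and the formula `1 + average / max` are all assumptions for illustration; how the resulting factor gets fed into the index is deliberately left out.

```java
import java.util.List;

/** Hypothetical helper: turns user ratings into a multiplicative score boost. */
final class RatingBoost {
    static final double MAX_RATING = 5.0; // assumed 1-to-5 star scale

    /** Average of the ratings, or 0.0 when nobody has rated the item yet. */
    static double averageRating(List<Integer> ratings) {
        if (ratings.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (int r : ratings) {
            sum += r;
        }
        return sum / ratings.size();
    }

    /** Boost in [1.0, 2.0]: unrated items keep their score, top-rated items double it. */
    static double boostFactor(double averageRating) {
        return 1.0 + averageRating / MAX_RATING;
    }

    /** Combines a text-relevance score with the rating boost. */
    static double boostedScore(double textScore, List<Integer> ratings) {
        return textScore * boostFactor(averageRating(ratings));
    }
}
```

A factor that starts at 1.0 means unrated documents are not penalized; whether that is the right trade-off depends on how much ratings should outweigh text relevance.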

韶华倾负 2024-08-25 19:11:06

I would recommend using JCR/Jackrabbit on top of Lucene for a couple of reasons:

1) Your repository structure could readily support document nodes with child nodes that store all of your meta-data including owner, ratings, flagging, comments, etc.

2) JCR is ideal for document/node based app development, providing a lot of the heavy lifting at the framework level while not getting in your way at the app level.
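
The node layout from point 1 can be sketched roughly as follows. This is an in-memory stand-in written in plain Java, not the `javax.jcr` API: `ContentNode`, `DocumentExample`, and the node and property names (`report`, `ratings`, `stars`, `owner`) are all hypothetical and only show the shape of the tree.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a repository node; not the javax.jcr API. */
final class ContentNode {
    final String name;
    final Map<String, Object> properties = new LinkedHashMap<>();
    final List<ContentNode> children = new ArrayList<>();

    ContentNode(String name) {
        this.name = name;
    }

    /** Creates a child node, attaches it, and returns it for further setup. */
    ContentNode addChild(String childName) {
        ContentNode child = new ContentNode(childName);
        children.add(child);
        return child;
    }

    /** Returns the first child with the given name, or null if there is none. */
    ContentNode child(String childName) {
        for (ContentNode c : children) {
            if (c.name.equals(childName)) {
                return c;
            }
        }
        return null;
    }
}

final class DocumentExample {
    /** Builds one document node with owner and ratings kept as metadata. */
    static ContentNode buildDocument() {
        ContentNode doc = new ContentNode("report");
        doc.properties.put("owner", "alice");
        ContentNode ratings = doc.addChild("ratings");
        ratings.addChild("rating-1").properties.put("stars", 4);
        ratings.addChild("rating-2").properties.put("stars", 5);
        doc.addChild("comments");
        return doc;
    }

    /** Averages the "stars" property over all children of the ratings node. */
    static double averageStars(ContentNode doc) {
        ContentNode ratings = doc.child("ratings");
        if (ratings == null || ratings.children.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (ContentNode r : ratings.children) {
            sum += (Integer) r.properties.get("stars");
        }
        return sum / ratings.children.size();
    }
}
```

Keeping each rating as its own child node makes it possible to recompute the average whenever a rating is added, which is the value a rating-based boost would need.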

儭儭莪哋寶赑 2024-08-25 19:11:06

I would recommend Apache Sling, which comes with Jackrabbit/Lucene built in.
Most of its committers are also involved with Jackrabbit, so it's designed to work well with it -- even better, it's designed to run on top of it.

One of the nice features of Sling is that it mounts the entire JCR repository in the URL space and exposes it via REST endpoints.
So you can access your documents/metadata very easily by doing a simple HTTP request to it. It also allows you to write your own servlets and expose them as REST endpoints. (This is extremely easy -- no fiddling about with applicationContext.xml files, just 1 annotation)
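
As a rough illustration of the "simple HTTP request" part, the sketch below builds a GET request for a node's JSON rendering with the JDK's `java.net.http` client (Java 11 or later). It assumes a default Sling setup in which appending `.json` to a node path returns that node as JSON; the base URL `http://localhost:8080`, the path `/content/docs/report`, and the class name `SlingRequestSketch` are made up for the example.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class SlingRequestSketch {
    /** Builds (but does not send) a GET request for a node's JSON rendering. */
    static HttpRequest jsonRequestFor(String baseUrl, String nodePath) {
        // Assumption: the server renders a node as JSON when ".json" is appended.
        URI uri = URI.create(baseUrl + nodePath + ".json");
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/json")
                .GET()
                .build();
    }
}
```

Sending the request would then be a matter of `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a running instance.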

It also lets you write scripts in JSP, ESP, Groovy, ...
