What languages, frameworks, and technologies do you use to implement document search?
I am at a new company, and one of our goals is to implement a document search portal for our team and our clients. I am a bit worried that if we use an external cloud service provider such as Salesforce or some other ECM, there will be a lot of integration work in the future. From a client's perspective, these documents will also live in the same bucket as our structured content (which is stored in the DB, not in MS Word documents).
If you have implemented document searching, what languages, frameworks, and technologies have you used? Do you have any failure stories? I don't have a problem using something out of the box, but I think it is important that we have control over the documents and the API to access them. I would like to use Rails if we go fully custom.
Comments (2)
Depending on your licensing needs, Lucene (Apache License 2.0) and Xapian (GPL) are both great, mature, fast search engine APIs, with bindings or ports for a lot of languages. I've used both of them with great success.
Lucene is probably the safest choice because it is widely used and quite good.
The easiest way to benefit from Lucene is probably through Alfresco, which is a breeze to install and uses Lucene by default. You just install Alfresco, put your documents in the repository, and search them through its powerful web search interface.
If you need to search programmatically, my recommendation is to use Alfresco's CMIS interface, which lets you run searches over REST. The JCR API is also available.
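To give a feel for what a programmatic CMIS search looks like, here is a minimal sketch in Python that only builds the request URL and does not send it. It assumes the CMIS 1.1 browser binding rather than any particular binding the answer had in mind, and the endpoint URL, the function names, and the selected columns are all my own placeholders; check the host, path, and supported bindings against your Alfresco version before relying on it.

```python
from urllib.parse import urlencode

# Assumed endpoint: host, port, and path are placeholders and depend on
# your Alfresco version and configuration.
CMIS_BROWSER_URL = (
    "http://localhost:8080/alfresco/api/-default-/public/cmis/versions/1.1/browser"
)


def build_cmis_query(term: str) -> str:
    """Return a CMIS query statement that full-text searches documents for `term`."""
    # Simplified escaping for the string literal only: backslashes and single
    # quotes. The full-text grammar inside CONTAINS() has further rules that
    # this sketch does not cover.
    escaped = term.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "SELECT cmis:objectId, cmis:name FROM cmis:document "
        f"WHERE CONTAINS('{escaped}')"
    )


def build_search_url(term: str, max_items: int = 10) -> str:
    """Return a GET URL for a browser-binding query (hypothetical helper)."""
    params = {
        "cmisselector": "query",  # selects the query service
        "q": build_cmis_query(term),
        "maxItems": max_items,
    }
    return f"{CMIS_BROWSER_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("invoice"))
```

The resulting URL could be fetched with any HTTP client (with authentication added), so a Rails app could issue the same request without depending on Java.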