Comparison of using Alfresco and Jackrabbit CMS on Liferay Portal
I'd like to know your thoughts on using these two CMSs with Liferay. I know that Jackrabbit is more of a framework and the reference JCR implementation. I'm mostly interested in the situation where you have a Liferay portlet and need a CMS repository other than the Liferay Document Library because you need more features.
What I am concerned about:
Level of metadata extraction from various document formats (I see that both use Apache Tika parsers)
Level of content transformation, for instance dealing with not quite valid PDFs (OCR)
How easily a developer can extend functionality (for instance, implementing various actions on document processing)
Trying both of them would take a lot of time, so I have to decide on one and stick with it.
Thank you
Comments (1)
I've never done anything serious with Jackrabbit, but I have done quite a lot of projects with Alfresco.
Since there's an ongoing joint effort between Alfresco and Liferay to provide a solid and validated integration, Alfresco should at least minimize the integration effort between the two applications, and possibly give you a good starting point for your project.
From a functional point of view, the following applies to Alfresco:
As you noted, Alfresco makes use of Tika for metadata extraction. A number of document types are supported by default, and adding your own custom metadata extractor is quite easy and well documented.
Alfresco will make use of Tika for transformations once project Swift (an upcoming version) is released. As of now, tools like PDFBox and OpenOffice sit behind content transformations, which gives good reliability for the average case.
Offering extension points for the repository is something Alfresco is quite good at: you can hook your code onto events on specific content types, configure rules on folders that get triggered upon creation/update/deletion of their inner content, and so on.
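To make the last point a bit more concrete, here is a rough plain-Java sketch of the general idea of hooking code onto events for specific content types. It is only an illustration: `ContentHooksSketch`, `Event`, `Node`, `on` and `fire` are all made-up names and not part of Alfresco's API, which has its own behaviour and rule mechanisms for this.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical illustration only: none of these types belong to Alfresco.
class ContentHooksSketch {

    /** Repository events a hook can subscribe to (names are assumptions). */
    enum Event { CREATE, UPDATE, DELETE }

    /** Minimal stand-in for a stored document. */
    static final class Node {
        final String name;
        final String contentType;

        Node(String name, String contentType) {
            this.name = name;
            this.contentType = contentType;
        }
    }

    // Hooks keyed by "contentType:EVENT".
    private final Map<String, List<Consumer<Node>>> hooks = new HashMap<>();

    /** Registers a hook for one content type and one event. */
    void on(String contentType, Event event, Consumer<Node> hook) {
        hooks.computeIfAbsent(key(contentType, event), k -> new ArrayList<>()).add(hook);
    }

    /** Runs every hook registered for the node's type and the event; returns how many ran. */
    int fire(Event event, Node node) {
        List<Consumer<Node>> matching =
                hooks.getOrDefault(key(node.contentType, event), Collections.emptyList());
        for (Consumer<Node> hook : matching) {
            hook.accept(node);
        }
        return matching.size();
    }

    private static String key(String contentType, Event event) {
        return contentType + ":" + event;
    }

    public static void main(String[] args) {
        ContentHooksSketch repo = new ContentHooksSketch();
        List<String> log = new ArrayList<>();

        // Only "invoice" documents trigger this hook, and only on creation.
        repo.on("invoice", Event.CREATE, n -> log.add("extract metadata from " + n.name));

        repo.fire(Event.CREATE, new Node("a.pdf", "invoice")); // hook runs
        repo.fire(Event.CREATE, new Node("b.txt", "memo"));    // no hook for this type
        System.out.println(log);
    }
}
```

The point of keying hooks by content type and event is that document-processing actions stay separate from the code that stores the content, which is roughly what you want when adding custom processing steps on top of a repository.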