Does it make sense to use neo4j to index a file system?
I am working on a Java-based backup client that scans the file system and populates a SQLite database with the directories and file names it finds to back up. Would it make sense to use neo4j instead of SQLite? Would it be more performant and easier to use for this application? I was thinking that because a filesystem is a tree (or a graph, if you consider symbolic links), a graph database might be suitable. The SQLite schema defines only two tables, one for directories (full path and other info) and one for files (name only, with a foreign key to the containing directory in the directory table), so it's relatively simple.
The application needs to index many millions of files, so the solution needs to be fast.
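For readers who want something concrete, here is a minimal sketch of the two-table layout described above, written in Python with the standard `sqlite3` module so that it runs as-is. The table and column names (`directories`, `files`, `directory_id`) and the helpers `open_index` and `index_tree` are assumptions made for illustration, not the asker's actual schema; the same SQL should carry over to a Java client through a JDBC driver.

```python
import sqlite3

# Hypothetical schema: one row per directory (full path), one row per file
# (name only, plus a foreign key to the containing directory).
SCHEMA = """
CREATE TABLE IF NOT EXISTS directories (
    id   INTEGER PRIMARY KEY,
    path TEXT NOT NULL UNIQUE
);
CREATE TABLE IF NOT EXISTS files (
    id           INTEGER PRIMARY KEY,
    directory_id INTEGER NOT NULL REFERENCES directories(id),
    name         TEXT NOT NULL,
    UNIQUE (directory_id, name)
);
"""


def open_index(db_path=":memory:"):
    """Open (or create) the index database and make sure the tables exist."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    return conn


def index_tree(conn, walk):
    """Store every (dirpath, dirnames, filenames) triple from an
    os.walk-style iterable, using one transaction for the whole scan."""
    with conn:  # commits once at the end, rolls back on error
        for dirpath, _dirnames, filenames in walk:
            cur = conn.execute(
                "INSERT INTO directories (path) VALUES (?)", (dirpath,)
            )
            dir_id = cur.lastrowid
            conn.executemany(
                "INSERT INTO files (directory_id, name) VALUES (?, ?)",
                ((dir_id, name) for name in filenames),
            )
```

A real scan could pass `os.walk(root)` as the `walk` argument. Wrapping the whole scan in a single transaction rather than committing per row is usually what matters most for SQLite insert throughput.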
Comments (3)
As long as you can perform the DB operations essentially by string matching on the stored file system paths, using a relational database makes sense. Once the data model gets more complex and your queries can no longer be expressed as string matching but need to traverse a graph, a graph database will make this much easier.
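To illustrate what "string matching on the stored paths" can look like, here is a small sketch of a subtree query ("all files below a directory") against a directories/files layout like the one in the question. The schema, the sample data and the helper names `demo_index` and `files_under` are assumptions made for this example. It uses a range comparison instead of `LIKE`, so `%` or `_` in a path needs no escaping, and it assumes paths are stored without a trailing slash and compared with SQLite's default binary collation.

```python
import sqlite3


def demo_index():
    """Build a tiny in-memory index with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE directories (id INTEGER PRIMARY KEY, path TEXT NOT NULL UNIQUE);
        CREATE TABLE files (
            id INTEGER PRIMARY KEY,
            directory_id INTEGER NOT NULL REFERENCES directories(id),
            name TEXT NOT NULL
        );
        INSERT INTO directories (id, path) VALUES
            (1, '/home/a'), (2, '/home/a/docs'), (3, '/home/ab');
        INSERT INTO files (directory_id, name) VALUES
            (1, 'notes.txt'), (2, 'cv.pdf'), (3, 'other.txt');
    """)
    return conn


def files_under(conn, root):
    """Return the full paths of all files in `root` or any directory below it."""
    root = root.rstrip("/")
    # Every descendant path starts with root + '/'. In byte order '0' is the
    # character right after '/', so [root + '/', root + '0') covers exactly
    # the strings with that prefix.
    lo, hi = root + "/", root + "0"
    rows = conn.execute(
        """
        SELECT d.path || '/' || f.name
        FROM files AS f
        JOIN directories AS d ON d.id = f.directory_id
        WHERE d.path = ? OR (d.path >= ? AND d.path < ?)
        ORDER BY 1
        """,
        (root, lo, hi),
    )
    return [row[0] for row in rows]
```

Calling `files_under(demo_index(), "/home/a")` returns the files in `/home/a` and `/home/a/docs` but not the one in the sibling `/home/ab`. A query that has to follow symbolic links to their targets cannot be written as a prefix match like this, which is the kind of case where graph traversal starts to pay off.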
As I understand it, one of the earliest uses of Neo4j was to do exactly this, as part of the CMS system that Neo4j originated from.
Lucene, the indexing backend for Neo4j, will allow you to build any indexes you might need.
You should read up on that and ask them directly.
I am considering a similar solution to index a data store on a filesystem. The remark about the queries above is right.
Examples of worst-case queries:
For sqlite:
For neo4j:
Greetings, hj