Looking for a mature, scalable GraphDB with .NET or C++ bindings
My basic requirements from a GraphDB:
- Mature (production-ready)
- Native .NET or C++ language binding
- Horizontal scalability, meaning both of the following:
  - Automated data redundancy and sharding
  - Distributed graph algorithms / query execution
So far I have disqualified the following:
- InfiniteGraph: no C++ / .NET language binding
- HyperGraphDB: no C++ / .NET language binding
- Microsoft Trinity: not mature
- Neo4j: not distributed
I'm not sure about the scalability of the following:
- Sparsity DEX
- Franz Inc. AllegroGraph
- Sones GraphDB
The information I could find about their horizontal scalability capabilities is quite general. I guess there are good reasons for this.
Any information would be appreciated.
Comments (1)
Unfortunately, your basic requirements already go beyond today's general understanding of graphs, even in academia. None of the pure graph databases listed will be able to satisfy all of your needs. Distributed graph algorithms that are aware of large, distributed but interconnected graphs are still a major research topic. So for your application it might be best to find a well-matching graph database, graph processing stack or RDF store and implement the missing parts on your own.
If your application is mostly online transactional graph processing (OLTP, read/write heavy) with a focus on the vertices, and you can do without distributed algorithms for the moment, then use one of these:
If it is more online analytical processing (OLAP, mostly read), still with a focus on the vertices, and distribution really matters, then:
Or, if the focus is more on the edges and on logical reasoning/pattern matching, and you need (or, better, can live with) distribution at the edge level as in the Semantic Web, then use one of these RDF/triple/quad stores:
Good starting points might be DEX or Neo4j. If you're looking for a good and really fast graph database kernel for C++, DEX might be best, but you would have to implement a lot of the networking and distribution yourself. Neo4j has a lot of distribution and fault tolerance, but at the moment more at a vertex sharding level, and its kernel is Java. For ideas and inspiration on implementing distributed graph algorithms, perhaps take a look at Golden Orb and Signal/Collect.
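To give a feel for what "implementing the distribution yourself" involves, here is a minimal single-process sketch of a vertex-centric computation: vertices are split across simulated workers, send their state along edges, and update from what they receive. It does not use the API of Golden Orb, Signal/Collect, DEX or Neo4j; the names `shard_of` and `connected_components`, the modulo sharding rule and the choice of connected components as the algorithm are all assumptions made for illustration.

```python
from collections import defaultdict


def shard_of(vertex, num_workers):
    """Assign a vertex to a worker (hypothetical rule: vertex id modulo workers)."""
    return vertex % num_workers


def connected_components(edges, num_vertices, num_workers=2):
    """Label every vertex with the smallest vertex id in its component.

    Returns (labels, remote_messages), where remote_messages counts the
    messages that had to cross from one simulated worker to another.
    """
    # Each worker only stores the adjacency lists of the vertices it owns.
    adjacency = [defaultdict(list) for _ in range(num_workers)]
    for u, v in edges:
        adjacency[shard_of(u, num_workers)][u].append(v)
        adjacency[shard_of(v, num_workers)][v].append(u)

    # Each worker only stores the labels of the vertices it owns.
    labels = [dict() for _ in range(num_workers)]
    for v in range(num_vertices):
        labels[shard_of(v, num_workers)][v] = v

    # In the first round every vertex is active.
    active = [set(labels[w]) for w in range(num_workers)]
    remote_messages = 0

    while any(active):
        # Send phase: active vertices send their label along every edge.
        inboxes = [defaultdict(list) for _ in range(num_workers)]
        for w in range(num_workers):
            for v in active[w]:
                for neighbour in adjacency[w][v]:
                    target = shard_of(neighbour, num_workers)
                    inboxes[target][neighbour].append(labels[w][v])
                    if target != w:
                        remote_messages += 1  # would be network traffic

        # Receive phase: a vertex adopts the smallest label it was sent and
        # stays active for the next round only if its label changed.
        active = [set() for _ in range(num_workers)]
        for w in range(num_workers):
            for v, received in inboxes[w].items():
                best = min(received)
                if best < labels[w][v]:
                    labels[w][v] = best
                    active[w].add(v)

    merged = {}
    for part in labels:
        merged.update(part)
    return merged, remote_messages


if __name__ == "__main__":
    result, crossings = connected_components([(0, 1), (1, 2), (3, 4)], 6)
    print(result, crossings)
```

Even in this toy version, most messages cross worker boundaries because neighbouring vertices land on different shards. That is the part a real deployment has to solve with networking, partitioning strategy and fault tolerance, and it is why the choice of sharding scheme matters so much.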
An alternative approach might be to start with AllegroGraph or Stardog. AllegroGraph in particular might be a bit tricky in the beginning, until you get used to their way of thinking. Stardog is still young and written in Java, but it is fast and already quite mature.