Can an RDF store also be used as a document-oriented database?
Suppose I have a large number of heterogeneous JSON documents (i.e. named key-value mappings) and a hierarchy of classes (i.e. named sets) that these documents are attached to. I need to set up a data structure that will allow:
1. CRUD operations on JSON documents.
2. Retrieving JSON documents by ID really quickly.
3. Retrieving all JSON documents that are attached to a certain class really quickly.
4. Editing the class hierarchy: adding/deleting classes, rearranging them.
I initially came up with the idea of storing the JSON documents in a document-oriented database (like CouchDB or MongoDB) and the class hierarchy in an RDF store (like 4store). Requirements 1, 2 and 4 then work out naturally, and 3 is solved by maintaining a list of attached document IDs for every class in the store.
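To make the layout described above concrete, here is a minimal in-memory sketch in Python. It is not tied to CouchDB, MongoDB or 4store; the class `ClassIndex` and all of its method names are hypothetical, and the choice to include documents attached to subclasses when querying a class is an assumption, since the question does not say either way.

```python
from collections import defaultdict


class ClassIndex:
    """Toy model: documents keyed by ID, plus a class tree with attached IDs."""

    def __init__(self):
        self.docs = {}                    # document ID -> JSON-like dict
        self.parent = {}                  # class name -> parent class name or None
        self.members = defaultdict(set)   # class name -> IDs attached directly

    def add_class(self, name, parent=None):
        self.parent[name] = parent

    def move_class(self, name, new_parent):
        # Rearranging the hierarchy only touches the parent pointer.
        self.parent[name] = new_parent

    def put(self, doc_id, doc, classes=()):
        # Create or update a document and attach it to the given classes.
        self.docs[doc_id] = doc
        for cls in classes:
            self.members[cls].add(doc_id)

    def get(self, doc_id):
        # Retrieval by ID is a single dictionary lookup.
        return self.docs[doc_id]

    def delete(self, doc_id):
        self.docs.pop(doc_id, None)
        for ids in self.members.values():
            ids.discard(doc_id)

    def descendants(self, name):
        # Collect the class itself and every class below it.
        result = {name}
        stack = [name]
        while stack:
            current = stack.pop()
            for child, parent in self.parent.items():
                if parent == current and child not in result:
                    result.add(child)
                    stack.append(child)
        return result

    def docs_in_class(self, name, include_subclasses=True):
        # Assumption: a class query also returns documents of its subclasses.
        classes = self.descendants(name) if include_subclasses else {name}
        ids = set()
        for cls in classes:
            ids |= self.members.get(cls, set())
        return {doc_id: self.docs[doc_id] for doc_id in sorted(ids)}
```

In a real deployment the `docs` dictionary would correspond to the document database and the `parent` and `members` mappings to the class store, so every write has to keep the two in sync.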
But then I figured that an RDF store could actually do the document-oriented part of retrieving JSON documents by ID. At first glance this seems true, but I'm still concerned about 2 and 3. Is there an RDF store that can retrieve documents (nodes) as fast as document-oriented databases serve documents? How fast will it serve queries like 3? I've heard a little about RDF stores being slow, the reification problem, etc.
Is there an RDF store that is also as comfortable for casually retrieving objects by ID as CouchDB, for example? What is the difference between using a document-oriented store and an RDF store for storing, retrieving and editing JSON-like objects?
Comments (2)
You originally asked this question for graph databases (like Neo4j). That's why I'd like to add some notes.
The closest thing you can use in RDF databases is named graphs. A named graph holds a set of RDF triples. This set of triples can be asserted from one or many RDF documents, depending on your needs. Let's say you want one named graph per RDF document. You could name the graph with a URI (a URL or an IRI) that reflects the file location. For instance ... or ...
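As an illustration of the naming idea, the following Python sketch derives a graph IRI either from a file location or from a document ID. The helper names, the example path and the base `http://example.org/graphs/` are all hypothetical; they are not taken from the answer or from 4store.

```python
from urllib.parse import quote


def graph_iri_for_file(path):
    """Build a file: IRI from an absolute POSIX-style path (hypothetical helper)."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path")
    # quote() keeps "/" as-is and percent-encodes spaces and other unsafe characters.
    return "file://" + quote(path)


def graph_iri_for_id(doc_id, base="http://example.org/graphs/"):
    """Build an http: IRI from a document ID under an assumed base IRI."""
    # safe="" also encodes "/" so the ID stays a single path segment.
    return base + quote(doc_id, safe="")


print(graph_iri_for_file("/data/docs/doc1.json"))
print(graph_iri_for_id("42"))
```

Either scheme gives one stable graph name per document, which is what makes a lookup by ID possible later on.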
4store is a quad store. Quad stores support named graphs, and 4store is specifically designed to handle them.
With 4store you can run the following command to assert triples in a named graph:
After /data/ you can put the graph identifier (IRI) where the triples are going to be asserted. See 4store sparql server and 4store Client Libs for more details.
Once you have your data asserted, with SPARQL you can also use the named graph to direct your query to that graph:
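A rough Python sketch of both steps follows. It assumes a 4store HTTP server listening at `http://localhost:8000` (a hypothetical address) that accepts an HTTP PUT on `/data/` followed by the graph IRI and serves SPARQL at `/sparql/`. The function names and the example graph IRI are made up for illustration, and the code only builds the requests without sending them.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://localhost:8000"  # assumption: local 4store HTTP server


def put_graph_request(graph_iri, rdf_bytes, content_type="application/rdf+xml"):
    """Build a PUT request that asserts rdf_bytes into the named graph."""
    # The graph IRI is appended directly after /data/.
    url = f"{ENDPOINT}/data/{graph_iri}"
    return Request(url, data=rdf_bytes, method="PUT",
                   headers={"Content-Type": content_type})


def graph_query(graph_iri):
    """Return a SPARQL query restricted to one named graph."""
    return f"SELECT ?s ?p ?o WHERE {{ GRAPH <{graph_iri}> {{ ?s ?p ?o }} }}"


def sparql_request(query):
    """Build a POST request carrying the query as a form parameter."""
    body = urlencode({"query": query}).encode("ascii")
    return Request(f"{ENDPOINT}/sparql/", data=body, method="POST")


if __name__ == "__main__":
    iri = "http://example.org/graphs/42"  # hypothetical graph name
    print(put_graph_request(iri, b"<rdf/>").full_url)
    print(graph_query(iri))
    # To actually send a request: urllib.request.urlopen(sparql_request(...))
```

With one graph per document, the `GRAPH <...>` pattern plays the role of a lookup by ID: the query returns only the triples asserted in that graph.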
Moreover, 4store also supports JSON, so you can retrieve the SPARQL result set directly as JSON.
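As a sketch of what consuming such a result could look like, the Python snippet below flattens a result set in the standard SPARQL JSON results layout (`head.vars` and `results.bindings`) into plain dictionaries. The sample data and the function name `rows` are invented for illustration; how JSON output is requested from a given 4store version (for example through an `Accept: application/sparql-results+json` header) is an assumption to check against its documentation.

```python
import json

# Hypothetical response body in the SPARQL JSON results format.
SAMPLE = """
{
  "head": {"vars": ["s", "p", "o"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://example.org/doc/1"},
     "p": {"type": "uri", "value": "http://example.org/name"},
     "o": {"type": "literal", "value": "Rex"}}
  ]}
}
"""


def rows(result_text):
    """Turn a SPARQL JSON result set into a list of {variable: value} dicts."""
    data = json.loads(result_text)
    names = data["head"]["vars"]
    # A binding may omit unbound variables, so only copy the ones present.
    return [
        {name: binding[name]["value"] for name in names if name in binding}
        for binding in data["results"]["bindings"]
    ]


for row in rows(SAMPLE):
    print(row)
```

Regrouping the rows by subject and predicate is one way to rebuild a JSON-like object from the triples of a single graph.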
If you decide to use 4store you'll find valuable support here: http://4store.org/contact