Can an RDF store also be used as a document-oriented database?

Posted on 2024-12-19 03:46:11

Suppose I have a large number of heterogeneous JSON documents (i.e. named key-value mappings) and a hierarchy of classes (i.e. named sets) that these documents are attached to. I need to set up a data structure that allows:

  1. CRUD operations on JSON documents.
  2. Retrieving JSON documents by ID really quickly.
  3. Retrieving all JSON documents that are attached to a certain class really quickly.
  4. Editing class hierarchy: adding/deleting classes, rearranging them.

I initially came up with the idea of storing the JSON documents in a document-oriented database (like CouchDB or MongoDB) and the class hierarchy in an RDF store (like 4store). Requirements 1, 2 and 4 are then handled naturally, and 3 is solved by maintaining, for every class in the store, a list of the IDs of the documents attached to it.

But then I figured that an RDF store could actually handle the document-oriented part as well, i.e. retrieving JSON documents by ID. At first glance this seems true, but I'm still concerned about 2 and 3. Is there an RDF store that can retrieve documents (nodes) as fast as document-oriented databases serve documents? How fast will it serve queries like 3? I've heard a little about RDF stores being slow, the reification problem, etc.

Is there an RDF store that is as comfortable as, for example, CouchDB for casually retrieving objects by ID? What is the difference between using a document-oriented store and an RDF store for storing, retrieving and editing JSON-like objects?

Comments (2)

生活了然无味 2024-12-26 03:46:11

You originally asked this question for graph databases (like Neo4j). That's why I'd like to add some notes.

  1. Graph databases use integrated indexing for nodes (and relationships), so the fast initial lookup of your documents' root nodes is done via those indexes (external or in-graph).
  2. Additional in-graph indexes for paths (actually trees leading to the root) can be modelled more cleanly than a plain key-value lookup.
  3. If you model your documents as trees of nodes with properties, you can perform any simple or complex CRUD operation, including structural ones.
  4. Retrieving all documents of a "type" or "class" can again be done via an index (indexing root nodes by type) or via in-graph category nodes.
  5. You can put those "type" or "class" category nodes into a hierarchy (or graph), which can then be edited using the usual graph database API.
  6. Traversing the graph can be done using traversers or an integrated graph query language (e.g. Cypher for Neo4j).
  7. Loading hierarchical data can be done either by custom importers or by a more general sub-graph importer (e.g. GEOFF).
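
To make points 1, 4 and 5 a bit more concrete, here is a minimal in-memory sketch in Python. It is not Neo4j's API: the `DocumentGraph` class and all of its method names are hypothetical, and plain dictionaries stand in for the node index and the category nodes. It only illustrates the shape of the model: an ID index for root lookups, category nodes holding attached document IDs, and a traversal over the class subtree.

```python
from collections import defaultdict


class DocumentGraph:
    """Toy in-memory stand-in for a graph database (hypothetical, not Neo4j's API)."""

    def __init__(self):
        self.docs = {}                    # index: document ID -> document (root node)
        self.parent = {}                  # class name -> parent class name (or None)
        self.children = defaultdict(set)  # class name -> direct subclasses
        self.members = defaultdict(set)   # category node -> attached document IDs

    def add_class(self, name, parent=None):
        self.parent[name] = parent
        if parent is not None:
            self.children[parent].add(name)

    def move_class(self, name, new_parent):
        """Rearrange the hierarchy by re-attaching a class under a new parent."""
        old = self.parent.get(name)
        if old is not None:
            self.children[old].discard(name)
        self.parent[name] = new_parent
        if new_parent is not None:
            self.children[new_parent].add(name)

    def put(self, doc_id, doc, cls):
        self.docs[doc_id] = doc
        self.members[cls].add(doc_id)

    def get(self, doc_id):
        # Requirement 2: a single index lookup by ID.
        return self.docs[doc_id]

    def by_class(self, cls):
        """Requirement 3: walk the class subtree and collect attached document IDs."""
        found, seen, stack = set(), set(), [cls]
        while stack:
            current = stack.pop()
            if current in seen:  # guard against accidental cycles
                continue
            seen.add(current)
            found |= self.members[current]
            stack.extend(self.children[current])
        return found


g = DocumentGraph()
g.add_class("animal")
g.add_class("dog", parent="animal")
g.put("d1", {"name": "Rex"}, "dog")
print(g.get("d1"), g.by_class("animal"))
```

In a real graph database the two dictionaries would be replaced by the built-in node index and by relationships between category nodes, but the access pattern stays the same.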

药祭#氼 2024-12-26 03:46:11

The closest thing you can use in RDF databases is a named graph. In a named graph you can put a set of RDF triples. This set of triples can be asserted from one or many RDF documents, depending on your needs. Let's say you want one named graph per RDF document. You could name the graph with a URI (a URL or an IRI) that reflects the file location. For instance:

http://yourdomain/files/rdf_file_1

or

file:///home/myrdffiles/file1

4store is a quad store. Quad stores support named graphs, and 4store is specifically designed to handle this.

With 4store you can run the following command to assert triples in a named graph:

curl -T your_file.rdf http://your_4store_database/data/http://yourdomain/files/rdf_file_1

After /data/ you put the graph identifier (IRI) of the graph in which the triples are going to be asserted. See 4store SPARQL server and 4store Client Libs for more details.

Once your data is asserted, you can also use the named graph in SPARQL to direct your query to that graph:
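
For anyone scripting this rather than using curl, the same upload can be sketched in Python with only the standard library. `curl -T` issues an HTTP PUT, so the sketch builds a PUT request whose URL is the endpoint, then /data/, then the graph IRI, exactly as in the command above. The helper name `graph_upload_request` is hypothetical, and the `application/rdf+xml` content type is an assumption on my side (the curl command does not set one).

```python
import urllib.request


def graph_upload_request(endpoint, graph_iri, rdf_bytes,
                         content_type="application/rdf+xml"):
    """Build (but do not send) a PUT request that targets one named graph."""
    # Same URL layout as the curl command: <endpoint>/data/<graph IRI>
    url = endpoint.rstrip("/") + "/data/" + graph_iri
    return urllib.request.Request(
        url,
        data=rdf_bytes,
        method="PUT",
        headers={"Content-Type": content_type},  # assumed, not from the curl example
    )


req = graph_upload_request(
    "http://your_4store_database",
    "http://yourdomain/files/rdf_file_1",
    b"<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'/>",
)
print(req.get_method(), req.full_url)
# Sending is a separate step: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the URL construction easy to check without a running server.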

SELECT * WHERE {
   GRAPH <http://yourdomain/files/rdf_file_1> {
        .... some triple patterns in here ....
   }
}
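
If you keep one named graph per document, you will probably end up generating such queries from a graph IRI. A minimal sketch of that in Python follows; the function name `select_from_graph` and the default triple pattern are hypothetical, and the character check is only a rough guard against IRIs that would break out of the `<...>` delimiters, not a full IRI validator.

```python
def select_from_graph(graph_iri, patterns="?s ?p ?o ."):
    """Return a SPARQL SELECT query restricted to one named graph."""
    # Rough guard: these characters may not appear inside <...> in SPARQL.
    if any(ch in graph_iri for ch in '<>"{}|^`\\ '):
        raise ValueError("graph IRI contains characters not allowed inside <...>")
    return (
        "SELECT * WHERE {\n"
        f"  GRAPH <{graph_iri}> {{\n"
        f"    {patterns}\n"
        "  }\n"
        "}"
    )


print(select_from_graph("http://yourdomain/files/rdf_file_1"))
```

With the graph IRI acting as the document ID, "retrieve by ID" becomes "select everything in that graph".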

Moreover, 4store also supports JSON, so you can retrieve the SPARQL result set directly as JSON.
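
As a rough illustration of what consuming such a result looks like, the sketch below flattens a result set into plain Python dictionaries. It assumes the response follows the W3C SPARQL Query Results JSON Format (`head.vars` plus `results.bindings`); the sample payload, its URIs and the helper name `rows_from_sparql_json` are made up for the example.

```python
import json

# Hypothetical sample in the W3C SPARQL JSON results layout.
SAMPLE = """
{
  "head": {"vars": ["s", "name"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://yourdomain/docs/1"},
     "name": {"type": "literal", "value": "first"}},
    {"s": {"type": "uri", "value": "http://yourdomain/docs/2"}}
  ]}
}
"""


def rows_from_sparql_json(text):
    """Turn a SPARQL JSON result set into a list of {variable: value} dicts."""
    payload = json.loads(text)
    variables = payload["head"]["vars"]
    return [
        # Unbound variables are simply absent from a binding; map them to None.
        {var: (binding[var]["value"] if var in binding else None)
         for var in variables}
        for binding in payload["results"]["bindings"]
    ]


for row in rows_from_sparql_json(SAMPLE):
    print(row)
```

Note that what comes back is a table of variable bindings, not the original JSON document, so reassembling a document from its triples is still up to the application.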

If you decide to use 4store you'll find valuable support here: http://4store.org/contact
