Storing a directory hierarchy in a key-value data store

Posted 2024-08-08 18:39:20

What is a clean and efficient way to store a directory hierarchy (tree) in a key-value database? I am using MongoDB, but the question applies to any of them.

For example, take this tree structure:

- Cars 
   + Audi 
   + BMW
      - M5
   + Ford
- Color
   + Red
      - Apple
      - Cherry
   + Purple
- Funny

The method I am using now has each object link to its parent:

{
  dir: "red",
  "parent-dir": "color"
}

This makes it very efficient to insert nodes and to rearrange any part of the tree (for example, moving Red and all its children to the Cars directory).
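A minimal sketch of why the move is cheap with parent links, using a plain in-memory array as a stand-in for the collection. The helper names `moveDir` and `childrenOf` are made up for illustration and are not part of any database API.

```javascript
// Hypothetical in-memory stand-in for a collection of directory documents.
// Field names follow the question; the helper names are illustrative only.
const dirs = [
  { dir: "cars", "parent-dir": null },
  { dir: "color", "parent-dir": null },
  { dir: "red", "parent-dir": "color" },
  { dir: "apple", "parent-dir": "red" },
  { dir: "cherry", "parent-dir": "red" },
];

// Moving a subtree only rewrites the parent link of its root document.
function moveDir(docs, name, newParent) {
  const doc = docs.find((d) => d.dir === name);
  if (!doc) throw new Error(`no such directory: ${name}`);
  doc["parent-dir"] = newParent;
  return doc;
}

// Direct children are a single equality lookup on "parent-dir".
function childrenOf(docs, name) {
  return docs.filter((d) => d["parent-dir"] === name).map((d) => d.dir);
}

moveDir(dirs, "red", "cars");
console.log(childrenOf(dirs, "cars")); // [ 'red' ]
console.log(childrenOf(dirs, "red")); // [ 'apple', 'cherry' ]
```

Apple and Cherry are never touched: they still point at Red, so they follow it to its new location.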

But this method performs poorly when I want to fetch all subdirectories and their children for a given directory recursively. To make that traversal efficient, I could instead use a structure such as:

{
  dir: "cars",
  children: "audi, bmw, ford"
}

{
  dir: "bmw",
  children: "m5"
}

But then modifying the tree means a whole bunch of objects need to be touched and modified.
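A rough sketch of that trade-off, assuming the child list is stored as an array rather than the comma-separated string shown above, and using a `Map` as a hypothetical stand-in for the key-value store. The function names are illustrative.

```javascript
// Hypothetical documents keyed by directory name; children are stored as
// arrays here (an assumption; the question stores a comma-separated string).
const store = new Map([
  ["cars", { dir: "cars", children: ["audi", "bmw", "ford"] }],
  ["audi", { dir: "audi", children: [] }],
  ["bmw", { dir: "bmw", children: ["m5"] }],
  ["m5", { dir: "m5", children: [] }],
  ["ford", { dir: "ford", children: [] }],
  ["color", { dir: "color", children: [] }],
]);

// Depth-first walk: one key lookup per visited directory.
function descendants(db, name) {
  const result = [];
  for (const child of db.get(name).children) {
    result.push(child, ...descendants(db, child));
  }
  return result;
}

// Moving a directory rewrites the child lists of both the old and new parent.
function move(db, name, oldParent, newParent) {
  const from = db.get(oldParent);
  from.children = from.children.filter((c) => c !== name);
  db.get(newParent).children.push(name);
}

console.log(descendants(store, "cars")); // [ 'audi', 'bmw', 'm5', 'ford' ]
move(store, "bmw", "cars", "color");
console.log(descendants(store, "color")); // [ 'bmw', 'm5' ]
```

Reads follow keys directly, but every structural change has to keep at least two parent documents in sync.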

Are there any other methods for storing a directory structure in a KV store?

Comments (4)

丶情人眼里出诗心の 2024-08-15 18:39:20

The method you currently use is called the adjacency list model.

Another model to store hierarchical data in a (relational) database is the nested set model. Its implementation in SQL databases is well known. Also see this article for the modified preorder tree traversal algorithm.
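To make the nested set idea concrete, here is a small sketch that numbers a tree by modified preorder traversal and then reads a subtree with a range comparison. The `lft`/`rgt` field names are a common convention rather than something this answer specifies, and the input tree and function names are hypothetical.

```javascript
// Hypothetical input tree; lft/rgt are conventional names, not from the answer.
const tree = {
  name: "Color",
  children: [
    {
      name: "Red",
      children: [
        { name: "Apple", children: [] },
        { name: "Cherry", children: [] },
      ],
    },
    { name: "Purple", children: [] },
  ],
};

// Modified preorder traversal: number each node on the way down (lft)
// and again on the way back up (rgt).
function assignBounds(node, counter = { n: 1 }, out = []) {
  const row = { name: node.name, lft: counter.n++, rgt: 0 };
  out.push(row);
  for (const child of node.children) assignBounds(child, counter, out);
  row.rgt = counter.n++;
  return out;
}

// A node's descendants are the rows nested strictly inside its (lft, rgt) range.
function descendantsOf(rows, name) {
  const parent = rows.find((r) => r.name === name);
  return rows
    .filter((r) => r.lft > parent.lft && r.rgt < parent.rgt)
    .map((r) => r.name);
}

const rows = assignBounds(tree);
console.log(rows[0]); // { name: 'Color', lft: 1, rgt: 10 }
console.log(descendantsOf(rows, "Red")); // [ 'Apple', 'Cherry' ]
```

Subtree reads become a single range query, but inserting or moving a node means renumbering every row to its right, which is the opposite trade-off from the adjacency list.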

A very simple method: you could store a path per object. With those, it should be easy to query trees in NoSQL databases:

{ path: "Color", ... }
{ path: "Color.Red", ... }
{ path: "Color.Red.Apple", ... }
{ path: "Color.Red.Cherry", ... }

When nodes are removed or renamed, some paths must be updated. But in general, this method looks promising. You just have to reserve a special character as the separator. The storage space overhead should be negligible.

Edit: this method is called materialized path.
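A minimal sketch of the materialized path approach over an in-memory array, assuming "." as the reserved separator. The helpers `subtree` and `moveSubtree` are hypothetical names, and the prefix test stands in for whatever prefix query the database offers.

```javascript
// Hypothetical in-memory documents; "." is the reserved separator.
const docs = [
  { path: "Cars" },
  { path: "Color" },
  { path: "Color.Red" },
  { path: "Color.Red.Apple" },
  { path: "Color.Red.Cherry" },
  { path: "Color.Purple" },
];

// All descendants share the ancestor's path plus the separator as a prefix.
function subtree(all, path) {
  return all.filter((d) => d.path.startsWith(path + ".")).map((d) => d.path);
}

// Moving a subtree rewrites the prefix of the node and of every descendant.
function moveSubtree(all, oldPath, newParentPath) {
  const name = oldPath.split(".").pop();
  const newPath = newParentPath ? `${newParentPath}.${name}` : name;
  for (const d of all) {
    if (d.path === oldPath || d.path.startsWith(oldPath + ".")) {
      d.path = newPath + d.path.slice(oldPath.length);
    }
  }
}

console.log(subtree(docs, "Color.Red"));
// [ 'Color.Red.Apple', 'Color.Red.Cherry' ]
moveSubtree(docs, "Color.Red", "Cars");
console.log(subtree(docs, "Cars"));
// [ 'Cars.Red', 'Cars.Red.Apple', 'Cars.Red.Cherry' ]
```

In MongoDB the same prefix test could probably be written as an anchored regular expression on the `path` field, such as `/^Color\.Red\./`. Including the separator in the prefix keeps "Color.Red" from matching an unrelated sibling like "Color.Reddish".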

Finally, here is a comparison of different methods for hierarchical data in NoSQL databases.

肤浅与狂妄 2024-08-15 18:39:20

I don't have a huge amount of NoSQL experience, so this isn't a definitive answer, but here's how I'd approach it:

I would likely use your first approach, where you have:

{
  dir: 'dir_name',
  parent_dir: 'parent_dir_name'
}

And then set up a map-reduce to quickly query the children of a directory. MongoDB's map-reduce functionality is still only available in the development branch and I haven't worked with it yet, but in CouchDB (and, I assume, with a few modifications, in MongoDB) you could do something like:

map:
function(doc) {
  emit( doc.parent_dir, doc.dir );
}

reduce:
function(key, values) {
  return( values );
}

Which would give you the list of sub-directories for each parent directory.
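To see what that view would produce, the map and reduce steps can be simulated in plain JavaScript. This is only a stand-in for the database's view engine: `runMapReduce` is a hypothetical helper, and `emit` is passed in as a parameter here rather than being a global supplied by the database.

```javascript
// Hypothetical documents; field names follow the answer above.
const documents = [
  { dir: "audi", parent_dir: "cars" },
  { dir: "bmw", parent_dir: "cars" },
  { dir: "m5", parent_dir: "bmw" },
  { dir: "red", parent_dir: "color" },
];

// Same logic as the map function above, except that emit is passed in
// explicitly instead of being provided by the database.
function map(doc, emit) {
  emit(doc.parent_dir, doc.dir);
}

function reduce(key, values) {
  return values;
}

// Minimal stand-in for a view engine: run map over every document,
// group the emitted values by key, then reduce each group.
function runMapReduce(input) {
  const groups = new Map();
  for (const doc of input) {
    map(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const result = {};
  for (const [key, values] of groups) result[key] = reduce(key, values);
  return result;
}

console.log(runMapReduce(documents));
// { cars: [ 'audi', 'bmw' ], bmw: [ 'm5' ], color: [ 'red' ] }
```

Note that this only yields direct children per parent. Collecting a whole subtree would still take one lookup per level.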

与君绝 2024-08-15 18:39:20

I suggest storing a heap of the IDs of the data items.
I think this is the best plan. If you need lots and lots of items, any heap element could be an index to another heap.

e.g.

{ "id:xxx", "id:yyy", "sub-heap-id:zzz"....}

If this is not clear, post a comment and I will explain more when I get home.
