Database structure and querying hierarchical data and data trees

Posted on 2024-10-07 07:28:35

Possible Duplicate:
What is the most efficient/elegant way to parse a flat table into a tree?

I am finding this rather tricky and would like some opinions on the matter.
I am trying to store hierarchical (tree-like) data with an unknown number of levels and branches. I want to be able to add new nodes and delete any node at any time.

Because of a large user base, I need to be able to query, from any node in the hierarchy, for the ids of all of its children (at every level below it) in one go, and efficiently.

Let's take a hypothetical example of a website where families socialise and update their status, as on Facebook. At any time you can view a family member's "Wall", which also includes all of the recent status updates from the people below them in the hierarchy, in chronological order.

Obviously, once you have the array of ids of the family members who are children of this family member's node, fetching the posts in a loop is easy enough.

Let's take a simple example table structure:

id  |  parentId  |  name
________________________

1   |    NULL    |  John
2   |     1      |  Peter
3   |     1      |  Bob
4   |     3      |  Emma
5   |     2      |  Sam
6   |     4      |  Gill

etc. You get the idea.

I need to be able to do the above with something like this, unless you think the structure needs to be adapted.
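
A minimal sketch of fetching all descendant ids from the `id | parentId | name` table above in a single query, using a recursive common table expression. It runs against an in-memory SQLite database through Python's `sqlite3` module purely so that it is self-contained; the table name `member` and the helper `descendant_ids` are assumptions made for illustration. MySQL accepts the same `WITH RECURSIVE` syntax only from version 8.0 onward, so this may not apply to older installations.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE member (id INTEGER PRIMARY KEY, parentId INTEGER, name TEXT)"
)
conn.executemany(
    "INSERT INTO member (id, parentId, name) VALUES (?, ?, ?)",
    [
        (1, None, "John"),
        (2, 1, "Peter"),
        (3, 1, "Bob"),
        (4, 3, "Emma"),
        (5, 2, "Sam"),
        (6, 4, "Gill"),
    ],
)


def descendant_ids(conn, node_id):
    """Return the ids of every node below node_id using one recursive query."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id) AS (
            -- Anchor: direct children of the starting node.
            SELECT id FROM member WHERE parentId = ?
            UNION ALL
            -- Recursive step: children of nodes already found.
            SELECT m.id FROM member AS m JOIN subtree AS s ON m.parentId = s.id
        )
        SELECT id FROM subtree ORDER BY id
        """,
        (node_id,),
    ).fetchall()
    return [row[0] for row in rows]


print(descendant_ids(conn, 3))  # [4, 6]
print(descendant_ids(conn, 1))  # [2, 3, 4, 5, 6]
```

The table itself stays as simple as in the question; the cost is that the database walks the tree level by level at read time.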

I have read up on the MySQL nested set model.
It seems very fiddly, and it could be unreliable: if something failed to update correctly, it would mess everything up.
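
For reference, a rough sketch of what the nested set model looks like for the same six people, assuming the usual `lft`/`rgt` numbering from a depth-first walk. The table name `nested` and both helper functions are hypothetical, and SQLite is used only to keep the example runnable on its own. Reading a subtree is one range query, but inserting one member rewrites the boundaries of every row to its right, which is the fiddly part mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nested (id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER)"
)
# lft/rgt values from a depth-first walk of the example tree.
conn.executemany(
    "INSERT INTO nested (id, name, lft, rgt) VALUES (?, ?, ?, ?)",
    [
        (1, "John", 1, 12),
        (2, "Peter", 2, 5),
        (5, "Sam", 3, 4),
        (3, "Bob", 6, 11),
        (4, "Emma", 7, 10),
        (6, "Gill", 8, 9),
    ],
)


def descendant_ids(conn, node_id):
    """Descendants are the rows whose interval lies strictly inside the node's."""
    rows = conn.execute(
        """
        SELECT c.id
        FROM nested AS c
        JOIN nested AS p ON c.lft > p.lft AND c.rgt < p.rgt
        WHERE p.id = ?
        ORDER BY c.lft
        """,
        (node_id,),
    ).fetchall()
    return [row[0] for row in rows]


def add_child(conn, parent_id, new_id, name):
    """Insert a new last child; every boundary at or after the parent's rgt moves."""
    with conn:  # single transaction, so a failure cannot leave half-shifted rows
        (rgt,) = conn.execute(
            "SELECT rgt FROM nested WHERE id = ?", (parent_id,)
        ).fetchone()
        conn.execute("UPDATE nested SET rgt = rgt + 2 WHERE rgt >= ?", (rgt,))
        conn.execute("UPDATE nested SET lft = lft + 2 WHERE lft > ?", (rgt,))
        conn.execute(
            "INSERT INTO nested (id, name, lft, rgt) VALUES (?, ?, ?, ?)",
            (new_id, name, rgt, rgt + 1),
        )


print(descendant_ids(conn, 3))  # [4, 6]
```

Wrapping the shifts and the insert in one transaction addresses the "something fails to update" worry to a degree, though the write cost still grows with the size of the tree.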

I am used to using PHP and MySQL, but have been reading a bit about Cassandra and Thrift. Would that be any easier?


Comments (3)

晨光如昨 2024-10-14 07:28:35

There are already good approaches out there which are simpler than the solution you propose.

Here are a couple of links which explain how to do it (we use this ourselves for much the same problem you describe and it works well).

This makes inserting/updating more complex, but selecting portions of the tree structure far faster (with only one query). It allows finding all children of any given node in one query, and finding all the ancestors of a given node with one query.

送你一个梦 2024-10-14 07:28:35

So I think I have come up with an idea.

The reason I am against the nested set model is that it still does not seem like the best approach, and it is not going to be the ideal solution for performance.

I am going to cover a proposed solution I have been thinking about.
The concept is to create a hierarchical map table that keeps track of all the relationships between each family member/node.

The way it would work is:

Using a map table structured like this:

id  |  fMemberId   |   parentid
=====================================
1   |      3       |       2  
2   |      4       |       3
3   |      4       |       2

1) When a new family member is created as a child of a parent, we take the parent's id and create a new row in our family members table, with the parent id set for future uses and functionality.

2) When this row is created, we also create new rows in the map table pairing the new family member with each of its parent ids (its parent, its parent's parent, and so on).

A quick way to do this would be to take the parent id of the new family member and query the map table for all rows whose family member id equals that parent id, then store the resulting parent ids in a PHP array so they can be inserted into the map table alongside the new family member's id. This would need only one SQL query to fetch all the parent ids, rather than a number of queries that grows with the number of nodes.

This would mean that when we view a family member's feed of posts, we could simply query the map table for the rows that give all the children ids of the current family member, and then query other tables for the post data.
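
A sketch of that read path, under some loud assumptions: the `post` table, its columns, the index name, and the helper `wall_post_ids` are all invented for illustration, the member's own posts are included, and "chronological" is taken to mean newest first. Instead of fetching the child ids and looping, the map lookup is used as a subquery so the feed comes back from one statement. SQLite again stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE map (fMemberId INTEGER NOT NULL, parentid INTEGER NOT NULL);
    -- Index so that "all members below X" is a single index range scan.
    CREATE INDEX map_by_parent ON map (parentid, fMemberId);

    -- Hypothetical posts table; the thread does not define one.
    CREATE TABLE post (
        id INTEGER PRIMARY KEY,
        memberId INTEGER NOT NULL,
        createdAt TEXT NOT NULL,
        body TEXT NOT NULL
    );

    INSERT INTO map (fMemberId, parentid) VALUES (3, 2), (4, 3), (4, 2);
    INSERT INTO post (id, memberId, createdAt, body) VALUES
        (1, 2, '2024-01-01 09:00:00', 'status from member 2'),
        (2, 3, '2024-01-02 09:00:00', 'status from member 3'),
        (3, 4, '2024-01-03 09:00:00', 'status from member 4'),
        (4, 9, '2024-01-04 09:00:00', 'status from an unrelated member');
    """
)


def wall_post_ids(conn, member_id, limit=20):
    """Ids of posts by member_id and by everyone below them, newest first."""
    rows = conn.execute(
        """
        SELECT p.id
        FROM post AS p
        WHERE p.memberId = ?
           OR p.memberId IN (SELECT fMemberId FROM map WHERE parentid = ?)
        ORDER BY p.createdAt DESC
        LIMIT ?
        """,
        (member_id, member_id, limit),
    ).fetchall()
    return [row[0] for row in rows]


print(wall_post_ids(conn, 3))  # [3, 2]
print(wall_post_ids(conn, 2))  # [3, 2, 1]
```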

The main trade-off is the amount of storage this kind of system could require.
However, I believe reads would be quicker, as there are no conditional SQL statements, and writes to the db might be just as quick this way.

We could mitigate this by using InnoDB's clustered index keyed on an initial family id, together with a new table that holds the "next family member id" for each family id.

There is also reliability: if a row wasn't written, it would be easy enough to add it in. It also avoids having to continually edit existing rows just to create a member.

What are your thoughts on this?

So far this seems to be a good way, in my opinion. It took a lot of thinking to get here. I also believe it could be improved with time, perhaps by storing arrays of ids per member rather than all of them. Still trying to work that one out!

大姐,你呐 2024-10-14 07:28:35

Yes, your solution is called a transitive closure. I have written about it before:

You also need the zero-length paths, e.g. 2-2, 3-3, 4-4.
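
One plausible reading of why the zero-length paths help, sketched with the same `fMemberId`/`parentid` columns (the helper names are hypothetical, and SQLite stands in for MySQL): once every node also has a row pointing at itself, adding a member becomes a single `INSERT ... SELECT` that copies the parent's rows, and "this member plus everyone below" is one lookup with no special case for the member itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE map (
        fMemberId INTEGER NOT NULL,
        parentid INTEGER NOT NULL,
        PRIMARY KEY (parentid, fMemberId)
    )
    """
)
# Zero-length paths (2-2, 3-3, 4-4) plus the three rows from the earlier post.
conn.executemany(
    "INSERT INTO map (fMemberId, parentid) VALUES (?, ?)",
    [(2, 2), (3, 3), (4, 4), (3, 2), (4, 3), (4, 2)],
)


def add_member(conn, new_id, parent_id):
    """Copy the parent's paths (including its self-row) and add the new self-row."""
    with conn:
        conn.execute(
            """
            INSERT INTO map (fMemberId, parentid)
            SELECT ?, parentid FROM map WHERE fMemberId = ?
            UNION ALL
            SELECT ?, ?
            """,
            (new_id, parent_id, new_id, new_id),
        )


def subtree_ids(conn, node_id):
    """The node itself and everything below it."""
    rows = conn.execute(
        "SELECT fMemberId FROM map WHERE parentid = ? ORDER BY fMemberId",
        (node_id,),
    ).fetchall()
    return [row[0] for row in rows]


def ancestor_ids(conn, node_id):
    """The node itself and everything above it."""
    rows = conn.execute(
        "SELECT parentid FROM map WHERE fMemberId = ? ORDER BY parentid",
        (node_id,),
    ).fetchall()
    return [row[0] for row in rows]


add_member(conn, 5, 4)
print(subtree_ids(conn, 3))   # [3, 4, 5]
print(ancestor_ids(conn, 5))  # [2, 3, 4, 5]
```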
