Hierarchical tagging in SQL

Posted on 2024-07-08 10:31:54

I have a PHP web application which uses a MySQL database for object tagging, in which I've used the tag structure accepted as the answer to this SO question.

I'd like to implement a tag hierarchy, where each tag can have a unique parent tag. Searches for a parent tag T would then match T and all descendants of T (i.e. T itself, tags whose parent is T (children of T), grandchildren of T, etc.).

The easiest way of doing this seems to be to add a ParentID field to the tag table, which contains the ID of a tag's parent tag, or some magic number if the tag has no parent. Searching for descendants, however, then requires repeated full searches of the database to find the tags in each 'generation', which I'd like to avoid.
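To make the cost concrete, here is a minimal sketch of the ParentID layout and the generation-by-generation search it forces. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs on its own; the table and column names (`tags`, `parent_id`), the sample rows, and the use of NULL instead of a magic number are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: parent_id is NULL for a tag without a parent.
conn.execute(
    "CREATE TABLE tags ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " parent_id INTEGER REFERENCES tags(id))"
)
conn.executemany(
    "INSERT INTO tags VALUES (?, ?, ?)",
    [
        (1, "database", None),
        (2, "mysql", 1),
        (3, "mysql4", 2),
        (4, "oracle", 1),
        (5, "php", None),
    ],
)


def descendants_by_generation(conn, root_id):
    """Return root_id plus all descendant ids, one query per generation."""
    found = [root_id]
    frontier = [root_id]
    while frontier:
        marks = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT id FROM tags WHERE parent_id IN ({marks}) ORDER BY id",
            frontier,
        ).fetchall()
        frontier = [row[0] for row in rows]
        found.extend(frontier)
    return found


result = descendants_by_generation(conn, 1)
```

The loop issues one query per level of the tree, plus a final one that comes back empty, which is exactly the repeated searching described above.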

A (presumably) faster, but less normalised way of doing this would be to have a table containing all the children of each tag, or even all the descendants of each tag. This however runs the risk of inconsistent data in the database (e.g. a tag being the child of more than one parent).

Is there a good way to make queries to find descendants fast, while keeping the data as normalised as possible?

Comments (5)

你与昨日 2024-07-15 10:31:54

I implemented this using two columns. I've simplified it a little here: in my real schema the tag name lives in a separate field/table, because it has to be localized into different languages.

  • tag
  • path

Look at these rows for example:

tag            path
---            ----
database       database/
mysql          database/mysql/
mysql4         database/mysql/mysql4/
mysql4-1       database/mysql/mysql4-1/
oracle         database/oracle/
sqlserver      database/sqlserver/
sqlserver2005  database/sqlserver/sqlserver2005/
sqlserver2008  database/sqlserver/sqlserver2008/

etc.

Using the like operator on the path field you can easily get all needed tag rows:

SELECT * FROM tags WHERE path LIKE 'database/%'

There are some implementation details to handle: for example, when you move a node in the hierarchy you also have to update the paths of all its descendants. That is not hard, though.

Also make sure the path column is long enough. In my case I didn't build the path from the tag name but from another field, so that the paths don't get too long.
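The path approach, including the "move a node" detail, can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's actual schema: it runs against SQLite through Python's sqlite3 module, and the helper names `subtree` and `move_subtree` and the extra `legacy` row are made up for the example. In MySQL the path would be rebuilt with CONCAT() and SUBSTRING(), since `||` is not string concatenation there by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tags (tag TEXT PRIMARY KEY, path TEXT NOT NULL UNIQUE)")
conn.executemany(
    "INSERT INTO tags VALUES (?, ?)",
    [
        ("database", "database/"),
        ("mysql", "database/mysql/"),
        ("mysql4", "database/mysql/mysql4/"),
        ("oracle", "database/oracle/"),
        ("legacy", "legacy/"),  # hypothetical second root, used as a move target
    ],
)


def subtree(conn, path):
    """Return the tag at `path` and every tag below it, ordered by path.

    Assumes path components contain no LIKE wildcards ('%' or '_').
    """
    rows = conn.execute(
        "SELECT tag FROM tags WHERE path LIKE ? ORDER BY path", (path + "%",)
    )
    return [row[0] for row in rows]


def move_subtree(conn, old_prefix, new_prefix):
    """Re-root a node and all its descendants by swapping the path prefix."""
    with conn:  # one transaction, so the subtree never ends up half-moved
        conn.execute(
            "UPDATE tags SET path = ? || substr(path, ?) WHERE path LIKE ?",
            (new_prefix, len(old_prefix) + 1, old_prefix + "%"),
        )


before = subtree(conn, "database/")
move_subtree(conn, "database/mysql/", "legacy/mysql/")
after = subtree(conn, "database/")
moved = subtree(conn, "legacy/")
```

A prefix search such as `LIKE 'database/%'` can use an ordinary index on the path column, which is the main attraction of this layout; the price is that moving a node rewrites one row per descendant.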

诗化ㄋ丶相逢 2024-07-15 10:31:54

Ali's answer has a link to Joe Celko's Trees and Hierarchies in SQL for Smarties, which confirms my suspicion - there isn't a simple database structure that offers the best of all worlds. The best for my purpose seems to be the "Frequent Insertion Tree" detailed in this book, which is like the "Nested Set Model" of Ali's link, but with non-consecutive indexing. This allows O(1) insertion (a la unstructured BASIC line numbering), with occasional index reorganisation as and when needed.
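As a rough illustration of the idea (my own sketch, not the scheme as Celko's book presents it), a nested set with gaps in the numbering could look like this. It uses Python's sqlite3 module so it runs on its own; the column names `lft` and `rgt`, the starting numbers, and the gap-splitting rule in `insert_last_child` are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags ("
    " name TEXT PRIMARY KEY,"
    " lft INTEGER NOT NULL UNIQUE,"
    " rgt INTEGER NOT NULL UNIQUE)"
)
# Bounds are deliberately spaced out so new nodes fit without renumbering.
conn.executemany(
    "INSERT INTO tags VALUES (?, ?, ?)",
    [("database", 100, 900), ("mysql", 200, 400), ("oracle", 500, 600)],
)


def descendants(conn, name):
    """Return the named tag and everything nested inside its interval."""
    rows = conn.execute(
        "SELECT c.name FROM tags AS p"
        " JOIN tags AS c ON c.lft BETWEEN p.lft AND p.rgt"
        " WHERE p.name = ? ORDER BY c.lft",
        (name,),
    )
    return [row[0] for row in rows]


def insert_last_child(conn, parent, name):
    """Insert `name` into the free gap at the right end of `parent`."""
    p_lft, p_rgt = conn.execute(
        "SELECT lft, rgt FROM tags WHERE name = ?", (parent,)
    ).fetchone()
    # Largest bound already used inside the parent (or the parent's own lft).
    (last,) = conn.execute(
        "SELECT COALESCE(MAX(rgt), ?) FROM tags WHERE lft > ? AND rgt < ?",
        (p_lft, p_lft, p_rgt),
    ).fetchone()
    gap = p_rgt - last
    if gap < 3:
        raise RuntimeError("no gap left under this parent; renumber the tree")
    step = gap // 3
    with conn:
        conn.execute(
            "INSERT INTO tags VALUES (?, ?, ?)",
            (name, last + step, last + 2 * step),
        )


insert_last_child(conn, "mysql", "mysql4")
insert_last_child(conn, "database", "sqlserver")
all_database = descendants(conn, "database")
all_mysql = descendants(conn, "mysql")
```

An insert touches only the new row until the gap under a parent is used up; at that point the tree has to be renumbered, which is the occasional reorganisation mentioned above.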

奢欲 2024-07-15 10:31:54

You could build what Kimball calls a Hierarchy Helper Table.

Say your hierarchy looks like this: A -> B | B -> C | C -> D

You'd insert records into a table that looks like this:

ParentID, ChildID, Depth, Highest Flag, Lowest Flag
A, A, 0, Y, N
A, B, 1, N, N
A, C, 2, N, N
A, D, 3, N, Y
B, B, 0, N, N
B, C, 1, N, N
B, D, 2, N, Y
C, C, 0, N, N
C, D, 1, N, Y
D, D, 0, N, Y

I think I have that correct... anyway. The point is that you still store your hierarchy properly; you just build this table FROM your proper table. THIS table queries like a banshee. Say you want to know all the tags on the first level below B:

WHERE parentID = 'B' and Depth = 1
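One way to build such a helper table from the "proper" table is a recursive query. The sketch below is an assumption-laden illustration: it uses SQLite through Python's sqlite3 module, the names `tags`, `tag_helper` and `rebuild_helper` are made up, the source table is assumed to use a parent_id column, and the two flag columns are left out for brevity. MySQL 8.0 and later also support WITH RECURSIVE; older versions would need a loop in application code instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tags (id TEXT PRIMARY KEY, parent_id TEXT REFERENCES tags(id));
    INSERT INTO tags VALUES ('A', NULL), ('B', 'A'), ('C', 'B'), ('D', 'C');

    CREATE TABLE tag_helper (
        parent_id TEXT NOT NULL,
        child_id  TEXT NOT NULL,
        depth     INTEGER NOT NULL,
        PRIMARY KEY (parent_id, child_id)
    );
    """
)


def rebuild_helper(conn):
    """Recompute every (ancestor, descendant, depth) row from the tags table."""
    with conn:
        conn.execute("DELETE FROM tag_helper")
        conn.execute(
            """
            WITH RECURSIVE walk(parent_id, child_id, depth) AS (
                SELECT id, id, 0 FROM tags          -- each tag reaches itself
                UNION ALL
                SELECT walk.parent_id, tags.id, walk.depth + 1
                FROM walk JOIN tags ON tags.parent_id = walk.child_id
            )
            INSERT INTO tag_helper SELECT parent_id, child_id, depth FROM walk
            """
        )


rebuild_helper(conn)
row_count = conn.execute("SELECT COUNT(*) FROM tag_helper").fetchone()[0]
first_level_below_b = [
    row[0]
    for row in conn.execute(
        "SELECT child_id FROM tag_helper WHERE parent_id = 'B' AND depth = 1"
    )
]
everything_under_b = [
    row[0]
    for row in conn.execute(
        "SELECT child_id FROM tag_helper WHERE parent_id = 'B' ORDER BY depth"
    )
]
```

Because the helper table is derived entirely from the parent_id column, it can be thrown away and rebuilt whenever the hierarchy changes, so the normalised table remains the single source of truth.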

电影里的梦 2024-07-15 10:31:54

I would use some kind of array to store the child tags; this should be a lot faster than joining a table to itself (especially if you have a large number of tags). I had a look and can't tell whether MySQL has a native array data type, but you can emulate one by using a text column and storing a serialized array in it. If you want to speed things up further, you should be able to put a text search index on that column to find out which tags are related.

[Edit]
After reading Ali's article, I did some more hunting and found this presentation on a bunch of approaches for implementing hierarchies in PostgreSQL. It might still be helpful for explanatory purposes.
