带有分数分解的评论线程
我正在用头撞东西,我想知道是否有比我更熟练的人可以帮助我。
我的目标是创建一个评论线程,将评论评分系统考虑在内。
首先我会解释一下我现在在哪里。
假设我们有一篇文章的评论线程,如下例所示。括号中的数字是该评论的 ID。 ID 由数据库自动分配,并随着每条附加评论的发布而按时间顺序递增。评论文本前的破折号数量代表评论深度。
(01)"This is a top level comment."
(02)-"This is a second level comment. A reply to the top level comment above."
(06)-"This is also a second level comment / another reply to comment 01."
(07)--"This is a reply to comment 06."
(03)"This is a different top level comment."
(05)-"This is a reply to the comment above."
(08)--"This is a reply to that comment in turn."
(10)---"This is a deeper comment still."
(04)"This is one more top level comment."
(09)-"This is one more reply."
我的第一个问题是以某种方式存储这些数据,这意味着它可以按正确的顺序返回。如果您只是存储深度字段并按深度排序,它将首先带回所有顶级注释,然后返回第二级注释等。这是不对的,我们必须返回完整出身完整的注释。
实现此目的的一种方法是存储每个评论的完整出身。
Comment ID | Parentage
01 | (Comment 01 has no parent because it is top level)
02 | 01- (Comment 02 was a reply to comment 01)
03 |
04 |
05 | 03-
06 | 01-
07 | 01-06- (Comment 07 has two ancestors 01 and then 06)
08 | 03-05-
09 | 04-
10 | 03-05-08-
添加另一个评论记录就像从您正在回复的评论中获取出身关系一样简单,然后附加其 ID 以形成新的出身关系。例如,如果我回复评论 10,我会获取其出身 (03-05-08-) 并附加其 ID (10-)。数据库会自动将其识别为第 11 条评论,我们会得到:
Comment ID | Parentage
01 |
02 | 01-
03 |
04 |
05 | 03-
06 | 01-
07 | 01-06-
08 | 03-05-
09 | 04-
10 | 03-05-08-
11 | 03-05-08-10-
现在,当我们订购要显示的评论时,我们对 Parentage 和 Comment ID 的串联进行排序,这给了我们:
Order by CONCAT(Parentage, ID)
Comment ID | Parentage | CONCAT(Parentage, ID)
01 | | 01-
02 | 01- | 01-02-
06 | 01- | 01-06-
07 | 01-06- | 01-06-07-
03 | | 03-
05 | 03- | 03-05-
08 | 03-05- | 03-05-08-
10 | 03-05-08- | 03-05-08-10-
11 | 03-05-08-10- | 03-05-08-10-11-
04 | | 04-
09 | 04- | 04-09-
这会产生与第一次演示完全相同的列表。我们后来添加的注释 11 插入到正确的位置:
(01)"This is a top level comment."
(02)-"This is a reply to the top level comment."
(06)-"This is another reply that was posted later than the first."
(07)--"This is a reply to the second level comment directly above."
(03)"This is a different top level comment."
(05)-"This is a reply to the comment above."
(08)--"This is a reply to the comment above."
(10)---"This is a deeper comment still."
(11)----"THIS COMMENT WAS ADDED IN THE EARLIER EXAMPLE."
(04)"This is one more top level comment."
(09)-"This is one more reply."
可以通过检查 CONCAT 字符串的长度并将 len(CONCAT(Parentage, ID)) 乘以设定的像素数来完成缩进。这太棒了,我们有一个以识别其出身的方式存储评论的系统。
现在的问题是:
并非所有评论都是平等的。需要一个评论评分系统来区分好的评论。假设每条评论都有一个点赞按钮。虽然我们希望保留出身,但如果一条评论有两个同一级别的直接回复,那么我们希望首先显示点赞最多的评论。我将在下面的[方括号]中添加一些投票。
(01)"This is a top level comment." [6 votes]
(02)-"This is a reply to the top level comment." [2 votes]
(06)-"This is another reply that was posted later than the first." [30 votes]
(07)--"This is a reply to the second level comment directly above." [5 votes]
(03)"This is a different top level comment." [50 votes]
(05)-"This is a reply to the comment above." [4 votes]
(08)--"This is a reply to the comment above." [0 votes]
(10)---"This is a deeper comment still." [0 votes]
(11)----"THIS COMMENT WAS ADDED IN THE EARLIER EXAMPLE." [0 votes]
(04)"This is one more top level comment." [2 votes]
(09)-"This is one more reply." [0 votes]
在此示例中,评论 (01) 和 (03) 都是顶级评论,但 (03) 有 [50 票] 而 (01) 只有 [6 票]。 (01) 出现在上面只是因为它发布较早,因此被分配了一个较小的 ID。同样,(02) 和 (06) 都是对 (01) 的回复,但必须重新排序以使得票最多的 (06) 上升到顶部。
我完全、完全地致力于实现这一目标。
我想任何排序/重新排序和索引都最好在评论投票上完成,而不是在页面加载上完成,这样页面加载时间可以尽可能快,但除此之外我完全不知道!
您可以就可能的途径提出的任何想法或启示都会真正减轻负担!感谢您一如既往的帮助。
-------------------------------------------------- ------------------------------
编辑:响应@Paddy的解决方案,
当我在下面运行@Paddy提供的表达式时模拟数据时,我得到的第一个错误是:
"The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified."
这可以通过将 SELECT 'top 100%' 添加到递归成员定义来解决。完成此操作后,我收到错误:
'CommentTree' has more columns than were specified in the column list.
这可以通过在 CommentTree 规范中添加“级别”列来解决。然后打印数据,但它首先返回所有顶级注释,然后返回类似于(但实际上不匹配)正确排序顺序的内容。
数据是这样返回的:
ParentId | CommentId | Comment | Vote | Level
NULL | 1 | Text here | 6 | 0
NULL | 3 | Text here | 50 | 0
NULL | 4 | Text here | 2 | 0
4 | 9 | Text here | 0 | 1
3 | 5 | Text here | 4 | 1
5 | 8 | Text here | 0 | 2
8 | 10 | Text here | 0 | 3
10 | 11 | Text here | 0 | 4
1 | 2 | Text here | 2 | 1
1 | 6 | Text here | 30 | 1
6 | 7 | Text here | 5 | 2
我做错了什么或者@Paddy 错过了什么吗?请接受我的歉意,递归函数对我来说很新。
I am banging my head against something and I was wondering if somebody more skilled that me could help me out.
My aim is to create a comment thread that factors in a system of comment scoring.
First I'll explain where I am currently.
Say we have a comment thread on an article that looks like the example below. The number in parenthasis is the ID of that comment. ID's are assigned automatically by the database and increment chronologically with each additional comment posted. The number of dashes before the comment text reperesent the comment depth.
(01)"This is a top level comment."
(02)-"This is a second level comment. A reply to the top level comment above."
(06)-"This is also a second level comment / another reply to comment 01."
(07)--"This is a reply to comment 06."
(03)"This is a different top level comment."
(05)-"This is a reply to the comment above."
(08)--"This is a reply to that comment in turn."
(10)---"This is a deeper comment still."
(04)"This is one more top level comment."
(09)-"This is one more reply."
My first problem was storing this data in a way that means it can be returned in the correct order. If you simply store a depth field and order by depth, it'll bring back all of the top level comments first and then the second level comments etc. This isn't right, we must return the comments with the full parentage still intact.
One way to achieve this is to store the full parentage for each comment.
Comment ID | Parentage
01 | (Comment 01 has no parent because it is top level)
02 | 01- (Comment 02 was a reply to comment 01)
03 |
04 |
05 | 03-
06 | 01-
07 | 01-06- (Comment 07 has two ancestors 01 and then 06)
08 | 03-05-
09 | 04-
10 | 03-05-08-
Adding another comment record is as simple as grabbing the Parentage from the comment that you are replying to, and appending its ID to form the new parentage. For example, if I was replying to comment 10, I would take it's parentage (03-05-08-) and append its ID (10-). The database would automatically recognise it as the 11th comment, and we'd get:
Comment ID | Parentage
01 |
02 | 01-
03 |
04 |
05 | 03-
06 | 01-
07 | 01-06-
08 | 03-05-
09 | 04-
10 | 03-05-08-
11 | 03-05-08-10-
Now, when we order the comments for display, we order on a concatenation of Parentage and Comment ID which gives us:
Order by CONCAT(Parentage, ID)
Comment ID | Parentage | CONCAT(Parentage, ID)
01 | | 01-
02 | 01- | 01-02-
06 | 01- | 01-06-
07 | 01-06- | 01-06-07-
03 | | 03-
05 | 03- | 03-05-
08 | 03-05- | 03-05-08-
10 | 03-05-08- | 03-05-08-10-
11 | 03-05-08-10- | 03-05-08-10-11-
04 | | 04-
09 | 04- | 04-09-
Which produces the exact same list as first demonstrated. With Comment 11 which we later added inserted in the correct place:
(01)"This is a top level comment."
(02)-"This is a reply to the top level comment."
(06)-"This is another reply that was posted later than the first."
(07)--"This is a reply to the second level comment directly above."
(03)"This is a different top level comment."
(05)-"This is a reply to the comment above."
(08)--"This is a reply to the comment above."
(10)---"This is a deeper comment still."
(11)----"THIS COMMENT WAS ADDED IN THE EARLIER EXAMPLE."
(04)"This is one more top level comment."
(09)-"This is one more reply."
Indenting can be done by checking the length of the CONCAT string and multiplying the len(CONCAT(Parentage, ID)) by a set number of pixels. This is great, we have a system of storing comments in a way that recognises their parentage.
Now the problem:
Not all comments are equal. A system of comment scoring is needed to distinguish good comments. Let's say each comment has an upvote button.. while we want to retain parentage, if one comment has two direct replies at the same level then we want to show the one with the most upvotes first. I'll add some votes in [square brackets] below.
(01)"This is a top level comment." [6 votes]
(02)-"This is a reply to the top level comment." [2 votes]
(06)-"This is another reply that was posted later than the first." [30 votes]
(07)--"This is a reply to the second level comment directly above." [5 votes]
(03)"This is a different top level comment." [50 votes]
(05)-"This is a reply to the comment above." [4 votes]
(08)--"This is a reply to the comment above." [0 votes]
(10)---"This is a deeper comment still." [0 votes]
(11)----"THIS COMMENT WAS ADDED IN THE EARLIER EXAMPLE." [0 votes]
(04)"This is one more top level comment." [2 votes]
(09)-"This is one more reply." [0 votes]
In this example, comments (01) and (03) are both top-level but (03) has [50 votes] and (01) only has [6 votes]. (01) appears above only by virtue of the fact that it was posted earlier and therefore has been assigned a smaller ID. Likewise (02) and (06) are both replies to (01) but must be reordered to allow the one with the most votes (06) to rise to the top.
I am completely and utterly stuck in trying to achieve this.
I imagine that any ordering/reordering and indexing would be better done on a comment-vote being cast rather than on page load so that the page-load time can be as quick as possible but beyond that I have absolutely no idea!
Any ideas or light you could shed on possible avenues would really take a load off! Thanks for your help as always.
--------------------------------------------------------------------------------
Edit: In response to @Paddy's solution,
When I run the expression offered by @Paddy below on the mock data, the first error I get is:
"The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified."
This can be remedied by adding SELECT 'top 100 percent' to the recursive member definition. Once this is done, I get the error:
'CommentTree' has more columns than were specified in the column list.
This can be resolved by adding a 'Level' column to the CommentTree specification. This then prints the data, but it returns all the top level comments first and then something resembling (but not actually matching) the correct sort order after.
The data is returned as such:
ParentId | CommentId | Comment | Vote | Level
NULL | 1 | Text here | 6 | 0
NULL | 3 | Text here | 50 | 0
NULL | 4 | Text here | 2 | 0
4 | 9 | Text here | 0 | 1
3 | 5 | Text here | 4 | 1
5 | 8 | Text here | 0 | 2
8 | 10 | Text here | 0 | 3
10 | 11 | Text here | 0 | 4
1 | 2 | Text here | 2 | 1
1 | 6 | Text here | 30 | 1
6 | 7 | Text here | 5 | 2
Have I done anything wrong or did @Paddy miss out something perhaps? Please accept my apologies, recursive functions are very new to me.
如果你对这篇内容有疑问,欢迎到本站社区发帖提问 参与讨论,获取更多帮助,或者扫码二维码加入 Web 技术交流群。

绑定邮箱获取回复消息
由于您还没有绑定你的真实邮箱,如果其他用户或者作者回复了您的评论,将不能在第一时间通知您!
发布评论
评论(3)
下面的代码看起来很适合您的任务。它有点复杂,但在单个
SELECT
中完成它对我来说是一个挑战。您可以将其拆分为多个 SELECT 并预取到临时表中(出于性能目的),或者将其保留在一起。谢谢你的问题,这很有趣!
请注意,根节点的
ParentID
必须为0
,而不是NULL
。在此查询中,我执行以下操作:
SELECT
来计算所有级别的后代。position = 1
position =parent_position + 1
position = prev_sibling_position + prev_sibling_number_of_descendants
编辑 相同的解决方案,但没有 CTE。
Code below looks good for your task. It's a bit complex, but it was a challenge for me to make it in a single
SELECT
. You can split it into severalSELECT
with prefetch into temporary tables (for performance purposes), or keep it together.Thank you for the question, it was interesting!
Please note that
ParentID
for root nodes must be0
, notNULL
.In this query I do following:
SELECT
.position = 1
position = parent_position + 1
position = prev_sibling_position + prev_sibling_number_of_descendants
EDIT The same solution, but without CTE.
虽然与您的问题没有直接关系,但我的建议是更改为 嵌套集模型。我知道这需要大量返工,但迟早你会意识到这是最好的选择:)
Although not directly related to your question, my advice would be to change to the Nested Set Model. I know it is a lot of rework but sooner or later you'll realise it is the best choice :)
使用类似这样的表定义(自引用键):
然后您可以使用递归公用表表达式(这是在 MS SQL 中)来获取结果:
CTE 参考:
http://msdn.microsoft.com/en-us/library/ms186243.aspx
Using a table definition something like this (self referencing key):
You could then use a recursive common table expression (this is in MS SQL), to get your results:
CTE reference:
http://msdn.microsoft.com/en-us/library/ms186243.aspx