Most efficient data structure to represent threaded comments in Java?

Posted 2024-07-17 00:55:40

I want to represent threaded comments in Java. This would look similar to the way comments are threaded on reddit.com:

hello
   hello
      hello
      hello
   hello
   hello
      hello

As in the example above, responses are nested in the HTML with appropriate indentation to reflect their relationship to prior comments.

What would be an efficient way to represent this in Java?

I'm thinking some kind of tree data structure would be appropriate.

But is there one in particular which would be most efficient to minimize tree traversals?

This would be important if I have voting on each comment, because the tree would then need to be reordered after each vote, which is potentially an expensive operation.

By the way, if anyone knows of an existing open-source implementation of this in Java, that would help too.

Comments (3)

你穿错了嫁妆 2024-07-24 00:55:40

I would use levels of linked lists.

message1
    message2
        message3
        message4
    message5
    message6
        message7

Each node would have a pointer to its:

- forward sibling  (2->5, 3->4, 5->6,                   1/4/6/7->NULL).
- backward sibling (4->3, 5->2, 6->5,                   1/2/3/7->NULL).
- first child      (1->2, 2->3, 6->7,                   3/4/5/7->NULL).
- parent           (2->1, 3->2, 4->2, 5->1, 6->1, 7->6,       1->NULL).

Within each level, messages would be sorted in the list by vote count (or whatever other score you wanted to use).

That would give you maximum flexibility for moving things around and you could move whole sub-trees (e.g., message2) just by changing the links at the parent and that level.
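
A minimal sketch of such a node in Java, assuming nothing beyond the four links listed above. The class name `CommentNode`, the `reply` helper, and the `render` method are hypothetical names chosen for illustration; they are not from any existing library.

```java
import java.util.ArrayList;
import java.util.List;

/** One comment in the thread, holding the four links described above. */
final class CommentNode {
    final String text;
    int score;                 // vote count, or any other ranking score
    CommentNode parent;        // null for the root
    CommentNode firstChild;    // null for a leaf
    CommentNode nextSibling;   // forward sibling; null at the end of a level
    CommentNode prevSibling;   // backward sibling; null at the start of a level

    CommentNode(String text) {
        this.text = text;
    }

    /** Appends a reply at the end of this node's child list and returns it. */
    CommentNode reply(String replyText) {
        CommentNode child = new CommentNode(replyText);
        child.parent = this;
        if (firstChild == null) {
            firstChild = child;
        } else {
            CommentNode last = firstChild;
            while (last.nextSibling != null) {
                last = last.nextSibling;
            }
            last.nextSibling = child;
            child.prevSibling = last;
        }
        return child;
    }

    /** Renders this subtree depth-first, one line per comment, four spaces per level. */
    List<String> render() {
        List<String> lines = new ArrayList<>();
        render(this, 0, lines);
        return lines;
    }

    private static void render(CommentNode node, int depth, List<String> out) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("    ");
        }
        out.add(line.append(node.text).toString());
        for (CommentNode c = node.firstChild; c != null; c = c.nextSibling) {
            render(c, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        CommentNode m1 = new CommentNode("message1");
        CommentNode m2 = m1.reply("message2");
        m2.reply("message3");
        m2.reply("message4");
        m1.reply("message5");
        m1.reply("message6").reply("message7");
        for (String line : m1.render()) {
            System.out.println(line);
        }
    }
}
```

Because a subtree hangs off a single node, relinking that node among its siblings carries all of its replies along without touching them.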

For example, say message6 gets an influx of votes that makes it more popular than message5. The changes are (adjusting both the next and previous sibling pointers):

  • message2 -> message6
  • message6 -> message5
  • message5 -> NULL.

to get:

message1
    message2
        message3
        message4
    message6
        message7
    message5

If it continues until it garners more votes than message2, the following occurs:

  • message6 -> message2
  • message2 -> message5

and the first-child pointer of message1 is set to message6 (it was message2). That is still relatively easy, and gives:

message1
    message6
        message7
    message2
        message3
        message4
    message5

Re-ordering only needs to occur when a score change leaves a message scoring higher than its upper sibling or lower than its lower sibling. You don't need to re-order after every score change.
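
The re-ordering rule can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: `VotedComment`, `reply`, `vote`, `swapAdjacent`, and `childTexts` are hypothetical names, replies are assumed to be added in descending score order, and ties are left where they are.

```java
import java.util.ArrayList;
import java.util.List;

final class VotedComment {
    final String text;
    int score;
    VotedComment parent, firstChild, nextSibling, prevSibling;

    VotedComment(String text, int score) {
        this.text = text;
        this.score = score;
    }

    /** Appends a reply at the end of this level (assumed to arrive in descending score order). */
    VotedComment reply(String replyText, int replyScore) {
        VotedComment child = new VotedComment(replyText, replyScore);
        child.parent = this;
        if (firstChild == null) {
            firstChild = child;
            return child;
        }
        VotedComment last = firstChild;
        while (last.nextSibling != null) {
            last = last.nextSibling;
        }
        last.nextSibling = child;
        child.prevSibling = last;
        return child;
    }

    /** Applies a vote and relinks this comment only if it now outranks or underranks a neighbour. */
    void vote(int delta) {
        score += delta;
        while (prevSibling != null && prevSibling.score < score) {
            swapAdjacent(prevSibling, this);   // move up one place
        }
        while (nextSibling != null && nextSibling.score > score) {
            swapAdjacent(this, nextSibling);   // move down one place
        }
    }

    /** Swaps two adjacent siblings, where a sits directly before b. Subtrees move with their nodes. */
    private static void swapAdjacent(VotedComment a, VotedComment b) {
        VotedComment before = a.prevSibling;
        VotedComment after = b.nextSibling;
        b.prevSibling = before;
        b.nextSibling = a;
        a.prevSibling = b;
        a.nextSibling = after;
        if (after != null) {
            after.prevSibling = a;
        }
        if (before != null) {
            before.nextSibling = b;
        } else if (b.parent != null) {
            b.parent.firstChild = b;           // b is the new head of this level
        }
    }

    /** Returns the texts of the direct children in their current order. */
    List<String> childTexts() {
        List<String> out = new ArrayList<>();
        for (VotedComment c = firstChild; c != null; c = c.nextSibling) {
            out.add(c.text);
        }
        return out;
    }
}
```

With message2, message5, and message6 starting at scores 10, 5, and 3, a vote that lifts message6 to 6 relinks it ahead of message5, and a later vote that lifts it to 11 also updates the first-child pointer of message1. A vote that changes no ordering costs two comparisons and no relinking.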

属性 2024-07-24 00:55:40

The tree is right (with getLastSibling and getNextSibling), but if you're storing/querying the data, you probably want to store a lineage for each entry, or number by a preorder traversal:

http://www.sitepoint.com/article/hierarchical-data-database/2/

If you are willing to lose the exact count of subnodes, you can leave gaps in the numbering to minimise renumbering. Still, I'm not certain that this will be noticeably faster than traversing the tree each time. I guess it depends on how deep your tree grows.
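
As a rough sketch of the lineage option, each stored row can carry the ids of its ancestors followed by its own id; sorting rows by that string then yields thread order without walking a tree. Everything here is an assumption made for illustration: the names `LineageDemo`, `Row`, and `childLineage`, the `/` separator, and the four-digit zero-padded ids (which cap ids at 9999). Siblings come out in id order, not score order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class LineageDemo {
    /** A stored comment row: its lineage string and its text. */
    static final class Row {
        final String lineage;
        final String text;

        Row(String lineage, String text) {
            this.lineage = lineage;
            this.text = text;
        }

        /** Depth is the number of separators in the lineage (0 for a top-level comment). */
        int depth() {
            int d = 0;
            for (int i = 0; i < lineage.length(); i++) {
                if (lineage.charAt(i) == '/') {
                    d++;
                }
            }
            return d;
        }
    }

    /** Builds a child's lineage from its parent's lineage ("" for top level) and its own id. */
    static String childLineage(String parentLineage, int id) {
        String segment = String.format("%04d", id);  // fixed width keeps string order equal to id order
        return parentLineage.isEmpty() ? segment : parentLineage + "/" + segment;
    }

    /** Sorts rows by lineage and indents each by its depth, four spaces per level. */
    static List<String> render(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparing((Row r) -> r.lineage));
        List<String> lines = new ArrayList<>();
        for (Row r : sorted) {
            StringBuilder line = new StringBuilder();
            for (int i = 0; i < r.depth(); i++) {
                line.append("    ");
            }
            lines.add(line.append(r.text).toString());
        }
        return lines;
    }
}
```

In a database the same idea becomes a single `ORDER BY` on the lineage column, at the cost of rewriting the lineage of every descendant whenever a subtree moves.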

See also:

SQL - How to store and navigate hierarchies?
http://www.ibase.ru/devinfo/DBMSTrees/sqltrees.html (this scheme is also called a Celko tree)

新人笑 2024-07-24 00:55:40

This would be important if I have voting on each comment. Because then the tree would need to be reordered after each vote - a potentially expensive operation computationally.

Sounds like a premature optimization to me, possibly even a faulty optimization.

Your tree data structure sounds logical for representing your data. I say stick with it. Optimize it later only if a performance problem is detected and measured, and can be compared with alternatives.
