Most efficient data structure to represent threaded comments in Java?
I want to represent threaded comments in Java. This would look similar to the way comments are threaded on reddit.com:
hello
hello
hello
hello
hello
hello
hello
As in the example above, responses are nested in the HTML with appropriate indentation to reflect their relationship to prior comments.
What would be an efficient way to represent this in Java?
I'm thinking some kind of tree data structure would be appropriate.
But is there one in particular which would be most efficient to minimize tree traversals?
This would be important if I have voting on each comment, because the tree would then need to be reordered after each vote, which is potentially an expensive operation.
By the way, if anyone knows of an existing open source implementation of this in Java, that would help too.
Comments (3)
I would use levels of linked lists.
Each node would have a pointer to its next sibling, its previous sibling, and its first child.
Within each level, messages would be sorted in the list by vote count (or whatever other score you wanted to use).
That would give you maximum flexibility for moving things around, and you could move whole sub-trees (e.g., message2) just by changing the links at the parent and that level.
For example, say message6 gets an influx of votes that makes it more popular than message5. The changes are (adjusting both the next and previous sibling pointers):
message2 -> message6
message6 -> message5
message5 -> NULL
so that the siblings now run message2, message6, message5.
If it continues until it garners more votes than message2, the following occurs:
message6 -> message2
message2 -> message5
and the first-child pointer of message1 is set to message6 (it was message2). That is still relatively easy, and the siblings now run message6, message2, message5.
Re-ordering only needs to occur when a score change leaves a message with a higher score than its upper sibling or a lower score than its lower sibling. You don't need to re-order after every score change.
The tree is right (with getLastSibling and getNextSibling), but if you're storing/querying the data, you probably want to store a lineage for each entry, or number by a preorder traversal:
http://www.sitepoint.com/article/hierarchical-data-database/2/
At the cost of losing the exact count of subnodes, you can leave gaps in the numbering to minimise renumbering. Still, I'm not certain that this will be noticeably faster than traversing the tree each time. I guess it depends on how deep your tree grows.
See also:
SQL - How to store and navigate hierarchies?
http://www.ibase.ru/devinfo/DBMSTrees/sqltrees.html (this scheme is also called a Celko tree)
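As a rough in-memory illustration of numbering by a preorder traversal, the sketch below gives every node a left and a right number. The class NumberedComment and its methods are hypothetical names chosen for this example, and the numbering is gap-free, which is why the descendant count can be read straight off the two numbers.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical comment node carrying the left/right numbers of a preorder numbering. */
final class NumberedComment {
    final String id;
    final List<NumberedComment> children = new ArrayList<>();
    int left, right;

    NumberedComment(String id) {
        this.id = id;
    }

    /** Adds a reply under this comment and returns it. */
    NumberedComment reply(String childId) {
        NumberedComment child = new NumberedComment(childId);
        children.add(child);
        return child;
    }

    /** Numbers this sub-tree starting at next; returns the first unused number. */
    int number(int next) {
        left = next++;
        for (NumberedComment c : children) next = c.number(next);
        right = next++;
        return next;
    }

    /** True if other lies strictly inside this node's number range, with no tree walk needed. */
    boolean isAncestorOf(NumberedComment other) {
        return left < other.left && other.right < right;
    }

    /** Exact descendant count; only valid while the numbering has no gaps. */
    int descendantCount() {
        return (right - left - 1) / 2;
    }
}
```

If gaps are left between numbers to make insertions cheaper, isAncestorOf still works but descendantCount no longer does, which is the trade-off mentioned above.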
Sounds like a premature optimization to me, possibly even a faulty optimization.
Your tree data structure sounds logical for representing your data. I say stick with it. Optimize it later only if a performance problem is detected and measured, and can be compared with alternatives.
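For comparison, a plain tree with no special bookkeeping might look something like the sketch below. The class SimpleComment and its methods are hypothetical, and sorting the replies at render time is just one possible baseline to measure against, not something prescribed in this thread.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical plain tree node: a comment, its score, and its replies in insertion order. */
final class SimpleComment {
    final String text;
    int score;
    final List<SimpleComment> replies = new ArrayList<>();

    SimpleComment(String text, int score) {
        this.text = text;
        this.score = score;
    }

    /** Adds a reply under this comment and returns it. */
    SimpleComment reply(String replyText, int replyScore) {
        SimpleComment c = new SimpleComment(replyText, replyScore);
        replies.add(c);
        return c;
    }

    /** Renders the thread with two spaces of indent per level, highest-scored replies first. */
    String render() {
        StringBuilder out = new StringBuilder();
        render(0, out);
        return out.toString();
    }

    private void render(int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) out.append("  ");
        out.append(text).append('\n');
        // Sort a copy so that voting never has to restructure the stored tree.
        List<SimpleComment> sorted = new ArrayList<>(replies);
        sorted.sort(Comparator.comparingInt((SimpleComment c) -> c.score).reversed());
        for (SimpleComment c : sorted) c.render(depth + 1, out);
    }
}
```

Here a vote is a single field update, and the ordering cost is paid only when a thread is displayed, so profiling this version first would show whether anything more elaborate is needed.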